On 04.02.2006 at 17:08, Vlad Seryakov wrote:
That could be true on Solaris, but in Linux 2.6 mmap/munmap is very
fast, and looking into the kernel source tells you that they convert
sbrk into mmap internally. The difference is that mmap is
multithread-aware while sbrk is not.
Solaris (1 CPU)
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 3 seconds, 938700 usec
starting 16 ckalloc threads...waiting....done: 6 seconds, 62454 usec
starting 16 _malloc threads...waiting....done: 9 seconds, 755277 usec
Linux (1 CPU, 1.8 GHz)
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 2 seconds, 298735 usec
starting 16 ckalloc threads...waiting....done: 3 seconds, 331197 usec
starting 16 _malloc threads...waiting....done: 1 seconds, 323865 usec
Mac OSX (1 CPU 1.5Ghz)
zoran:~ zoran$ ./m2
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 57 seconds, 300088 usec
starting 16 ckalloc threads...waiting....done: 195 seconds, 526369 usec
starting 16 _malloc threads...waiting....done: 13 seconds, 869307 usec
Mac OSX (2 CPU 867MHz)
panther:~ zoran$ ./m2
Tcl: 8.4.12, threads 16, loops 500000
starting 16 malloc threads...waiting....done: 189 seconds, 228665 usec
starting 16 ckalloc threads...waiting....done: 730 seconds, 700258 usec (!!!!!)
starting 16 _malloc threads...waiting....done: 19 seconds, 958533 usec
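For anyone who wants to produce numbers of the same shape, here is a
rough sketch of a timing harness. It is not the source of the m2
program used above; the names malloc_worker and run_threads and the
fixed allocation size are assumptions made for illustration. Only the
thread count (16) and loop count (500000) are taken from the output.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/time.h>

#define NTHREADS 16      /* thread count from the output above */
#define NLOOPS   500000  /* loop count from the output above */

/* Hypothetical worker: allocate and free one block per iteration.
 * The real test pattern may differ. */
static void *malloc_worker(void *arg) {
    size_t size = *(size_t *)arg;
    for (int i = 0; i < NLOOPS; i++) {
        void *p = malloc(size);
        if (p == NULL) return (void *)1;   /* report failure */
        *(volatile char *)p = 0;           /* touch it so the pair is not optimized away */
        free(p);
    }
    return NULL;
}

/* Start NTHREADS workers, wait for all of them, and store the elapsed
 * wall-clock time as seconds plus microseconds. Returns 0 on success. */
int run_threads(void *(*worker)(void *), size_t size, long *sec, long *usec) {
    pthread_t tids[NTHREADS];
    struct timeval t0, t1;
    int started = 0, rc = 0;

    gettimeofday(&t0, NULL);
    for (; started < NTHREADS; started++) {
        if (pthread_create(&tids[started], NULL, worker, &size) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        void *res = NULL;
        pthread_join(tids[i], &res);
        if (res != NULL) rc = -1;
    }
    gettimeofday(&t1, NULL);

    *sec  = (long)(t1.tv_sec - t0.tv_sec);
    *usec = (long)(t1.tv_usec - t0.tv_usec);
    if (*usec < 0) {        /* borrow one second */
        (*sec)--;
        *usec += 1000000;
    }
    return rc;
}
```

Calling run_threads(malloc_worker, 64, &sec, &usec) and printing sec
and usec gives a line comparable to the "done: N seconds, M usec"
output; swapping in a worker that calls ckalloc or another allocator
gives the other rows.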
Now, using mmap to allocate a block of memory and then re-using it:
this is what I am doing. I do not use munmap, but it would still be
possible.
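A minimal sketch of that idea, assuming fixed-size blocks and a
per-thread free list so that no mutex is needed. This is not the code
under discussion; CHUNK_SIZE, BLOCK_SIZE, blk_alloc and blk_free are
hypothetical names chosen for the example, and memory is never
returned with munmap.

```c
#define _DEFAULT_SOURCE        /* expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* older BSD/Mac OS X spelling */
#endif

#define CHUNK_SIZE (1 << 20)   /* hypothetical: 1 MiB per mmap call */
#define BLOCK_SIZE 256         /* hypothetical: one fixed block size */

typedef struct Block { struct Block *next; } Block;

/* One free list per thread, so no lock is taken on alloc or free. */
static _Thread_local Block *free_list = NULL;

/* Map a fresh chunk and carve it into blocks on the free list. */
static int refill(void) {
    char *chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (chunk == MAP_FAILED) return -1;
    for (size_t off = 0; off + BLOCK_SIZE <= CHUNK_SIZE; off += BLOCK_SIZE) {
        Block *b = (Block *)(chunk + off);
        b->next = free_list;
        free_list = b;
    }
    return 0;
}

/* Pop a block from the calling thread's list, mapping more if empty. */
void *blk_alloc(void) {
    if (free_list == NULL && refill() != 0) return NULL;
    Block *b = free_list;
    free_list = b->next;
    return b;
}

/* Push the block back for re-use; the mapping itself is kept. */
void blk_free(void *p) {
    assert(p != NULL);
    Block *b = p;
    b->next = free_list;
    free_list = b;
}
```

In this sketch a block freed by one thread lands on that thread's
list, which is the price of dropping the lock; a real allocator would
also need size classes and some way to hand memory between threads.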
With random allocations from 1-128L, Tcl alloc gives the worst
results, constantly, which means it is good on small allocations only?
Apparently everything above 16284 bytes uses malloc directly.
I am not trying to re-invent the wheel. I just accidentally
replaced sbrk with mmap and removed the mutexes around it, and it
became much faster than what we have now, at least on Linux.
The only case where it is not faster is single-CPU Solaris.
I have no idea why. I can test it on 2-CPU Solaris next week.
Anyway, from all these tests, it appears that the Tcl allocator
is slower than anything else, at least for the test pattern
used in your test.
Cheers
Zoran