On 4/4/2014 9:43 AM, Dimitry Sibiryakov wrote:
> 04.04.2014 15:37, Alex Peshkoff wrote:
>> On 04/04/14 17:01, James Starkey wrote:
>>> An alternate solution that is close is thread specific sub-pools, which is
>>> nice because a thread specific sub-pool doesn't even need interlocked
>>> instructions. It does require a fetch of thread specific data on every
>>> allocate and release, but some platforms dedicate a register to point to
>>> thread specific data, making the op essentially free.
>> Unfortunately on a lot of platforms which seem to provide such support
>> it does not work in dynamic libraries. Our engine is dynamic library...
>> To be precise even use of traditional TLS calls is relatively fast -
>> TLS_get requires about 10-15 machine instructions to complete which much
>> less than single atomic op, not to say about locking something.
> With thread-specific pools isn't it impossible to allocate object in one
> thread and then release in other?..
>

No, but maybe yes.

Most memory allocators (and all of mine) manage small blocks and large blocks differently. Small blocks, when released, go on a size-class stack for reuse, while large blocks are factored into a block recombination data structure. Small blocks just go into the sub-pool of the thread that released them and are available for re-use on that thread. You have your choice of what to do with big blocks, but I think sending them directly back to the main pool is best.
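
As a rough illustration of the small-block scheme described above, here is a minimal C++ sketch, not code from Firebird, Interbase, or NuoDB. Every name in it (MainPool, SubPool, threadPool, the 16-byte class step, the 256-byte small/large cutoff) is an assumption made up for the example. The caller passes the block size to release() so no block header is needed, and plain malloc/free stands in for the main pool's large-block recombination structure. A block released on any thread simply lands on that thread's size-class stack, with no interlocked instruction; flush() shows one way to hand the cached blocks back to the main pool under a single lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// All names and constants below are hypothetical, chosen for illustration.
constexpr std::size_t kClassStep  = 16;   // size-class granularity (assumed)
constexpr std::size_t kSmallLimit = 256;  // larger requests count as "large" (assumed)
constexpr std::size_t kNumClasses = kSmallLimit / kClassStep;

// A free small block is reused as a singly linked stack node.
struct FreeBlock { FreeBlock* next; };

// Map a request size (1..kSmallLimit) to its size-class index.
inline std::size_t classOf(std::size_t size) {
    if (size == 0) size = 1;
    return (size + kClassStep - 1) / kClassStep - 1;
}

// Shared main pool: one mutex guards all size-class stacks.
struct MainPool {
    std::mutex lock;
    FreeBlock* stacks[kNumClasses] = {};

    void* allocateSmall(std::size_t cls) {
        {
            std::lock_guard<std::mutex> guard(lock);
            if (FreeBlock* b = stacks[cls]) {
                stacks[cls] = b->next;
                return b;
            }
        }
        // Nothing cached: carve a fresh block of the full class size.
        return std::malloc((cls + 1) * kClassStep);
    }
};

inline MainPool& mainPool() {
    static MainPool pool;  // blocks still cached here at exit are not freed
    return pool;
}

// Thread-specific sub-pool: touched only by its owning thread, so the
// fast paths need neither a lock nor an interlocked instruction.
class SubPool {
public:
    ~SubPool() { flush(); }  // return cached blocks when the thread ends

    void* allocate(std::size_t size) {
        if (size > kSmallLimit) return std::malloc(size);  // large: bypass sub-pool
        std::size_t cls = classOf(size);
        if (FreeBlock* b = stacks_[cls]) {  // reuse a block released on this thread
            stacks_[cls] = b->next;
            --cached_;
            return b;
        }
        return mainPool().allocateSmall(cls);
    }

    // The block may have been allocated on a different thread; it is
    // cached on the releasing thread's stack regardless.
    void release(void* p, std::size_t size) {
        assert(p != nullptr);
        if (size > kSmallLimit) { std::free(p); return; }  // large: straight back
        FreeBlock* b = static_cast<FreeBlock*>(p);
        std::size_t cls = classOf(size);
        b->next = stacks_[cls];
        stacks_[cls] = b;
        ++cached_;
    }

    // Move every cached block to the main pool under one lock acquisition.
    void flush() {
        MainPool& main = mainPool();
        std::lock_guard<std::mutex> guard(main.lock);
        for (std::size_t cls = 0; cls < kNumClasses; ++cls) {
            while (FreeBlock* b = stacks_[cls]) {
                stacks_[cls] = b->next;
                b->next = main.stacks[cls];
                main.stacks[cls] = b;
            }
        }
        cached_ = 0;
    }

    std::size_t cached() const { return cached_; }

private:
    FreeBlock* stacks_[kNumClasses] = {};
    std::size_t cached_ = 0;  // number of small blocks currently held
};

// One sub-pool per thread via C++11 thread_local.
inline SubPool& threadPool() {
    thread_local SubPool pool;
    return pool;
}
```

In use, a thread calls threadPool().allocate(n) and threadPool().release(p, n). A release followed by an allocation in the same size class on the same thread returns the same block without touching the main pool's lock. A thread that mostly frees would call threadPool().flush() periodically.
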
There is a problem when most allocations are done by a group of worker threads and most deallocations by a garbage collector thread. The solution is straightforward in most cases, however: just have the GC thread do frequent sub-pool flushes. Interestingly enough, this is where deallocation free of interlocked instructions really wins. The flush to the main pool requires a lock, but a large number of blocks can migrate from the sub-pool to the main pool under that one lock in one fell swoop.

NuoDB originally used a thread specific memory manager, but someone found an open source memory manager that, while slightly slower, had significantly less fragmentation, leading to an overall improvement.

One other word of caution: overloading "new" is theoretically and practically possible, but very, very hard to make work with both Visual Studio and gcc.

For AmorphousDB I'm ignoring high-performance memory allocators as a localized problem somebody else can solve later.

I dare say that none of this is as important as when I first wrote the Interbase memory allocator and machines had a max of 3 MB physical memory. Ann still says that I'm a victim of the bit depression.

------------------------------------------------------------------------------
Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel