On 4/4/2014 9:43 AM, Dimitry Sibiryakov wrote:
> 04.04.2014 15:37, Alex Peshkoff wrote:
>> On 04/04/14 17:01, James Starkey wrote:
>>> An alternate solution that is close is thread specific sub-pools, which is
>>> nice because a thread specific sub-pool doesn't even need interlocked
>>> instructions.  It does require a fetch of thread specific data on every
>>> allocate and release, but some platforms dedicate a register to point to
>>> thread specific data, making the op essentially free.
>> Unfortunately on a lot of platforms which seem to provide such support
>> it does not work in dynamic libraries. Our engine is dynamic library...
>> To be precise, even the use of traditional TLS calls is relatively fast -
>> TLS_get requires about 10-15 machine instructions to complete, which is
>> much less than a single atomic op, not to mention locking something.
>     With thread-specific pools, isn't it impossible to allocate an object
> in one thread and then release it in another?..
>
No, but maybe yes.  Most memory allocators (and all of mine) manage 
small blocks and large blocks differently.  Small blocks, when released, 
go on a size-class stack for reuse, while large blocks are factored into 
a block recombination data structure.  Small blocks just go into the 
sub-pool of the thread that released them and are available for re-use 
on that thread.  You have your choice of what to do with big blocks, but 
I think sending them directly back to the main pool is best.
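
To make that concrete, here is a rough sketch of the kind of per-thread 
sub-pool I have in mind (MainPool, SubPool, the 1 KB small-block threshold, 
and the malloc fallback are all illustrative, not Firebird's or NuoDB's 
actual code):

#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

struct MainPool {                             // shared, lock-protected pool
    std::mutex lock;
    void releaseLarge(void* block)            // recombination logic omitted
    {
        std::lock_guard<std::mutex> guard(lock);
        std::free(block);
    }
};

class SubPool {                               // one instance per thread
    static const std::size_t kSmallLimit = 1024;  // "small block" threshold
    static const std::size_t kGranule = 16;       // size-class granularity
    std::vector<void*> freeLists[kSmallLimit / kGranule];  // size-class stacks
    MainPool& main;

public:
    explicit SubPool(MainPool& m) : main(m) {}

    void* allocate(std::size_t size)          // assumes size > 0
    {
        if (size <= kSmallLimit)
        {
            const std::size_t cls = (size - 1) / kGranule;
            std::vector<void*>& list = freeLists[cls];
            if (!list.empty())                // reuse a block freed on this thread
            {
                void* p = list.back();
                list.pop_back();
                return p;
            }
            return std::malloc((cls + 1) * kGranule);  // round up to the class size
        }
        return std::malloc(size);             // large block
    }

    void release(void* block, std::size_t size)
    {
        if (size <= kSmallLimit)              // small: stays thread-local, no lock
            freeLists[(size - 1) / kGranule].push_back(block);
        else                                  // large: straight back to the main pool
            main.releaseLarge(block);
    }
};

Because the free lists are touched only by their owning thread, neither 
allocate() nor release() needs an atomic operation on the small-block path.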

There is a problem when most allocations are done by a group of worker 
threads and most deallocations by a garbage collector thread.  The 
solution is straightforward in most cases, however: just have the GC 
thread do frequent sub-pool flushes.  Interestingly enough, this is 
where interlocked-instruction-free deallocation really wins: although the 
flush to the main pool requires a lock, a large number of blocks can 
migrate from the sub-pool to the main pool in one fell swoop.
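
A separate minimal illustration of that batched flush (MainPool::absorb 
and the 64 size classes are assumptions, not an existing API):

#include <cstddef>
#include <mutex>
#include <vector>

const std::size_t kClasses = 64;

struct MainPool {
    std::mutex lock;
    std::vector<void*> freeLists[kClasses];   // shared size-class stacks

    // One lock acquisition migrates everything accumulated since the last
    // flush, instead of one acquisition per freed block.
    void absorb(std::vector<void*> (&local)[kClasses])
    {
        std::lock_guard<std::mutex> guard(lock);
        for (std::size_t cls = 0; cls < kClasses; ++cls)
        {
            freeLists[cls].insert(freeLists[cls].end(),
                                  local[cls].begin(), local[cls].end());
            local[cls].clear();               // the sub-pool is now empty
        }
    }
};

The GC thread pushes freed small blocks onto its own local lists with no 
interlocked instructions at all, then calls absorb() every few thousand 
releases, or at the end of each sweep.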

NuoDB originally used a thread-specific memory manager, but someone 
found an open source memory manager that, while slightly slower, had 
significantly less fragmentation, leading to an overall improvement.

One other word of caution:  Overloading "new" is theoretically and 
practically possible, but very, very hard to make work with both Visual 
Studio and gcc.  For AmorphousDB I'm ignoring high-performance memory 
allocators, treating them as a localized problem somebody else can solve 
later.
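
For what it's worth, the boilerplate itself is short; here is a minimal 
sketch routed through hypothetical myAlloc/myFree hooks.  The pain is in 
covering every variant (nothrow, array, sized delete) consistently across 
Visual Studio and gcc and in surviving static initialization order:

#include <cstdlib>
#include <new>

static void* myAlloc(std::size_t size) { return std::malloc(size); }
static void  myFree(void* p)           { std::free(p); }

void* operator new(std::size_t size)
{
    if (size == 0)                // operator new must return a unique pointer
        size = 1;                 // even for zero-byte requests
    if (void* p = myAlloc(size))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept
{
    myFree(p);
}

void* operator new[](std::size_t size)    { return operator new(size); }
void  operator delete[](void* p) noexcept { operator delete(p); }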

I dare say that none of this is as important as when I first wrote the 
Interbase memory allocator and machines had a max of 3 MB physical 
memory.  Ann still says that I'm a victim of the bit depression.

