I'd like your opinion, guys, on two features I implemented in an attempt to
greatly reduce the number of memory allocations without major surgery in the

The features are:
1. Custom STL allocator, which allocates the first N items from within the
STL container itself. It is a semi-transparent replacement for the standard
allocator: you just need to replace std::map with ceph_map, for example.
Limitations: a) Breaks move semantics. b) No deallocation is implemented, so
it is not a good fit for big, long-lived containers.
2. Placement allocator, which allows chained allocation of shorter-lived
objects from longer-lived ones. An example would be allocating finish
contexts from an aio completion context.
Limitations: a) May require some code rearrangement to avoid concurrent
deallocations; otherwise the deallocation code has to use synchronization,
which limits performance. b) Same as b) above: no deallocation is
implemented.
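
To make feature 1 more concrete, here is a minimal sketch of the idea. It is
not the actual ceph_map code: inline_arena, inline_allocator and small_map
are names made up for this example, the inline buffer sits in a small wrapper
next to the std::map, and it assumes node alignment does not exceed
alignof(std::max_align_t).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical bump-pointer buffer: hands out bytes once and never reuses them.
template <std::size_t Bytes>
class inline_arena {
  alignas(std::max_align_t) unsigned char buf_[Bytes];
  std::size_t used_ = 0;
public:
  inline_arena() = default;
  inline_arena(const inline_arena&) = delete;             // not copyable or movable
  inline_arena& operator=(const inline_arena&) = delete;

  void* allocate(std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    n = (n + a - 1) / a * a;                  // round up to keep alignment
    if (n > Bytes - used_) return nullptr;    // buffer exhausted
    void* p = buf_ + used_;
    used_ += n;
    return p;
  }
  bool owns(const void* p) const {
    std::less<const unsigned char*> lt;       // total order for unrelated pointers
    const unsigned char* c = static_cast<const unsigned char*>(p);
    return !lt(c, buf_) && lt(c, buf_ + Bytes);
  }
  std::size_t used() const { return used_; }
};

// Hypothetical allocator: inline buffer first, heap once the buffer is full.
template <class T, std::size_t Bytes>
class inline_allocator {
  inline_arena<Bytes>* arena_;
public:
  using value_type = T;
  template <class U> struct rebind { using other = inline_allocator<U, Bytes>; };

  explicit inline_allocator(inline_arena<Bytes>& a) noexcept : arena_(&a) {}
  template <class U>
  inline_allocator(const inline_allocator<U, Bytes>& o) noexcept : arena_(o.arena()) {}

  T* allocate(std::size_t n) {
    if (void* p = arena_->allocate(n * sizeof(T))) return static_cast<T*>(p);
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) noexcept {
    if (!arena_->owns(p)) ::operator delete(p);  // inline memory is never reclaimed
  }
  inline_arena<Bytes>* arena() const noexcept { return arena_; }
};
template <class T, class U, std::size_t B>
bool operator==(const inline_allocator<T, B>& a, const inline_allocator<U, B>& b) {
  return a.arena() == b.arena();
}
template <class T, class U, std::size_t B>
bool operator!=(const inline_allocator<T, B>& a, const inline_allocator<U, B>& b) {
  return !(a == b);
}

// Hypothetical wrapper keeping buffer and map together; it cannot be copied or moved.
template <class K, class V, std::size_t Bytes = 1024>
struct small_map {
  using alloc_t = inline_allocator<std::pair<const K, V>, Bytes>;
  inline_arena<Bytes> arena;                      // declared first, destroyed last
  std::map<K, V, std::less<K>, alloc_t> items;
  small_map() : items(std::less<K>(), alloc_t(arena)) {}
};

// Usage: the first nodes land in the inline buffer, later ones on the heap.
inline bool small_map_demo() {
  small_map<int, int, 256> m;
  m.items[1] = 10;
  const bool inline_used = m.arena.used() > 0 && m.arena.owns(&*m.items.begin());
  for (int i = 2; i < 100; ++i) m.items[i] = i * 10;  // overflows to the heap
  return inline_used && m.items.size() == 99 && m.items.at(42) == 420;
}
```

Both limitations show up directly in the sketch: deallocate() does nothing for
inline memory, and the wrapper is neither copyable nor movable because the
nodes point into its own buffer.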

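The chained allocation in feature 2 could look roughly like the sketch below.
Again this is not the real implementation: placement_arena, create_in,
destroy_in, and the finish_ctx and aio_completion structs are hypothetical
stand-ins, and the code is single-threaded, so it sidesteps the
synchronization issue mentioned in a).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <utility>

// Hypothetical scratch area embedded in a long-lived object.
template <std::size_t Bytes>
class placement_arena {
  alignas(std::max_align_t) unsigned char buf_[Bytes];
  std::size_t used_ = 0;
public:
  void* allocate(std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    n = (n + a - 1) / a * a;                  // round up to keep alignment
    if (n > Bytes - used_) return nullptr;    // no room left in the parent
    void* p = buf_ + used_;
    used_ += n;
    return p;
  }
  bool owns(const void* p) const {
    std::less<const unsigned char*> lt;       // total order for unrelated pointers
    const unsigned char* c = static_cast<const unsigned char*>(p);
    return !lt(c, buf_) && lt(c, buf_ + Bytes);
  }
};

// Construct a T inside the parent's arena, or on the heap if the arena is full.
template <class T, std::size_t Bytes, class... Args>
T* create_in(placement_arena<Bytes>& arena, Args&&... args) {
  void* p = arena.allocate(sizeof(T));
  const bool on_heap = (p == nullptr);
  if (on_heap) p = ::operator new(sizeof(T));
  try {
    return ::new (p) T(std::forward<Args>(args)...);
  } catch (...) {
    if (on_heap) ::operator delete(p);        // do not leak if the constructor throws
    throw;
  }
}

// Destroy a T made by create_in; arena memory is not reclaimed.
template <class T, std::size_t Bytes>
void destroy_in(placement_arena<Bytes>& arena, T* obj) {
  obj->~T();
  if (!arena.owns(obj)) ::operator delete(obj);
}

// Hypothetical short-lived child and long-lived parent.
struct finish_ctx {
  int result;
  explicit finish_ctx(int r) : result(r) {}
};
struct aio_completion {
  placement_arena<8 * alignof(std::max_align_t)> arena;  // room for 8 small children
};

// Usage: eight contexts fit inside the completion, the ninth falls back to the heap.
inline bool placement_demo() {
  aio_completion comp;
  finish_ctx* ctx[9];
  for (int i = 0; i < 9; ++i) ctx[i] = create_in<finish_ctx>(comp.arena, i);
  const bool first_inline = comp.arena.owns(ctx[0]);
  const bool last_on_heap = !comp.arena.owns(ctx[8]);
  const bool values_ok = ctx[0]->result == 0 && ctx[8]->result == 8;
  for (finish_ctx* c : ctx) destroy_in(comp.arena, c);
  return first_inline && last_on_heap && values_ok;
}
```

The children must be destroyed before the parent goes away, which is why this
only suits objects whose lifetime is strictly shorter than the owner's.
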
Performance results for 32 threads in a synthetic test, given as the ratio of
std allocator time to custom allocator time (higher means the custom
allocator is faster):
         stlalloc                        stl+placement alloc
block    jemalloc  tcmalloc  ptmalloc    jemalloc  tcmalloc  ptmalloc
1M       1298.01   650.66    137.64      735.49    824.45    9.62
64K      514.84    2.82      304.62      570.74    4.85      12.21
32K      838.89    2.17      5.03        1600.5    7.43      8.28
4K       2.76      1.99      4.98        4.36      5.3       8.23
32B      2.67      5.09      3.69        4.41      8.48      6.4
(100M test iterations for 32B and 4K, 2M for 32K and 64K, 200K for 1M)

I didn't see any performance improvement in a 100% write fio test, but it may
still shine in other workloads or once the proper classes are replaced.
Let me know if it is worth submitting them as PRs.

STL allocator: 
STL allocator usage example:
Placement allocator:
Placement allocator usage example:

