[ https://issues.apache.org/jira/browse/CASSANDRA-6794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595443#comment-14595443 ]
Arun Chaitanya Miriappalli commented on CASSANDRA-6794:
-------------------------------------------------------

I completely understand that "large numbers of CFs" is an anti-pattern. Unfortunately, our use case requires many CFs. We have now settled on the following approach: use off-heap memory.

Modifications to default cassandra.yaml and cassandra-env.sh
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* memory_allocator: JEMallocAllocator (https://issues.apache.org/jira/browse/CASSANDRA-7883)
* memtable_allocation_type: offheap_objects
  With these two settings, slab allocation (https://issues.apache.org/jira/browse/CASSANDRA-5935), which requires 1MB of heap memory per table, is disabled. The memory for table metadata, caches and memtables is thus allocated natively and does not affect GC performance.
* tombstone_failure_threshold: 100000000
  Without this, C* throws TombstoneOverwhelmingException during startup. This setting looks problematic, so I would like to know why just creating tables produces so many tombstones.
* -XX:+UseG1GC
  This reduces GC time. Without it, we observed full GCs longer than 1s.

We created 5000 column families with about 1000 entries per column family. Read/write performance seems to be stable and comparable. The only problem we saw is startup time.

No. of CFs                  500 (on heap)    5000 (off heap)
Cassandra start time (s)    20               349
Average CPU usage (%)       40               49.65
GC activity (%)             2.6              0.6

I would like to know whether any problems are foreseen with this setup in a production environment. Sorry if this is not the right place to ask this question.

> Optimise slab allocator to enable higher number of column families
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-6794
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6794
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jeremy Hanna
>            Priority: Minor
>
> Currently the slab allocator allocates 1MB per column family. This has been
> very beneficial for gc efficiency. However, it makes it more difficult to
> have large numbers of column families.
> It would be preferable to have a more intelligent way to allocate slabs so
> that there is more flexibility between slab allocator and non-slab allocator
> behaviour.
> A simple first step is to ramp up size of slabs from small (say 8KB) when
> empty, to 1MB after a few slabs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
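
For readers who want to see the arithmetic behind the "ramp up" idea in the quoted issue, here is a minimal sketch. It is not Cassandra's allocator. The names slab_size and baseline_heap are hypothetical, doubling the slab size on each allocation is an assumption (the issue only says "ramp up"), and 1MB/8KB are treated as 1 MiB/8 KiB.

```python
# Hypothetical sketch of the slab ramp-up idea; not Cassandra code.
KIB = 1024
MIB = 1024 * KIB

MIN_SLAB = 8 * KIB   # "say 8KB" from the issue description
MAX_SLAB = 1 * MIB   # the fixed per-column-family slab size today


def slab_size(slab_index: int) -> int:
    """Size in bytes of the n-th slab (0-based) for one column family.

    Assumption: each new slab is twice the previous one, capped at 1 MiB.
    """
    return min(MIN_SLAB << slab_index, MAX_SLAB)


def baseline_heap(num_tables: int, first_slab: int) -> int:
    """Heap in bytes reserved when every table holds exactly one slab."""
    return num_tables * first_slab


if __name__ == "__main__":
    tables = 5000  # the table count used in the comment above
    # Slab sizes in KiB for the first nine slabs of a single table.
    print([slab_size(i) // KIB for i in range(9)])
    # Baseline heap in MiB with fixed 1 MiB slabs vs. 8 KiB first slabs.
    print(baseline_heap(tables, MAX_SLAB) / MIB)
    print(baseline_heap(tables, MIN_SLAB) / MIB)
```

Under these assumptions, 5000 mostly empty tables reserve 5000 MiB of heap with fixed 1 MiB slabs but about 39 MiB when the first slab is 8 KiB, and a busy table reaches the full 1 MiB slab size on its eighth slab.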