>
>
> memtable_offheap_space_in_mb: 4096
>
> memtable_cleanup_threshold: 0.99
>

^ What led to this setting? You are effectively telling Cassandra not to
flush the highest-traffic memtable until the memtable space is 99% full.
With that many tables and keyspaces, you are locking up everything on the
flush queue and creating substantial back pressure. If you run 'nodetool
tpstats' you will probably see a massive 'All Time Blocked' count for
FlushWriter and a large 'Dropped' count for Mutations.

Incidentally, this is probably why you are seeing a lot of small SSTables:
commit log segments are filling up but are blocked from flushing by the
above, so they repeatedly flush whatever is there whenever they get the
chance.

> thrift_framed_transport_size_in_mb: 150
>

^ This is also a very bad idea. Thrift buffers grow as needed to
accommodate larger results, but they never shrink. You will end up with a
bunch of open connections holding onto large, mostly empty byte arrays,
which will show up immediately in a heap dump inspection.
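The grow-but-never-shrink behavior can be sketched like this (an
illustration of the pattern, not Thrift's actual buffer code):

```python
# Illustrative sketch (not Thrift's real implementation) of a grow-only
# frame buffer: capacity expands to fit the largest response ever seen
# on a connection, and is never released until the connection closes.

class GrowOnlyBuffer:
    def __init__(self):
        self.buf = bytearray()

    def ensure_capacity(self, needed):
        if needed > len(self.buf):
            # Grow to fit the new frame; never shrink afterwards.
            self.buf.extend(b"\x00" * (needed - len(self.buf)))

conn = GrowOnlyBuffer()
conn.ensure_capacity(150 * 1024 * 1024)  # one 150 MB frame...
conn.ensure_capacity(1024)               # ...then only tiny requests
print(len(conn.buf))                     # still 157286400 bytes held
```

Multiply that by every open client connection and the heap pressure adds
up quickly.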


> concurrent_compactors: 4
>
> compaction_throughput_mb_per_sec: 0
>
> endpoint_snitch: GossipingPropertyFileSnitch
>
>
>
> This grinds our system to a halt and causes a major GC nearly every second.
>
>
>
> So far the only way to get around this is to run a cron job every hour
> that does a “nodetool compact”.
>

What's the output of 'nodetool compactionstats'? CASSANDRA-9882
and CASSANDRA-9592 could be to blame (both fixed in recent versions), or
this could simply be a side effect of the memory pressure from the above
settings.

Go back to the default settings (except the snitch - GPFS is always a good
place to start) and change settings serially and in small increments,
based on feedback gleaned from monitoring.
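For reference, the stock values these settings would fall back to look
roughly like this - double-check them against the cassandra.yaml shipped
with your exact version, since defaults move between releases:

```yaml
# Approximate stock defaults (verify against your version's cassandra.yaml)
# memtable_offheap_space_in_mb: unset -> defaults to 1/4 of the heap
# memtable_cleanup_threshold: unset -> 1 / (memtable_flush_writers + 1)
thrift_framed_transport_size_in_mb: 15
# concurrent_compactors: unset -> min(number of disks, number of cores)
compaction_throughput_mb_per_sec: 16
endpoint_snitch: GossipingPropertyFileSnitch
```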


-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
