I do understand how MaxTenuringThreshold works; thanks for your evaluation, though.
I don't think you saw my complete post with the values I used for the heap size and *memtable_total_space_in_mb=2048*, which is less than half of the young generation space I am using. Additionally, *memtable_flush_queue_size=1*, so there are not many memtables in memory. Coupled with the fact that I am writing to Cassandra with 20 threads, ParNewGC should collect pretty much all of the objects, *which is what it is doing now*.

There were only 2 CMS collections in 15 minutes when running at full capacity. What I am now concerned about is that the CMS-remark phase takes about 70 ms, and that is something I am looking to bring down. There still seem to be valuable pointers in *Cassandra-8150*, which I am going to try.

On Fri, Apr 10, 2015 at 7:26 PM, [email protected] <[email protected]> wrote:

> MaxTenuringThreshold is low as i think most of the objects should be
> ephemeral with only writes.
>
> You don't understand how *MaxTenuringThreshold* works. If you keep it
> low, then GC will move objects that are still "alive" to old gen space.
> Yes, they are ephemeral, but C* will keep them until flushed to disk. So,
> again, you should balance *heap space*, *memtable_total_space_in_mb*,
> *memtable_cleanup_threshold* and your *disk_throughput* to get rid of
> memtables as soon as possible. If *memtable_total_space_in_mb* is large
> and young gen is large too, then you have to increase MaxTenuringThreshold,
> to keep CMS off of moving data to old gen.
> If you are sure that young gen is not filled so fast, then you can increase
> *CMSWaitDuration* to avoid useless calls of CMS.
>
> On 04/10/2015 03:42 PM, Anishek Agarwal wrote:
>
> Sorry i forgot to update but i am not using "CMSIncrementalMode" anymore
> as it overrides "UseCMSInitiatingOccupancyOnly".
>
> @Graham : thanks for the "CMSParallelInitialMarkEnabled" and
> "CMSEdenChunksRecordAlways" i havent used them, i will try it. My initial
> mark is only around 6ms though.
>
> With my current config (incorporating the changes above), I have
> been able to reduce the number of CMS runs significantly now and mostly
> ParNewGC is running but when CMS triggers it takes a lot of time for Remark
> hence started using -XX:+CMSParallelRemarkEnabled which gave some
> improvement. This is still around 70 ms.
>
> MaxTenuringThreshold is low as i think most of the objects should be
> ephemeral with only writes.
>
> @Sebastian : I started from that Issue :), though i havent tried the GC
> affinity ones as of yet still. Thanks for the link!
>
> Thanks
> anishek
>
> On Fri, Apr 10, 2015 at 5:49 PM, Sebastian Estevez <[email protected]> wrote:
>
>> Did you check out Cassandra-8150?
>> On Apr 10, 2015 7:04 AM, "Anishek Agarwal" <[email protected]> wrote:
>>
>>> Hey,
>>>
>>> Any reason you think the MaxTenuringThreshold should be increased. I
>>> am pumping data at the full capacity that a single node seems to take so
>>> all the data becomes stale soon enough (when its flushed), additionally the
>>> whole memtable can be in young generation only. There seems to be enough
>>> additional space to even hold the bloom filters for the respective
>>> SSTables i would guess.
>>>
>>> I will try with the CMSWaitDuration that should help in reducing the
>>> CMS initial mark phase i think.
>>>
>>> Though i am not sure what is getting moved to old generation
>>> continuously to fill it ?
>>>
>>> Thanks for the pointers.
>>>
>>> On Fri, Apr 10, 2015 at 12:12 PM, [email protected] <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> You should increase *MaxTenuringThreshold* and *CMSWaitDuration* to
>>>> keep your data in young generation longer (until the data will be flushed
>>>> to disk).
>>>> Depending on your load, combine values of the next parameters:
>>>> *HEAP_NEWSIZE*, *memtable_total_space_in_mb*,
>>>> *memtable_cleanup_threshold* and your *disk_throughput*.
>>>> Ideally, only ParNewGC will work to collect ephemeral objects, and it
>>>> will take very short delays.
>>>>
>>>>
>>>> On 04/09/2015 09:30 AM, Anishek Agarwal wrote:
>>>>
>>>> Hello,
>>>>
>>>> We have only one CF as
>>>>
>>>> CREATE TABLE t1(id bigint, ts timestamp, definition text, primary key
>>>> (id, ts))
>>>> with clustering order by (ts desc) and gc_grace_seconds=0
>>>> and compaction = {'class': 'DateTieredCompactionStrategy',
>>>> 'timestamp_resolution':'SECONDS', 'base_time_seconds':'20',
>>>> 'max_sstable_age_days':'30'}
>>>> and compression={'sstable_compression' : ''};
>>>>
>>>> on a single Node using the following in
>>>>
>>>> cassandra.yaml:::::
>>>> memtable_total_space_in_mb: 2048
>>>> commitlog_total_space_in_mb: 4096
>>>> memtable_flush_writers: 2
>>>> memtable_flush_queue_size: 1
>>>>
>>>> cassandra-env.sh ::::
>>>> MAX_HEAP_SIZE="8G"
>>>> HEAP_NEWSIZE="5120M"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>>>> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=6"
>>>> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
>>>> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>>>> JVM_OPTS="$JVM_OPTS -XX:MaxPermSize=256m"
>>>> JVM_OPTS="$JVM_OPTS -XX:+AggressiveOpts"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalPacing"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps -verbose:gc"
>>>> JVM_OPTS="$JVM_OPTS -Xloggc:/home/anishek/apache-cassandra-2.0.13/logs/gc.log"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
>>>>
>>>>
>>>> I am writing via 20 threads continuously to this table above.
>>>> I see that some data keeps moving from the young generation to the
>>>> older generation continuously.
>>>>
>>>> I am wondering why this is happening. Given i am writing constantly
>>>> and my young generation is more than twice the max mem table space used i
>>>> would think only the young generation space would be used and nothing would
>>>> ever go old generation.
>>>>
>>>> ** System.log show no compactions happening.
>>>> ** There are no read operations.
>>>> ** Cassandra version 2.0.13 on centos with 16 cores and 16 GB Ram
>>>>
>>>> Thanks
>>>> Anishek
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Serj
>>>>
>
> --
> Thanks,
> Serj
>
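
For reference, here is a rough sketch of the generation sizes implied by the settings quoted in this thread. It assumes the JVM splits the young generation as eden : survivor = SurvivorRatio : 1 with two survivor spaces, and it ignores any alignment rounding the JVM may apply, so the real figures in the GC log may differ slightly. The variable names are only for illustration; the numbers come from the cassandra-env.sh and cassandra.yaml above.

```shell
#!/bin/sh
# Values copied from the configuration quoted in the thread.
MAX_HEAP_MB=8192      # MAX_HEAP_SIZE="8G"
NEW_MB=5120           # HEAP_NEWSIZE="5120M"
SURVIVOR_RATIO=6      # -XX:SurvivorRatio=6
CMS_FRACTION=70       # -XX:CMSInitiatingOccupancyFraction=70
MEMTABLE_MB=2048      # memtable_total_space_in_mb

# Assumption: young gen = eden + 2 survivor spaces,
# with eden = SurvivorRatio * one survivor space.
SURVIVOR_MB=$((NEW_MB / (SURVIVOR_RATIO + 2)))
EDEN_MB=$((SURVIVOR_MB * SURVIVOR_RATIO))

# Old gen is whatever remains of the heap after the young gen.
OLD_MB=$((MAX_HEAP_MB - NEW_MB))

# Old gen occupancy at which CMS is started (integer division).
CMS_TRIGGER_MB=$((OLD_MB * CMS_FRACTION / 100))

echo "eden=${EDEN_MB}M survivor=${SURVIVOR_MB}M (x2) old=${OLD_MB}M"
echo "CMS starts at roughly ${CMS_TRIGGER_MB}M of old gen occupancy"
echo "memtable space=${MEMTABLE_MB}M vs one survivor space=${SURVIVOR_MB}M"
```

If these numbers hold, a single survivor space is much smaller than memtable_total_space_in_mb, so memtable data that is still live during a ParNew collection may not fit in a survivor space and could be promoted to the old generation early. That is an inference from the arithmetic only; the output of -XX:+PrintTenuringDistribution in gc.log would be the place to confirm or rule it out.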
