Re: Heap memory usage while writing

Anishek Agarwal Sun, 12 Apr 2015 21:17:22 -0700

I do understand how MaxTenuringThreshold works, thanks for your evaluation
though.


I don't think you saw my complete post with the values I used for the
heap size and *memtable_total_space_in_mb=2048*, which is less than half
the young generation space I am using. Additionally,
*memtable_flush_queue_size=1*, so there are not many memtables in memory.
Coupled with the fact that I am writing to Cassandra with 20 threads,
ParNewGC should be able to collect pretty much all of the objects,
*which is what it is doing now.*
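
As a rough check on the sizing argument above, the generation sizes implied
by the posted settings can be worked out with shell arithmetic. This is only
a sketch: it assumes the usual HotSpot layout in which the young generation
is eden plus two survivor spaces and SurvivorRatio is eden divided by one
survivor space. The variable names are made up for illustration.

```shell
#!/bin/sh
# Values taken from the cassandra-env.sh and cassandra.yaml quoted in this thread.
max_heap_mb=8192        # MAX_HEAP_SIZE="8G"
young_mb=5120           # HEAP_NEWSIZE="5120M"
survivor_ratio=6        # -XX:SurvivorRatio=6
cms_fraction=70         # -XX:CMSInitiatingOccupancyFraction=70
memtable_mb=2048        # memtable_total_space_in_mb

# Assumption: young = eden + 2 survivors, and SurvivorRatio = eden / survivor.
survivor_mb=$(( young_mb / (survivor_ratio + 2) ))
eden_mb=$(( survivor_mb * survivor_ratio ))
old_mb=$(( max_heap_mb - young_mb ))
cms_trigger_mb=$(( old_mb * cms_fraction / 100 ))

echo "eden=${eden_mb}M survivor=${survivor_mb}M (each) old=${old_mb}M"
echo "CMS starts at roughly ${cms_trigger_mb}M of old gen occupancy"
echo "memtable limit=${memtable_mb}M"
```

Under these assumptions the memtable limit fits inside the young generation
as a whole but is larger than a single survivor space. Whether that matters
for promotion in practice would have to be confirmed from the
-XX:+PrintTenuringDistribution output in the GC log.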

There were only 2 CMS collections in 15 minutes when running at full
capacity. What I am now concerned about is that the CMS-remark phase takes
about 70 ms, and that is something I am looking to bring down. There still
seem to be valuable pointers in *Cassandra-8150*, which I am going to try.
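
The flags discussed in this thread for shortening the CMS pauses could be
added to cassandra-env.sh along these lines. This is a sketch, not a tested
recommendation. CMSParallelInitialMarkEnabled and CMSEdenChunksRecordAlways
are the ones Graham suggested, and CMSWaitDuration is the one Serj
suggested. CMSScavengeBeforeRemark is an additional HotSpot option that is
not mentioned in the thread and is assumed here to be worth trying for
remark time. The 10000 ms wait value is purely illustrative.

```shell
#!/bin/sh
# Sketch of additions to cassandra-env.sh; JVM_OPTS is normally already set there.
JVM_OPTS="${JVM_OPTS:-}"

# Suggested earlier in the thread (availability depends on the JDK build).
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:+CMSEdenChunksRecordAlways"

# Assumption: 10000 ms is an illustrative value, not one given in the thread.
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=10000"

# Assumption: not mentioned in the thread; runs a young collection before
# remark so that remark has less of the young generation to scan.
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"

echo "$JVM_OPTS"
```

Each change is best tried one at a time, comparing the CMS-remark times in
gc.log before and after.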



On Fri, Apr 10, 2015 at 7:26 PM, [email protected] <[email protected]>
wrote:

>
> MaxTenuringThreshold is low as i think most of the objects should be
> ephemeral with only writes.
>
>  You don't understand how *MaxTenuringThreshold* works. If you keep it
> low, then GC will move objects that are still "alive" to old gen space.
> Yes, they are ephemeral, but C* will keep them until they are flushed to
> disk. So, again, you should balance *heap space*,
> *memtable_total_space_in_mb*, *memtable_cleanup_threshold* and your
> *disk_throughput* to get rid of memtables as soon as possible. If
> *memtable_total_space_in_mb* is large and young gen is large too, then you
> have to increase MaxTenuringThreshold to keep GC from moving data to old
> gen.
> If you are sure that young gen does not fill up that fast, then you can
> increase *CMSWaitDuration* to avoid useless CMS runs.
>
>
>
> On 04/10/2015 03:42 PM, Anishek Agarwal wrote:
>
> Sorry, I forgot to update, but I am not using "CMSIncrementalMode" anymore
> as it overrides "UseCMSInitiatingOccupancyOnly".
>
> @Graham : thanks for "CMSParallelInitialMarkEnabled" and
> "CMSEdenChunksRecordAlways". I haven't used them, I will try them. My
> initial mark is only around 6 ms though.
>
>  With my current config (incorporating the changes above), I have been
> able to reduce the number of CMS runs significantly, and mostly ParNewGC
> is running now. But when CMS triggers, Remark takes a lot of time, hence I
> started using -XX:+CMSParallelRemarkEnabled, which gave some improvement.
> Remark is still around 70 ms.
>
>  MaxTenuringThreshold is low as I think most of the objects should be
> ephemeral with only writes.
>
>  @Sebastian : I started from that issue :), though I haven't tried the GC
> affinity ones yet. Thanks for the link!
>
>  Thanks
> anishek
>
>
> On Fri, Apr 10, 2015 at 5:49 PM, Sebastian Estevez <
> [email protected]> wrote:
>
>> Did you check out Cassandra-8150?
>>  On Apr 10, 2015 7:04 AM, "Anishek Agarwal" <[email protected]> wrote:
>>
>>> Hey,
>>>
>>>  Any reason you think MaxTenuringThreshold should be increased? I am
>>> pumping data at the full capacity that a single node seems to take, so
>>> all the data becomes stale soon enough (when it's flushed). Additionally,
>>> the whole memtable can fit in the young generation alone. I would guess
>>> there is even enough additional space to hold the bloom filters for the
>>> respective SSTables.
>>>
>>>  I will try CMSWaitDuration; I think that should help in reducing the
>>> CMS initial mark phase.
>>>
>>>  Though I am not sure what is continuously getting moved to the old
>>> generation to fill it?
>>>
>>>  Thanks for the pointers.
>>>
>>> On Fri, Apr 10, 2015 at 12:12 PM, [email protected] <[email protected]>
>>> wrote:
>>>
>>>>  Hi,
>>>>
>>>> You should increase *MaxTenuringThreshold* and *CMSWaitDuration* to
>>>> keep your data in the young generation longer (until the data is flushed
>>>> to disk).
>>>> Depending on your load, tune the following parameters together:
>>>> *HEAP_NEWSIZE*, *memtable_total_space_in_mb*,
>>>> *memtable_cleanup_threshold* and your *disk_throughput*.
>>>> Ideally, only ParNewGC will run to collect ephemeral objects, and its
>>>> pauses will be very short.
>>>>
>>>>
>>>> On 04/09/2015 09:30 AM, Anishek Agarwal wrote:
>>>>
>>>> Hello,
>>>>
>>>>  We have only one CF, defined as
>>>>
>>>>  CREATE TABLE t1(id bigint, ts timestamp, definition text, primary key
>>>> (id, ts))
>>>> with clustering order by (ts desc) and gc_grace_seconds=0
>>>> and compaction = {'class': 'DateTieredCompactionStrategy',
>>>> 'timestamp_resolution':'SECONDS', 'base_time_seconds':'20',
>>>> 'max_sstable_age_days':'30'}
>>>> and compression={'sstable_compression' : ''};
>>>>
>>>>  on a single Node using the following in
>>>>
>>>>  cassandra.yaml:::::
>>>>  memtable_total_space_in_mb: 2048
>>>>  commitlog_total_space_in_mb: 4096
>>>>  memtable_flush_writers: 2
>>>>  memtable_flush_queue_size: 1
>>>>
>>>>  cassandra-env.sh ::::
>>>>  MAX_HEAP_SIZE="8G"
>>>> HEAP_NEWSIZE="5120M"
>>>>  JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>>>> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=6"
>>>> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
>>>> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB"
>>>> JVM_OPTS="$JVM_OPTS -XX:MaxPermSize=256m"
>>>> JVM_OPTS="$JVM_OPTS -XX:+AggressiveOpts"
>>>> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalMode"
>>>> JVM_OPTS="$JVM_OPTS -XX:+CMSIncrementalPacing"
>>>>  JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps -verbose:gc"
>>>>  JVM_OPTS="$JVM_OPTS
>>>> -Xloggc:/home/anishek/apache-cassandra-2.0.13/logs/gc.log"
>>>>  JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
>>>> JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
>>>>
>>>>
>>>>  I am writing continuously via 20 threads to the table above.
>>>> I see that some data keeps moving from the young generation to the
>>>> old generation.
>>>>
>>>>  I am wondering why this is happening. Given that I am writing
>>>> constantly and my young generation is more than twice the maximum
>>>> memtable space, I would think only the young generation would be used
>>>> and nothing would ever go to the old generation.
>>>>
>>>>  ** System.log shows no compactions happening.
>>>> ** There are no read operations.
>>>>  ** Cassandra version 2.0.13 on CentOS with 16 cores and 16 GB RAM
>>>>
>>>>  Thanks
>>>> Anishek
>>>>
>>>>
>>>>   --
>>>> Thanks,
>>>> Serj
>>>>
>>>>
>>>
>
> --
> Thanks,
> Serj
>
>
