> How does compaction_throughput relate to memory usage?  
It reduces the rate of memory allocation: compaction reads rows onto the heap as it 
works, so throttling compaction also throttles that allocation.
e.g. say normally ParNew can keep up with the allocation rate without pausing for 
too long, so the promotion rate is lowish and everything is allocated in Eden. If 
the allocation rate goes up, ParNew runs more often and objects that don't really 
need to be there get promoted into the tenured generation.
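
The throttle itself is compaction_throughput_mb_per_sec in cassandra.yaml. From 
memory the stock 1.1 default is 16, and it can also be changed at runtime; the 
values below are only illustrative:

    # cassandra.yaml
    compaction_throughput_mb_per_sec: 16

    # or at runtime, no restart needed
    nodetool -h localhost setcompactionthroughput 8

Setting it to 0 disables the throttle entirely.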

>  I assumed that was more for IO tuning.  I noticed that lowering 
> concurrent_compactors to 4 (from default of 8) lowered the memory used during 
> compactions.
Similar to the above: fewer concurrent compactors may reduce the number of rows 
held in memory at any instant for compaction.

Only rows smaller than in_memory_compaction_limit_in_mb are compacted entirely in 
memory, so reducing that limit may reduce memory usage during compaction.
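
For reference, both knobs live in cassandra.yaml; something like this (defaults 
quoted from memory, check your own yaml):

    concurrent_compactors: 4              # default is based on cores / disks
    in_memory_compaction_limit_in_mb: 64  # larger rows use the slower two-pass path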

>  Since then I've reduced the TTL to 1 hour and set gc_grace_seconds to 0 so 
> the number of rows and data dropped to a level it can handle.
Cool. Sorry it took so long to get there. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/10/2012, at 8:08 AM, Bryan Talbot <btal...@aeriagames.com> wrote:

> On Thu, Oct 25, 2012 at 4:15 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> This sounds very much like "my heap is so consumed by (mostly) bloom
>> filters that I am in steady state GC thrash."
>> 
>> Yes, I think that was at least part of the issue.
> 
> The rough numbers I've used to estimate working set are:
> 
> * bloom filter size for 400M rows at 0.00074 fp without java fudge (they are 
> just a big array) 714 MB
> * memtable size 1024 MB 
> * index sampling:
>       *  24 bytes + key (16 bytes for UUID) = 32 bytes 
>       * 400M / 128 default sampling = 3,125,000
>       *  3,125,000 * 32 = 95 MB
>       * java fudge X5 or X10 = 475MB to 950MB
> * ignoring row cache and key cache
>  
> So the high side number is 2,213 to 2,688 MB. High because the fudge is a 
> delicious sticky guess and the memtable space would rarely be full. 
> 
> On a 5120 MB heap, with 800 MB new, you have roughly 4300 MB tenured (some 
> goes to perm) and 75% of that is 3,225 MB. Not terrible, but it depends on the 
> working set and how quickly stuff gets tenured, which depends on the 
> workload. 
> 
> These values seem reasonable and in line with what I was seeing.  There are 
> other CFs and apps sharing this cluster, but this one was the largest.  
> 
> You can confirm these guesses somewhat manually by enabling all the GC 
> logging in cassandra-env.sh. Restart the node and let it operate normally, 
> probably best to keep repair off.
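> 
> From memory the relevant lines are already in cassandra-env.sh, just commented 
> out; roughly (the log path is only an example):
> 
>     JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
>     JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
>     JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
> 
> The tenuring distribution output is the interesting part for seeing how quickly 
> things are being promoted.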
> 
> I was using jstat to monitor gc activity and some snippets from that are in 
> my original email in this thread.  The key behavior was that full gc was 
> running pretty often and never able to reclaim much (if any) space.
> 
> There are a few things you could try:
> 
> * increase the JVM heap by say 1Gb and see how it goes
> * increase bloom filter false positive,  try 0.1 first (see 
> http://www.datastax.com/docs/1.1/configuration/storage_configuration#bloom-filter-fp-chance)
>  
> * increase index_interval sampling in yaml.  
> * decreasing compaction_throughput and in_memory_compaction_limit can lessen 
> the additional memory pressure compaction adds. 
> * disable caches or ensure off heap caches are used.
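> 
> e.g. for the bloom filter and index sampling ones, roughly (syntax from memory, 
> and the CF name is just a placeholder):
> 
>     # cassandra-cli
>     update column family YourCF with bloom_filter_fp_chance = 0.1;
> 
>     # cassandra.yaml (default sampling is 128)
>     index_interval: 256
> 
> The bloom filter change only applies to newly written SSTables, so existing ones 
> keep their old filters until they are rewritten (e.g. by scrub or normal 
> compaction), and index_interval needs a restart to take effect.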
> 
> I've done several of these already in addition to changing the app to reduce 
> the number of rows retained.  How does compaction_throughput relate to memory 
> usage?  I assumed that was more for IO tuning.  I noticed that lowering 
> concurrent_compactors to 4 (from default of 8) lowered the memory used during 
> compactions.  in_memory_compaction_limit_in_mb seems to only be used for wide 
> rows and this CF didn't have any wider than in_memory_compaction_limit_in_mb. 
>  My multithreaded_compaction is still false.
> 
> Watching the gc logs and the cassandra log is a great way to get a feel for 
> what works in your situation. Also take note of any scheduled processing your 
> app does which may impact things, and look for poorly performing queries. 
> 
> Finally this book is a good reference on Java GC http://amzn.com/0137142528 
> 
> For my understanding, what was the average row size for the 400 million keys? 
> 
> The compacted row mean size for the CF is 8815 bytes (as reported by cfstats), but 
> that works out to much more than the real load per node I was seeing.  
> Each node had about 200GB of data for the CF with 4 nodes in the cluster and 
> RF=3.  At the time, the TTL for all columns was 3 days and gc_grace_seconds 
> was 5 days.  Since then I've reduced the TTL to 1 hour and set 
> gc_grace_seconds to 0 so the number of rows and data dropped to a level it 
> can handle.
> 
> 
> -Bryan
> 
