The live:serialized size ratio depends on what your data looks like
(small columns will be less efficient than large blobs), but using the
10x rule of thumb, your 128MB memtable works out to roughly 1GB live,
times (1 + memtable_flush_writers + memtable_flush_queue_size)
memtables that can exist in memory at once.
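
To put numbers on it, a rough back-of-the-envelope sketch (assuming the
10x ratio holds for your data and the 0.7 defaults of
memtable_flush_writers: 1 and memtable_flush_queue_size: 4; check your
cassandra.yaml):

    128MB serialized * 10                        ~= 1.3GB live per memtable
    1.3GB * (1 active + 1 flushing + 4 queued)   ~= 7.7GB

which is already enough to fill an 8GB heap from a single hot CF's
memtables alone.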

So the first thing I would do is drop the flush writers and flush queue
to 1 and 1.
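
(In cassandra.yaml, assuming the 0.7-era option names, that is:

    memtable_flush_writers: 1
    memtable_flush_queue_size: 1

followed by a rolling restart.)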

Then I would drop the max heap to 1GB and the memtable size to 8MB so
the heap dump is easier to analyze, let it OOM, and look at the dump
with Eclipse Memory Analyzer: http://www.eclipse.org/mat/
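
(A hedged sketch of what that looks like in practice; names may differ
slightly on your version, so treat these as illustrative. In
conf/cassandra-env.sh:

    MAX_HEAP_SIZE="1G"
    HEAP_NEWSIZE="200M"
    JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp"

and shrink the hot CF's memtable from cassandra-cli with something like:

    update column family YourHotCF with memtable_throughput = 8;

When the node OOMs you should get an .hprof under /var/tmp to open in
MAT.)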

On Sat, May 7, 2011 at 3:54 PM, Serediuk, Adam
<adam.sered...@serialssolutions.com> wrote:
> How much memory should a single hot CF with a 128MB memtable take
> during reads, with row and key caching disabled?
>
> Because I'm seeing the heap skyrocket from 3.5GB straight to max
> (regardless of heap size; 8GB and 24GB both do the same), at which
> point the JVM does nothing but full GCs and is unable to reclaim any
> meaningful amount of memory. Cassandra then becomes unusable.
>
> I see the same behavior with smaller memtables, e.g. 64MB.
>
> This happens well into the read operation and only on a small number
> of nodes in the cluster (1-4 out of a total of 60).
>
> Sent from my iPhone
>
> On May 6, 2011, at 22:45, "Jonathan Ellis" <jbel...@gmail.com> wrote:
>
>> You don't GC storm without legitimately having a too-full heap.  It's
>> normal to see occasional full GCs from fragmentation, but a full GC
>> compacts the heap and everything goes back to normal IF space was
>> actually freed up.
>>
>> You say you've played w/ memtable size but that would still be my bet.
>> Most people severely underestimate how much space this takes (10x in
>> memory over serialized size), which will bite you when you have lots
>> of CFs defined.
>>
>> Otherwise, force a heap dump after a full GC and take a look to see
>> what's referencing all the memory.
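>>
>> (One hedged way to grab that dump on JDK 6, assuming jmap is
>> available; the "live" option forces a full GC before writing:
>>
>>     jmap -dump:live,format=b,file=cassandra-heap.hprof <cassandra pid>
>>
>> then open the .hprof in MAT.)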
>>
>> On Fri, May 6, 2011 at 12:25 PM, Serediuk, Adam
>> <adam.sered...@serialssolutions.com> wrote:
>>> We're troubleshooting a memory usage problem during batch reads, and
>>> we've spent the last few days profiling and trying different GC
>>> settings. The symptom is that after a certain amount of time during
>>> reads, one or more nodes in the cluster exhibit extreme memory
>>> pressure followed by a GC storm. We've tried every JVM setting and GC
>>> method we can think of and the issue persists. This points towards
>>> something instantiating a lot of objects and holding references to
>>> them so that they can't be cleaned up.
>>>
>>> Typically nothing is ever logged other than the GC failures; however,
>>> just now one of the nodes emitted a log message we've never seen
>>> before:
>>>
>>>  INFO [ScheduledTasks:1] 2011-05-06 15:04:55,085 StorageService.java (line 
>>> 2218) Unable to reduce heap usage since there are no dirty column families
>>>
>>> We have tried increasing the heap on these nodes to large values,
>>> e.g. 24GB, and still run into the same issue. We normally run with an
>>> 8GB heap, and only one or two nodes ever exhibit this issue, at
>>> random. We don't use key/row caching, and our memtable sizing is
>>> 64MB/0.3. Larger or smaller memtables make no difference. We're on
>>> 0.7.5 with mmap, JNA, and JDK 1.6.0_24.
>>>
>>> We've somewhat hit a wall in troubleshooting, and any advice is
>>> greatly appreciated.
>>>
>>> --
>>> Adam
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
