On Dec 14, 2010, at 19:38, Peter Schuller wrote:

> For debugging purposes you may want to switch Cassandra to "standard"
> IO mode instead of mmap. This will have a performance-penalty, but the
> virtual/resident sizes won't be polluted with mmap():ed data.

Already did so. It *seems* to run more stably, but it is still far from
stable.

I have actually already put 100 million rows into a local Cassandra instance
(on OSX [and on rc1], not xen'ed Linux), so this is unlikely to be a problem
in Cassandra's Java code; it looks more like something related to native code
or the platform.

> In general, unless you're hitting something particularly strange or
> just a bug in Cassandra, you shouldn't be randomly getting OOM:s
> unless you are truly using that heap space. What do you mean by
> "always bound in compactionexecutor" - by what method did you
> determine this to be the case?

By taking heap dumps and analyzing them with MAT (http://www.eclipse.org/mat/).

> There should be no magic need for CPU. Unless you are severely taxing
> it in terms of very high write load or similar, an out-of-the-box
> configured cassandra should be needing limited amounts of memory. Did
> you run with default memtable thresholds (memtable_throughput_in_mb i

Yes

>> This is my only CF currently in use (via JMX):
>> 
>> - column_families:
>>  - column_type: Standard
>>    comment: tracking column family
>>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>>    gc_grace_seconds: 864000
>>    key_cache_save_period_in_seconds: 3600
>>    keys_cached: 200000.0
>>    max_compaction_threshold: 32
>>    memtable_flush_after_mins: 60
>>    min_compaction_threshold: 4
>>    name: tracking
>>    read_repair_chance: 1.0
>>    row_cache_save_period_in_seconds: 0
>>    rows_cached: 0.0
>>  name: test
>>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>>  replication_factor: 3
> 
> This is the only column family being used?

Currently, for testing, yes.

>> In addition...actually there is plenty of free memory on the heap (?):
>> 
>> 3605.478: [GC 3605.478: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:     416112 bytes,     416112 total
>> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 
>> 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), 
>> [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 
>> sys=0.00, real=1.94 secs]
>> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 
>> secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 
>> 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
> 
> 1.9 seconds to do [CMS: 1145267K->447565K(2054592K) is completely
> abnormal if that represents a pause (but not if it's just concurrent
> mark/sweep time). I don't quite recognize the format of this log...
> I'm suddenly unsure what this log output is coming from. A normal
> -XX:+PrintGC and -XX:+PrintGCDetails should yield stuff like:

I just uncommented the GC JVMOPTS in the start script shipped with Cassandra
and am using Sun JVM 1.6.0_23. Hmm, but the "GC tuning options" are
uncommented as well. I'll comment those out again and retry.
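
To put numbers on the "plenty of free memory" observation, here is a minimal
Python sketch that pulls the old-generation occupancy out of the CMS line in
the log quoted above. The regex and the helper names (occupancy, percent_used)
are my own, purely for illustration, and the script assumes the
"before->after(capacity)" layout seen in that log; other GC log formats would
need a different pattern.

```python
import re

# Matches HotSpot occupancy triples such as "1145267K->447565K(2054592K)".
# Assumption: sizes are always reported in K, as in the quoted log.
OCCUPANCY = re.compile(r"(\d+)K->(\d+)K\((\d+)K\)")


def occupancy(segment):
    """Return (before_kb, after_kb, capacity_kb) for the first triple found."""
    match = OCCUPANCY.search(segment)
    if match is None:
        raise ValueError("no occupancy triple found in: %r" % segment)
    return tuple(int(group) for group in match.groups())


def percent_used(used_kb, capacity_kb):
    """Occupancy as a percentage of the generation's capacity."""
    return 100.0 * used_kb / capacity_kb


# CMS (old generation) figures copied from the quoted log.
cms_segment = "[CMS: 1145267K->447565K(2054592K), 1.9143630 secs]"
before, after, capacity = occupancy(cms_segment)

print("old gen before collection: %.1f%%" % percent_used(before, capacity))
print("old gen after collection:  %.1f%%" % percent_used(after, capacity))
```

For the quoted CMS line this works out to roughly 56% of the old generation
in use before the collection and roughly 22% after it, which is why the heap
does not look exhausted to me.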
