It turns out there were several bugs that make 0.3 run out of memory
during sustained inserts.  These are fixed in trunk, which is almost
stable (#233 is the last disk-format change, and will be committed as
soon as review is done).

-Jonathan

On Mon, Aug 10, 2009 at 7:20 PM, Huming Wu <huming...@gmail.com> wrote:
> I am currently doing some tests on Cassandra (0.3.0-final): two nodes,
> each with 8 GB of RAM and 8 CPU cores. Here are some settings from my
> storage-conf.xml:
>
> <ReplicationFactor>2</ReplicationFactor>
> <ColumnIndexSizeInKB>256</ColumnIndexSizeInKB>
> <MemtableSizeInMB>1024</MemtableSizeInMB>
> <MemtableObjectCountInMillions>2</MemtableObjectCountInMillions>
>
> The test data I have has about 880K unique keys, and my test program
> simply inserts the same 5 columns into Table1.Standard1 using
> thrift.batch_insert. For each key, the record size ranges from 21
> bytes to 5 KB, with the average being 40 bytes. The program calls
> batch_insert repeatedly, 4 million times over 50 concurrent thrift
> connections (about 220 MB of data, excluding keys, is sent to
> Cassandra). What I see is that the Java resident memory basically
> grows to the GC limit (6 GB) and everything just halts after that. If
> I restart Cassandra, the footprint is around 1.9 GB and I can insert
> again, but the memory keeps growing, and so on. Here are my JVM settings:
>
> -Xmx6000m -Xms6000m -XX:+HeapDumpOnOutOfMemoryError -XX:NewSize=1000m
> -XX:MaxNewSize=1000m -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode -verbose:gc -XX:+PrintHeapAtGC
> -XX:+PrintGCDetails -Xloggc:gc.log
>
> Here is the jmap output (top 10 objects):
>
> num   #instances    #bytes  class name
> --------------------------------------
>
>  1:   1436005   658220304  [Ljava.lang.Object;
>  2:  12100491   484019640  java.lang.String
>  3:   9904511   437577600  [C
>  4:   2709812   322398784  [I
>  5:   5607988   224319520  java.util.concurrent.ConcurrentSkipListMap$Node
>  6:   4469810   214550880  org.apache.cassandra.db.Column
>  7:   3339219   213710016  org.cliffc.high_scale_lib.ConcurrentAutoTable$CAT
>  8:   3339230   191142200  [J
>  9:   4506140   179147648  [B
>  10:   2220024   106561152  java.util.concurrent.ConcurrentSkipListMap$HeadIndex
>
> Does anyone have any idea why Cassandra uses so much memory? From
> gc.log I do see that GC has kicked in many times (though no major
> compaction). I'd expect that with this small data set everything
> would work fine with the available memory (I mean, the test should
> just run for weeks).
>
> Any suggestion?
>
> Thanks,
> Huming
>
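[A quick way to read the jmap histogram quoted above is to divide #bytes by #instances for each hot class, giving the average shallow size per instance. The sketch below does exactly that, with the (instances, bytes) pairs copied from the output; treating these as shallow sizes, with no retained-graph accounting, is an assumption about how `jmap -histo` reports them.]

```python
# Per-instance sizes from the jmap histogram in the quoted mail.
# Values are (instances, bytes), copied verbatim from the output.
jmap_rows = {
    "[Ljava.lang.Object;":                            (1436005, 658220304),
    "java.lang.String":                               (12100491, 484019640),
    "[C":                                             (9904511, 437577600),
    "java.util.concurrent.ConcurrentSkipListMap$Node": (5607988, 224319520),
    "org.apache.cassandra.db.Column":                 (4469810, 214550880),
}

for cls, (instances, total_bytes) in jmap_rows.items():
    # Average shallow size per instance of this class.
    print(f"{cls}: ~{total_bytes / instances:.0f} bytes/instance")
```

[Note that each Column averages ~48 bytes and each skip-list node ~40 bytes, so the per-column bookkeeping in the memtable alone is on the order of the ~40-byte average payload; the ~4.47M Column instances are also roughly consistent with 880K keys times 5 columns each.]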
