I think your file (as a Cassandra column value) is too large. I also think Cassandra is not good at storing files.
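A common workaround (my suggestion, not something from the original thread) is to split each file into smaller chunks on the client and store each chunk as its own column, e.g. under column names like "filename/0000" so BytesType ordering reassembles them in order. A minimal Java sketch of the chunking step; the `Chunker` class name and the 1 MB chunk size are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class Chunker {
    // Split a file's bytes into fixed-size chunks so each Cassandra
    // column value stays small (e.g. 1 MB instead of 15-20 MB).
    static List<byte[]> chunk(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<byte[]>();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            byte[] part = new byte[len];
            System.arraycopy(data, off, part, 0, len);
            chunks.add(part);
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[2500000];                  // pretend 2.5 MB file
        List<byte[]> parts = chunk(file, 1000000);        // 1 MB chunks
        System.out.println(parts.size());                 // prints 3
    }
}
```

Each chunk would then be inserted as a separate column (or via batch_mutate), which keeps individual mutations small and gives the memtable flush and GC a chance to keep up.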
On Wed, Apr 28, 2010 at 10:24 PM, Jussi P?öri <ju...@androidconsulting.com> wrote:
> new try, previous went to wrong place...
>
> Hi all,
>
> I'm trying to run a scenario of adding files from a specific folder to
> Cassandra. Now I have 64 files (about 15-20 MB per file) and overall 1 GB
> of data.
> I'm able to insert around 40 files, but after that Cassandra goes into
> some GC loop and I finally get a timeout at the client.
> It is not going to OOM, but it just jams.
>
> Here are the last entries in the log file:
> INFO [GC inspection] 2010-04-28 10:07:55,297 GCInspector.java (line 110) GC
> for ParNew: 232 ms, 25731128 reclaimed leaving 553241120 used; max is
> 4108386304
> INFO [GC inspection] 2010-04-28 10:09:02,331 GCInspector.java (line 110)
> GC for ParNew: 2844 ms, 238909856 reclaimed leaving 1435582832 used; max is
> 4108386304
> INFO [GC inspection] 2010-04-28 10:09:49,421 GCInspector.java (line 110)
> GC for ParNew: 30666 ms, 11185824 reclaimed leaving 1679795336 used; max is
> 4108386304
> INFO [GC inspection] 2010-04-28 10:11:18,090 GCInspector.java (line 110)
> GC for ParNew: 895 ms, 17921680 reclaimed leaving 1589308456 used; max is
> 4108386304
>
> I think I must have something wrong in my configuration or in how I
> use Cassandra, because people here are inserting 10 times more stuff and it
> works.
>
> The column family I am using:
> <ColumnFamily CompareWith="BytesType" Name="Standard1"/>
> Basically I insert with the key being the folder name, the column name being
> the file name, and the value being the file content.
> I tried with Hector (mainly) and directly using Thrift (insert and
> batch_mutate).
>
> In my case, the data does not need to be readable immediately after insert,
> but I don't know if that helps in any way.
>
> My environment:
> Mac and/or Linux, tested on both
> Java 1.6.0_17
> Cassandra 0.6.1
>
> <RpcTimeoutInMillis>60000</RpcTimeoutInMillis>
> <CommitLogRotationThresholdInMB>32</CommitLogRotationThresholdInMB>
> <RowWarningThresholdInMB>512</RowWarningThresholdInMB>
> <SlicedBufferSizeInKB>32</SlicedBufferSizeInKB>
> <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
> <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
> <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
> <MemtableThroughputInMB>64</MemtableThroughputInMB>
> <BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
> <MemtableOperationsInMillions>0.1</MemtableOperationsInMillions>
> <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
> <ConcurrentReads>8</ConcurrentReads>
> <ConcurrentWrites>32</ConcurrentWrites>
> <CommitLogSync>batch</CommitLogSync>
> <!-- CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS -->
> <CommitLogSyncBatchWindowInMS>1.0</CommitLogSyncBatchWindowInMS>
> <GCGraceSeconds>500</GCGraceSeconds>
>
> JVM_OPTS=" \
> -server \
> -Xms3G \
> -Xmx3G \
> -XX:PermSize=512m \
> -XX:MaxPermSize=800m \
> -XX:MaxNewSize=256m \
> -XX:NewSize=128m \
> -XX:TargetSurvivorRatio=90 \
> -XX:+AggressiveOpts \
> -XX:+UseParNewGC \
> -XX:+UseConcMarkSweepGC \
> -XX:+CMSParallelRemarkEnabled \
> -XX:+HeapDumpOnOutOfMemoryError \
> -XX:SurvivorRatio=128 \
> -XX:MaxTenuringThreshold=0 \
> -XX:+DisableExplicitGC \
> -Dcom.sun.management.jmxremote.port=8080 \
> -Dcom.sun.management.jmxremote.ssl=false \
> -Dcom.sun.management.jmxremote.authenticate=false"