get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
After an upgrade to cassandra-1.0 any get_range_slices gives me:

java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:66)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.metadata(CompressedRandomAccessReader.java:53)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:63)
    at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:896)
    at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:72)
    at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:748)
    at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:88)
    at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1310)
    at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:840)
    at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:698)

I set chunk_length_kb to 16 as my rows are very skinny (typically 100b).

Any way around this?

~mck
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, 2011-10-31 at 08:00 +0100, Mick Semb Wever wrote:
> After an upgrade to cassandra-1.0 any get_range_slices gives me:
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
>     [rest of stack trace snipped]
> I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)

I see now this was a bad choice. The read pattern of these rows is always
in bulk, so the chunk_length could have been much higher, so as to reduce
memory usage (my largest sstable is 61G).

After changing the chunk_length, is there any way to rebuild just some
sstables, rather than having to do a full nodetool scrub?

~mck
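(For reference: the chunk length is a per-column-family compression option,
so changing it would look something like the following in cassandra-cli.
The column family name and the new value are hypothetical, and the syntax
is as I recall it from the 1.0-era docs:

    update column family MyCF
        with compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};

Only sstables written after the change pick up the new chunk length, which
is why the question of rebuilding existing sstables matters here.)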
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, Oct 31, 2011 at 9:07 AM, Mick Semb Wever <m...@apache.org> wrote:
> I see now this was a bad choice. The read pattern of these rows is always
> in bulk, so the chunk_length could have been much higher, so as to reduce
> memory usage (my largest sstable is 61G).
>
> After changing the chunk_length, is there any way to rebuild just some
> sstables, rather than having to do a full nodetool scrub?

Provided you're using SizeTieredCompaction (i.e. the default), you can
trigger a user-defined compaction through JMX on each of the sstables you
want to rebuild. Not necessarily a fun process though.

Also note that you can scrub only an individual column family, if that was
the question.

--
Sylvain
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote:
> Provided you're using SizeTieredCompaction (i.e. the default), you can
> trigger a user-defined compaction through JMX on each of the sstables you
> want to rebuild. Not necessarily a fun process though.
>
> Also note that you can scrub only an individual column family, if that
> was the question.

Actually this won't work, I think. I presume that scrub, or any
user-defined compaction, will still need to SSTableReader.openDataReader(..)
and so will still OOM no matter what...

How the hell am I supposed to re-chunk_length an sstable? :-(

~mck
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, Oct 31, 2011 at 11:35 AM, Mick Semb Wever <m...@apache.org> wrote:
> Actually this won't work, I think. I presume that scrub, or any
> user-defined compaction, will still need to SSTableReader.openDataReader(..)
> and so will still OOM no matter what...
>
> How the hell am I supposed to re-chunk_length an sstable? :-(

You could start the node without joining the ring (to make sure it doesn't
get any work), i.e. with -Dcassandra.join_ring=false, and give the JVM the
maximum heap the machine can allow. Hopefully that could be enough to
recompact the sstable without OOMing.
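(A sketch of that workaround, with a hypothetical heap size. The stock
conf/cassandra-env.sh respects MAX_HEAP_SIZE/HEAP_NEWSIZE from the
environment, and bin/cassandra passes -D properties through to the JVM:

    MAX_HEAP_SIZE="12G" HEAP_NEWSIZE="800M" bin/cassandra -Dcassandra.join_ring=false

Once the recompaction is done, restart the node normally so it rejoins the
ring.)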
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, Oct 31, 2011 at 11:41 AM, Mick Semb Wever <m...@apache.org> wrote:
> On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote:
>> you can trigger a user defined compaction through JMX on each of the
>> sstables you want to rebuild.
>
> May I ask how? Everything I see from NodeProbe to StorageProxy is ks and
> cf based.

It's exposed through JMX but not nodetool (i.e. NodeProbe). It's in the
CompactionManagerMBean and it's called forceUserDefinedCompaction. It takes
a ks and a comma-separated list of paths to sstables (but it's fine with
only one sstable).

--
Sylvain
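(For anyone who wants to script this rather than click through a JMX GUI,
a minimal sketch against the plain JMX API. The host, keyspace and sstable
path are hypothetical; 7199 is Cassandra's default JMX port:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ForceCompaction
    {
        public static void main(String[] args) throws Exception
        {
            // connect to the node's JMX port (7199 by default)
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try
            {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                ObjectName compactionManager =
                        new ObjectName("org.apache.cassandra.db:type=CompactionManager");
                // arguments: keyspace name, comma-separated list of sstable
                // data files (a single file is fine); both are hypothetical here
                mbs.invoke(compactionManager, "forceUserDefinedCompaction",
                           new Object[]{ "MyKeyspace",
                                         "/var/lib/cassandra/data/MyKeyspace/MyCF-hc-42-Data.db" },
                           new String[]{ "java.lang.String", "java.lang.String" });
            }
            finally
            {
                connector.close();
            }
        }
    }

The same invocation can also be done interactively from jconsole by
navigating to the CompactionManager MBean.)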
Re: OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, 2011-10-31 at 09:07 +0100, Mick Semb Wever wrote:
> The read pattern of these rows is always in bulk, so the chunk_length
> could have been much higher, so as to reduce memory usage (my largest
> sstable is 61G).

Isn't CompressionMetadata.readChunkOffsets(..) rather dangerous here?

Given a 60G sstable, even with 64kb chunk_length, to read just that one
sstable requires close to 8G free heap memory... Especially when the
default for cassandra is 4G heap in total.

~mck
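(For readers following along: on open, the compressed reader eagerly loads
one 8-byte offset per compressed chunk into a single array. A paraphrased
sketch of the idea, not the exact 1.0 source:

    import java.io.DataInput;
    import java.io.IOException;

    final class ChunkOffsetsSketch
    {
        // one long offset per compressed chunk, read eagerly into a single
        // array each time a compressed sstable reader is opened
        static long[] readChunkOffsets(DataInput input, int chunkCount) throws IOException
        {
            // a 61G sstable with 16KB chunks has ~4 million chunks
            long[] offsets = new long[chunkCount];
            for (int i = 0; i < chunkCount; i++)
                offsets[i] = input.readLong();
            return offsets;
        }
    }

Whether the arithmetic in the message above holds is worked out further
down the thread.)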
Re: OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote:
> Given a 60G sstable, even with 64kb chunk_length, to read just that one
> sstable requires close to 8G free heap memory...

Arg, that calculation was a little off... (a long isn't exactly 8K...)
But you get my concern...

~mck
Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)
Cleanup would have the same effect I think, in exchange for a minor amount
of extra CPU used.

On Mon, Oct 31, 2011 at 4:08 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> Provided you're using SizeTieredCompaction (i.e. the default), you can
> trigger a user-defined compaction through JMX on each of the sstables you
> want to rebuild. Not necessarily a fun process though.
> [...]

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
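(Editorial aside: cleanup rewrites every sstable it touches, so the
rewritten files pick up the current compression settings; the extra CPU
Jonathan mentions presumably goes to re-checking which rows still belong
on the node. With hypothetical host, keyspace and column family names, the
1.0-era invocation would be something like:

    nodetool -h localhost cleanup MyKeyspace MyCF)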
Re: OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, Oct 31, 2011 at 1:10 PM, Mick Semb Wever <m...@apache.org> wrote:
> On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote:
>> Given a 60G sstable, even with 64kb chunk_length, to read just that one
>> sstable requires close to 8G free heap memory...
>
> Arg, that calculation was a little off... (a long isn't exactly 8K...)
> But you get my concern...

Well, with a long being only 8 bytes, that's 8MB of free heap memory.
Without being negligible, that's not completely crazy to me. No, the
problem is that we create those 8MB for each read, which *is* crazy (the
fact that we allocate those 8MB in one block is not very nice for the GC
either, but that's another problem).

Anyway, that's really a bug and I've created CASSANDRA-3427 to fix it.

--
Sylvain
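(To spell out the corrected arithmetic, using the 61G sstable and 64KB
chunks from earlier in the thread:

    public class ChunkOffsetMath
    {
        public static void main(String[] args)
        {
            long sstableBytes = 61L << 30;           // ~61G sstable
            long chunkBytes = 64L << 10;             // chunk_length_kb: 64
            long chunks = sstableBytes / chunkBytes; // ~1 million chunks
            long heapBytes = chunks * 8;             // one 8-byte long offset per chunk
            System.out.printf("%d chunks -> %.1f MB of offsets per open reader%n",
                              chunks, heapBytes / (1024.0 * 1024.0));
        }
    }

This prints roughly "999424 chunks -> 7.6 MB", i.e. the ~8MB figure above.
At the original chunk_length_kb of 16 it is four times that, per sstable
reader, on every range query.)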
Re: OOM on CompressionMetadata.readChunkOffsets(..)
On Mon, Oct 31, 2011 at 2:58 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> Anyway, that's really a bug and I've created CASSANDRA-3427 to fix it.

Note that it's only a problem for range queries.

--
Sylvain