get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
After an upgrade to cassandra-1.0 any get_range_slices gives me:

java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:66)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.metadata(CompressedRandomAccessReader.java:53)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:63)
    at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:896)
    at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:72)
    at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:748)
    at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:88)
    at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1310)
    at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:840)
    at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:698)


I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)
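
That setting is a per-column-family compression option; as a sketch of how
it is applied, assuming the 1.0-era cassandra-cli compression_options syntax
and placeholder keyspace/column family names:

  use MyKeyspace;
  update column family MyCF
    with compression_options = {sstable_compression: SnappyCompressor,
                                chunk_length_kb: 16};

Note this only affects sstables written afterwards; existing sstables keep
their old chunk length until they are rewritten.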

Any way around this?

~mck

-- 
"Physics is the universe's operating system." - Steven R Garman

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |





Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 08:00 +0100, Mick Semb Wever wrote:
 After an upgrade to cassandra-1.0 any get_range_slices gives me:
 
 java.lang.OutOfMemoryError: Java heap space
   at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
   at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:66)
   at org.apache.cassandra.io.compress.CompressedRandomAccessReader.metadata(CompressedRandomAccessReader.java:53)
   at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:63)
   at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:896)
   at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:72)
   at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:748)
   at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:88)
   at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1310)
   at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:840)
   at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:698)
 
 
 I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)


I see now this was a bad choice.
The read pattern of these rows is always in bulk, so the chunk_length
could have been much higher, so as to reduce memory usage (my largest
sstable is 61G).

After changing the chunk_length, is there any way to rebuild just some
sstables rather than having to do a full nodetool scrub?

~mck

-- 
“An idea is a point of departure and no more. As soon as you elaborate
it, it becomes transformed by thought.” - Pablo Picasso 

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |




Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Sylvain Lebresne
On Mon, Oct 31, 2011 at 9:07 AM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 08:00 +0100, Mick Semb Wever wrote:
 After an upgrade to cassandra-1.0 any get_range_slices gives me:

 java.lang.OutOfMemoryError: Java heap space
       at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
       at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:66)
       at org.apache.cassandra.io.compress.CompressedRandomAccessReader.metadata(CompressedRandomAccessReader.java:53)
       at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:63)
       at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:896)
       at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:72)
       at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:748)
       at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:88)
       at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1310)
       at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:840)
       at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:698)


 I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)


 I see now this was a bad choice.
 The read pattern of these rows is always in bulk, so the chunk_length
 could have been much higher, so as to reduce memory usage (my largest
 sstable is 61G).

 After changing the chunk_length, is there any way to rebuild just some
 sstables rather than having to do a full nodetool scrub?

Provided you're using SizeTieredCompaction (i.e., the default), you can
trigger a user-defined compaction through JMX on each of the sstables
you want to rebuild. Not necessarily a fun process though. Also note that
you can scrub only an individual column family, if that was the question.
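
For the per-column-family scrub, the invocation would be along these lines
(the keyspace and column family names here are placeholders):

  nodetool -h localhost scrub MyKeyspace MyCF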

--
Sylvain


 ~mck

 --
 “An idea is a point of departure and no more. As soon as you elaborate
 it, it becomes transformed by thought.” - Pablo Picasso

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |



Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote:
  I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)
 
 
  I see now this was a bad choice.
  The read pattern of these rows is always in bulk, so the chunk_length
  could have been much higher, so as to reduce memory usage (my largest
  sstable is 61G).
 
  After changing the chunk_length, is there any way to rebuild just some
  sstables rather than having to do a full nodetool scrub?
 
 Provided you're using SizeTieredCompaction (i.e., the default), you can
 trigger a user-defined compaction through JMX on each of the sstables
 you want to rebuild. Not necessarily a fun process though. Also note that
 you can scrub only an individual column family, if that was the question.

Actually this won't work, I think.

I presume that scrub or any user-defined compaction will still need to
call SSTableReader.openDataReader(..) and so will still OOM no matter what...

How the hell am I supposed to re-chunk_length an sstable? :-(

~mck

-- 
"We all may have come on different ships, but we’re in the same boat
now." - Martin Luther King, Jr.

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |





Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Sylvain Lebresne
On Mon, Oct 31, 2011 at 11:35 AM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote:
  I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)
 
 
  I see now this was a bad choice.
  The read pattern of these rows is always in bulk, so the chunk_length
  could have been much higher, so as to reduce memory usage (my largest
  sstable is 61G).
 
  After changing the chunk_length, is there any way to rebuild just some
  sstables rather than having to do a full nodetool scrub?

 Provided you're using SizeTieredCompaction (i.e., the default), you can
 trigger a user-defined compaction through JMX on each of the sstables
 you want to rebuild. Not necessarily a fun process though. Also note that
 you can scrub only an individual column family, if that was the question.

 Actually this won't work, I think.

 I presume that scrub or any user-defined compaction will still need to
 call SSTableReader.openDataReader(..) and so will still OOM no matter what...

 How the hell am I supposed to re-chunk_length an sstable? :-(

You could start the node without joining the ring (to make sure it doesn't
get any work), i.e., with -Dcassandra.join_ring=false, and give the JVM
the maximum heap the machine can allow. Hopefully that could be enough
to recompact the sstable without OOMing.
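
Concretely, that might look like this (a sketch; the heap sizes are
assumptions to adapt to the machine, and cassandra-env.sh is assumed to
honour the MAX_HEAP_SIZE/HEAP_NEWSIZE environment variables):

  MAX_HEAP_SIZE=12G HEAP_NEWSIZE=800M \
    bin/cassandra -Dcassandra.join_ring=false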


 ~mck

 --
 "We all may have come on different ships, but we’re in the same boat
 now." - Martin Luther King, Jr.

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |




Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Sylvain Lebresne
On Mon, Oct 31, 2011 at 11:41 AM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote:
 you can
 trigger a user defined compaction through JMX on each of the sstable
 you want to rebuild.

 May I ask how?
 Everything I see from NodeProbe to StorageProxy is ks- and cf-based.

It's exposed through JMX but not nodetool (i.e. NodeProbe). It's in the
CompactionManagerMBean and it's called forceUserDefinedCompaction.
It takes a ks and a comma-separated list of paths to sstables (but it's fine
with only one sstable).
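
A minimal sketch of such a call from a standalone JMX client. The JMX port,
keyspace name and sstable filename below are assumptions to adjust to your
setup; only the MBean and operation names come from the thread:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceCompaction
{
    public static void main(String[] args) throws Exception
    {
        // Default Cassandra JMX port is 7199; adjust host/port as needed.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName compactionManager = new ObjectName(
                "org.apache.cassandra.db:type=CompactionManager");

            // Arguments: keyspace, then a comma-separated list of sstable
            // data files (a single file is fine). Names are placeholders.
            mbs.invoke(compactionManager,
                       "forceUserDefinedCompaction",
                       new Object[]{ "MyKeyspace", "MyKeyspace-MyCF-hb-42-Data.db" },
                       new String[]{ "java.lang.String", "java.lang.String" });
        }
        finally
        {
            connector.close();
        }
    }
}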


 ~mck

 --
 “Anyone who lives within their means suffers from a lack of
 imagination.” - Oscar Wilde

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |



Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 09:07 +0100, Mick Semb Wever wrote:
 The read pattern of these rows is always in bulk so the chunk_length
 could have been much higher so to reduce memory usage (my largest
 sstable is 61G). 

Isn't CompressionMetadata.readChunkOffsets(..) rather dangerous here?

Given a 60G sstable, even with 64kb chunk_length, to read just that one
sstable requires close to 8G free heap memory...

Especially when the default for cassandra is 4G heap in total.

~mck

-- 
"Anyone who has attended a computer conference in a fancy hotel can tell
you that a sentence like 'You're one of those computer people, aren't
you?' is roughly equivalent to 'Look, another amazingly mobile form of
slime mold!' in the mouth of a hotel cocktail waitress." - Elizabeth
Zwicky

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |




Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote:
 Given a 60G sstable, even with 64kb chunk_length, to read just that one
 sstable requires close to 8G free heap memory... 

Arg, that calculation was a little off...
 (a long isn't exactly 8K...)

But you get my concern...

~mck

-- 
"When you say 'I wrote a program that crashed Windows', people just
stare at you blankly and say 'Hey, I got those with the system -- for
free'." - Linus Torvalds

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |




Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Jonathan Ellis
Cleanup would have the same effect I think, in exchange for a minor
amount of extra CPU used.
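
For reference, that would be along these lines (keyspace and column family
names are placeholders):

  nodetool -h localhost cleanup MyKeyspace MyCF

Cleanup rewrites each sstable (dropping data the node no longer owns), so
the rewritten sstables pick up the current compression settings.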

On Mon, Oct 31, 2011 at 4:08 AM, Sylvain Lebresne sylv...@datastax.com wrote:
 On Mon, Oct 31, 2011 at 9:07 AM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 08:00 +0100, Mick Semb Wever wrote:
 After an upgrade to cassandra-1.0 any get_range_slices gives me:

 java.lang.OutOfMemoryError: Java heap space
       at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93)
       at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:66)
       at org.apache.cassandra.io.compress.CompressedRandomAccessReader.metadata(CompressedRandomAccessReader.java:53)
       at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:63)
       at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:896)
       at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:72)
       at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:748)
       at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:88)
       at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1310)
       at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:840)
       at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:698)


 I set chunk_length_kb to 16 as my rows are very skinny (typically 100b)


 I see now this was a bad choice.
 The read pattern of these rows is always in bulk, so the chunk_length
 could have been much higher, so as to reduce memory usage (my largest
 sstable is 61G).

 After changing the chunk_length, is there any way to rebuild just some
 sstables rather than having to do a full nodetool scrub?

 Provided you're using SizeTieredCompaction (i.e., the default), you can
 trigger a user-defined compaction through JMX on each of the sstables
 you want to rebuild. Not necessarily a fun process though. Also note that
 you can scrub only an individual column family, if that was the question.

 --
 Sylvain


 ~mck

 --
 “An idea is a point of departure and no more. As soon as you elaborate
 it, it becomes transformed by thought.” - Pablo Picasso

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Sylvain Lebresne
On Mon, Oct 31, 2011 at 1:10 PM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote:
 Given a 60G sstable, even with 64kb chunk_length, to read just that one
 sstable requires close to 8G free heap memory...

 Arg, that calculation was a little off...
  (a long isn't exactly 8K...)

 But you get my concern...

Well, with a long being only 8 bytes, that's 8MB of free heap memory. While
not negligible, that's not completely crazy to me.
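
Spelling the arithmetic out (approximate figures):

  60 GiB / 64 KiB per chunk            = 983,040 chunks
  983,040 chunks * 8 bytes per offset  ~ 7.5 MiB

That is, on the order of 8MB of chunk offsets per open sstable, not 8GB.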

No, the problem is that we create those 8MB for each read, which *is* crazy
(the fact that we allocate those 8MB in one block is not very nice for
the GC either, but that's another problem).
Anyway, that's really a bug and I've created CASSANDRA-3427 to fix it.
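
The shape of the fix, as an illustrative sketch only (this is not the
actual CASSANDRA-3427 patch, and the on-disk format below is simplified):
build the chunk-offset array once per sstable and share it across reads,
instead of re-reading it on every open.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class ChunkOffsetCache
{
    private static final ConcurrentMap<String, long[]> CACHE =
        new ConcurrentHashMap<String, long[]>();

    // Return the chunk offsets for a compression-info file, loading them
    // from disk at most once per path.
    public static long[] offsets(String path) throws IOException
    {
        long[] cached = CACHE.get(path);
        if (cached != null)
            return cached;
        long[] loaded = load(path);
        long[] raced = CACHE.putIfAbsent(path, loaded);
        return raced == null ? loaded : raced;
    }

    // Hypothetical layout: an int chunk count followed by one long per
    // chunk. The real CompressionInfo.db format has more header fields.
    private static long[] load(String path) throws IOException
    {
        DataInputStream in = new DataInputStream(new FileInputStream(path));
        try
        {
            int count = in.readInt();
            long[] offsets = new long[count];
            for (int i = 0; i < count; i++)
                offsets[i] = in.readLong();
            return offsets;
        }
        finally
        {
            in.close();
        }
    }
}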

--
Sylvain


 ~mck

 --
 "When you say 'I wrote a program that crashed Windows', people just
 stare at you blankly and say 'Hey, I got those with the system -- for
 free'." - Linus Torvalds

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |



Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Sylvain Lebresne
On Mon, Oct 31, 2011 at 2:58 PM, Sylvain Lebresne sylv...@datastax.com wrote:
 On Mon, Oct 31, 2011 at 1:10 PM, Mick Semb Wever m...@apache.org wrote:
 On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote:
 Given a 60G sstable, even with 64kb chunk_length, to read just that one
 sstable requires close to 8G free heap memory...

 Arg, that calculation was a little off...
  (a long isn't exactly 8K...)

 But you get my concern...

 Well, with a long being only 8 bytes, that's 8MB of free heap memory. While
 not negligible, that's not completely crazy to me.

 No, the problem is that we create those 8MB for each read, which *is* crazy
 (the fact that we allocate those 8MB in one block is not very nice for
 the GC either, but that's another problem).
 Anyway, that's really a bug and I've created CASSANDRA-3427 to fix it.

Note that it's only a problem for range queries.

--
Sylvain


 --
 Sylvain


 ~mck

 --
 "When you say 'I wrote a program that crashed Windows', people just
 stare at you blankly and say 'Hey, I got those with the system -- for
 free'." - Linus Torvalds

 | http://semb.wever.org | http://sesat.no |
 | http://tech.finn.no   | Java XSS Filter |