StackOverflowError on high load
I'm running some high-load writes on a pair of Cassandra hosts using an OrderPreservingPartitioner and ran into the following error, after which one of the hosts killed itself. Has anyone seen it and can advise? (Cassandra v0.5.0)

ERROR [HINTED-HANDOFF-POOL:1] 2010-02-17 04:50:09,602 CassandraDaemon.java (line 71) Fatal exception in thread Thread[HINTED-HANDOFF-POOL:1,5,main]
java.lang.StackOverflowError
        at sun.nio.cs.UTF_8$Encoder.encodeArrayLoop(UTF_8.java:341)
        at sun.nio.cs.UTF_8$Encoder.encodeLoop(UTF_8.java:447)
        at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:544)
        at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:240)
        at java.lang.StringCoding.encode(StringCoding.java:272)
        at java.lang.String.getBytes(String.java:947)
        at java.io.UnixFileSystem.getSpace(Native Method)
        at java.io.File.getUsableSpace(File.java:1660)
        at org.apache.cassandra.config.DatabaseDescriptor.getDataFileLocationForTable(DatabaseDescriptor.java:891)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:876)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
        at org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
        ... (the doFileCompaction frame at ColumnFamilyStore.java:884 repeats many more times)

INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230 ColumnFamilyStore.java (line 393) DocumentMapping has reached its threshold; switching in a fresh Memtable
INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230 ColumnFamilyStore.java (line 1035) Enqueuing flush of Memtable(DocumentMapping)@122980220
INFO [FLUSH-SORTER-POOL:1] 2010-02-17 04:50:53,230 Memtable.java (line 183) Sorting Memtable(DocumentMapping)@122980220
INFO [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:53,386 Memtable.java (line 192) Writing Memtable(DocumentMapping)@122980220
ERROR [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:54,010 DebuggableThreadPoolExecutor.java (line 162) Error in executor futuretask
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: No space left on device
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:154)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: java.io.IOException: No space left on device
        at org.apache.cassandra.db.ColumnFamilyStore$3$1.run(ColumnFamilyStore.java:1060)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        ... 2 more
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.write(Native Method)
        at java.io.DataOutputStream.writeInt(DataOutputStream.java:180)
        at org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilter.java:158)
        at org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilter.java:153)
        at org.apache.cassandra.io.SSTableWriter.closeAndOpenReader(SSTableWriter.java:123)
        at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:207)
        at org.apache.cassandra.db.ColumnFamilyStore$3$1.run(ColumnFamilyStore.java:1056)
        ... 6 more
Re: StackOverflowError on high load
I think you don't have enough room for your data. Run df -h to see whether one of your disks is full.

2010/2/17 Ran Tavory ran...@gmail.com:
> I'm running some high-load writes on a pair of Cassandra hosts using an
> OrderPreservingPartitioner and ran into the following error, after which
> one of the hosts killed itself. Has anyone seen it and can advise?
> (Cassandra v0.5.0)
> [stack trace snipped; see the original message above]
Re: How to increase replications count
change RF to 3, restart all your nodes, and run repair on each of them

On Wed, Feb 17, 2010 at 3:15 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
> Hello
>
> If I have a working cluster with replication count 2, how can I increase
> the replication factor to 3 without loss of data?
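For reference, the steps above can be sketched concretely. This is a hedged outline for 0.5-era Cassandra: the config file name and host names are examples, and depending on the exact release the admin command may be called nodeprobe rather than nodetool.

```shell
# Sketch only; adjust file paths, host names, and command names to your install.

# 1. On every node, raise the replication factor in the keyspace definition
#    (storage-conf.xml in 0.5-era releases):
#        <ReplicationFactor>3</ReplicationFactor>    <!-- was 2 -->

# 2. Restart every node so the new factor takes effect.

# 3. Run anti-entropy repair on each node so existing rows are streamed
#    to their new third replica:
for host in node1 node2 node3; do
    nodetool --host "$host" repair      # may be "nodeprobe" on older builds
done
```

Until repair completes on every node, reads at low consistency levels may hit the new, still-empty replica, which is why the repair step is not optional.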
Re: Get UnavailableException() when write to multiple Node
On Wed, Feb 17, 2010 at 5:26 AM, Richard Grossman richie...@gmail.com wrote:
> If someone can help me to understand what's going on: is the machine
> itself overloaded? It's a single machine with 4 virtual machines sharing
> the same disk; is that the cause?

Probably.
Re: StackOverflowError on high load
you temporarily need up to 2x your current space used to perform compactions. "disk too full" is almost certainly the actual problem. created https://issues.apache.org/jira/browse/CASSANDRA-804 to fix this.

On Wed, Feb 17, 2010 at 5:59 AM, Ran Tavory ran...@gmail.com wrote:
> no, that's not it, disk isn't full. After restarting the server I can
> write again. Still, however, this error is troubling...
>
> On Wed, Feb 17, 2010 at 12:24 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
>> I think that you have not enough room for your data. run df -h to see
>> that one of your discs is full
>>
>> 2010/2/17 Ran Tavory ran...@gmail.com:
>>> I'm running some high-load writes on a pair of Cassandra hosts using an
>>> OrderPreservingPartitioner and ran into the following error, after which
>>> one of the hosts killed itself. Has anyone seen it and can advise?
>>> (Cassandra v0.5.0)
>>> [stack trace snipped; see the original message above]
Re: StackOverflowError on high load
Are we talking about the CommitLog directory that needs to be up to 2x? So it needs to be 2x of what? Did I miss this in the config file somewhere?

On Wed, Feb 17, 2010 at 3:52 PM, Jonathan Ellis jbel...@gmail.com wrote:
> you temporarily need up to 2x your current space used to perform
> compactions. "disk too full" is almost certainly actually the problem.
> created https://issues.apache.org/jira/browse/CASSANDRA-804 to fix this.
> [earlier quoted messages and stack trace snipped]
Re: StackOverflowError on high load
On Wed, Feb 17, 2010 at 8:19 AM, Ran Tavory ran...@gmail.com wrote:
> Are we talking about the CommitLogDirectory that needs to be up 2x?

no. data, not commitlog.

> So it needs to be 2x of what? Did I miss this in the config file somewhere?

current space used, in worst case.
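Jonathan's rule of thumb can be written down directly: in the worst case a major compaction merges every SSTable in the data directory into one new file, and while it runs the old files and the new merged file coexist on disk. A minimal sketch with made-up helper names (not a Cassandra API):

```python
def worst_case_compaction_bytes(live_data_bytes):
    # While a major compaction runs, the old SSTables and the new merged
    # SSTable coexist, so peak disk usage can reach ~2x the live data size.
    return 2 * live_data_bytes

def enough_headroom(live_data_bytes, free_bytes):
    # Safe to compact only if free space can hold a full second copy
    # of the live data.
    return free_bytes >= live_data_bytes
```

With the numbers reported later in this thread (46G used, 47G free on a 97G volume) the check barely passes, which fits the borderline out-of-space crashes being discussed.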
Row with many columns
Hello

For example, if we have a table whose rows have many columns (1 or more), how will this data be partitioned? I expected that one row might be split across several nodes, but looking at the source of Cassandra I think that one row is stored on one node and never splits. Or am I mistaken?
Re: Row with many columns
On Wed, Feb 17, 2010 at 9:48 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
> Hello
>
> For example, if we have a table whose rows have many columns (1 or more),
> how will this data be partitioned? I expected that one row might be split
> across several nodes, but looking at the source of Cassandra I think that
> one row is stored on one node and never splits. Or am I mistaken?

You are correct, a row must fit on a node.

-Brandon
Re: Row with many columns
you are correct, partitioning is entirely by row key. As long as your number of rows is an order of magnitude or two more than the node count in your cluster, though, that doesn't really matter.

On Wed, Feb 17, 2010 at 9:48 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
> [question quoted above snipped]
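A toy model may make the placement rule concrete. The sketch below imitates hash-based placement in the spirit of the RandomPartitioner; the function name and node list are made up for illustration. The point is that only the row key enters the decision, so every column of a row necessarily lands on the same node:

```python
import hashlib

def node_for_row(row_key, nodes):
    # Hash only the row key; the column name never enters the placement
    # decision, so a row cannot be split across nodes.
    h = int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)
    return nodes[h % len(nodes)]

# Every column read or written under "user42" maps to the same node,
# no matter how many columns the row holds:
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
owner = node_for_row("user42", nodes)
```

(Real Cassandra assigns token ranges rather than simple modulo buckets, and replicates to RF nodes, but the key-only dependence is the same.)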
Re: Get UnavailableException() when write to multiple Node
Just guessing: are you sure these virtual machines don't suffer from broken time synchronization?

On Wed, Feb 17, 2010 at 12:26, Richard Grossman richie...@gmail.com wrote:
> Hi
>
> I've configured 4 virtual machines (CentOS, 4GB memory each), each running
> the Cassandra 0.5 release. All is OK until I begin to get errors like this
> on the client side:
>
> UnavailableException()
>         at org.apache.cassandra.service.Cassandra$batch_insert_result.read(Cassandra.java:10892)
>         at org.apache.cassandra.service.Cassandra$Client.recv_batch_insert(Cassandra.java:616)
>         at org.apache.cassandra.service.Cassandra$Client.batch_insert(Cassandra.java:591)
>         at tv.bee.hiveplus.crud.CassandraThread.insertChannelShow(CassandraThread.java:229)
>         at tv.bee.hiveplus.crud.CassandraThread.call(CassandraThread.java:59)
>         at tv.bee.hiveplus.crud.CassandraThread.call(CassandraThread.java:1)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
>
> On the server, more interesting:
>
> INFO [Timer-1] 2010-02-17 11:17:58,268 Gossiper.java (line 194) InetAddress /38.96.191.42 is now dead.
> INFO [GMFD:1] 2010-02-17 11:17:58,680 Gossiper.java (line 543) InetAddress /38.96.191.42 is now UP
> INFO [FLUSH-WRITER-POOL:1] 2010-02-17 11:18:06,604 Memtable.java (line 209) Completed flushing /root/cassandraDB/data/Keyspace1/channelShow-14-Data.db
> INFO [COMPACTION-POOL:1] 2010-02-17 11:18:06,604 ColumnFamilyStore.java (line 875) Compacting [org.apache.cassandra.io.SSTableReader(path='/root/cassandraDB/data/Keyspace1/channelShow-11-Data.db'),org.apache.cassandra.io.SSTableReader(path='/root/cassandraDB/data/Keyspace1/channelShow-12-Data.db'),org.apache.cassandra.io.SSTableReader(path='/root/cassandraDB/data/Keyspace1/channelShow-13-Data.db'),org.apache.cassandra.io.SSTableReader(path='/root/cassandraDB/data/Keyspace1/channelShow-14-Data.db')]
> INFO [COMPACTION-POOL:1] 2010-02-17 11:19:41,231 ColumnFamilyStore.java (line 943) Compacted to /root/cassandraDB/data/Keyspace1/channelShow-15-Data.db. 80405396/80405396 bytes for 110384 keys. Time: 94627ms.
> INFO [Timer-1] 2010-02-17 11:20:15,047 Gossiper.java (line 194) InetAddress /38.96.191.40 is now dead.
> WARN [MESSAGING-SERVICE-POOL:2] 2010-02-17 11:21:50,307 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/38.96.191.41:7000 remote=/38.96.191.39:50133]
> WARN [MESSAGING-SERVICE-POOL:2] 2010-02-17 11:21:50,307 TcpConnection.java (line 485) Exception was generated at : 02/17/2010 11:21:50 on thread MESSAGING-SERVICE-POOL:2
> Reached an EOL or something bizzare occured. Reading from: /38.96.191.39 BufferSizeRemaining: 16
> java.io.IOException: Reached an EOL or something bizzare occured. Reading from: /38.96.191.39 BufferSizeRemaining: 16
>         at org.apache.cassandra.net.io.StartState.doRead(StartState.java:44)
>         at org.apache.cassandra.net.io.ProtocolState.read(ProtocolState.java:39)
>         at org.apache.cassandra.net.io.TcpReader.read(TcpReader.java:95)
>         at org.apache.cassandra.net.TcpConnection$ReadWorkItem.run(TcpConnection.java:445)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
>
> As you can see, it's as if after some time the communication between nodes
> just goes away: a node is declared dead, but a few milliseconds later it is
> up again, and during that short window all the inserts are simply lost. It
> occurs only after some time, meaning once data has been loaded into the
> memtable. If someone can help me understand what's going on: is the machine
> itself overloaded? It's a single physical machine with 4 virtual machines
> sharing the same disk; is that the cause?
>
> Thanks for any help
> Richard
Re: StackOverflowError on high load
On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com wrote:
> If it's the data directory, then I have a pretty big one. Maybe it's
> something else
>
> $ df -h /outbrain/cassandra/data/
> Filesystem                  Size  Used Avail Use% Mounted on
> /dev/mapper/cassandra-data   97G   11G   82G  12% /outbrain/cassandra/data

Perhaps a temporary file? The JVM defaults to /tmp, which may be on a smaller (root) partition?

-+ Tatu +-
Re: Testing row cache feature in trunk: write should put record in cache
OK, I'll work on the change later, because there's another problem to solve: the overhead of the cache is too big; 1.4M records (1k each) consumed all 6GB of JVM memory (I guess 4GB are consumed by the row cache). I'm thinking that ConcurrentHashMap is not a good choice for an LRU, and that the row cache needs to store compressed key data to reduce memory usage. I'll do more investigation on this and let you know.

-Weijun

On Tue, Feb 16, 2010 at 9:22 PM, Jonathan Ellis jbel...@gmail.com wrote:
> ... tell you what, if you write the option-processing part in
> DatabaseDescriptor I will do the actual cache part. :)
>
> On Tue, Feb 16, 2010 at 11:07 PM, Jonathan Ellis jbel...@gmail.com wrote:
>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa, but this
>> is pretty low priority for me.
>>
>> On Tue, Feb 16, 2010 at 8:37 PM, Weijun Li weiju...@gmail.com wrote:
>>> Just tried to make a quick change to enable it, but it didn't work out :-(
>>>
>>>     ColumnFamily cachedRow = cfs.getRawCachedRow(mutation.key());
>>>     // What I modified
>>>     if (cachedRow == null) {
>>>         cfs.cacheRow(mutation.key());
>>>         cachedRow = cfs.getRawCachedRow(mutation.key());
>>>     }
>>>     if (cachedRow != null)
>>>         cachedRow.addAll(columnFamily);
>>>
>>> How can I open a ticket for you to make the change (enable row cache
>>> write-through with an option)?
>>>
>>> Thanks,
>>> -Weijun
>>>
>>> On Tue, Feb 16, 2010 at 5:20 PM, Jonathan Ellis jbel...@gmail.com wrote:
>>>> On Tue, Feb 16, 2010 at 7:11 PM, Weijun Li weiju...@gmail.com wrote:
>>>>> Just started to play with the row cache feature in trunk: it seems to
>>>>> be working fine so far, except that for the RowsCached parameter you
>>>>> need to specify a number of rows rather than a percentage (e.g., 20%
>>>>> doesn't work).
>>>>
>>>> 20% works, but it's 20% of the rows at server startup. So on a fresh
>>>> start that is zero. Maybe we should just get rid of the % feature...
>>>> (Actually, it shouldn't be hard to update this on flush, if you want
>>>> to open a ticket.)
Re: Testing row cache feature in trunk: write should put record in cache
Great!

On Wed, Feb 17, 2010 at 1:51 PM, Weijun Li weiju...@gmail.com wrote:
> OK, I'll work on the change later, because there's another problem to
> solve: the overhead of the cache is too big; 1.4M records (1k each)
> consumed all 6GB of JVM memory ...
> [rest of quoted message snipped; see above]
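A side note on the unbounded-growth problem Weijun describes: a plain hash map (such as ConcurrentHashMap) never evicts, so a row cache built on one grows until the heap is exhausted. A size-bounded LRU is the standard fix. Below is a minimal single-threaded Python sketch of the idea — illustrative only, not the Cassandra implementation, which would also need thread safety:

```python
from collections import OrderedDict

class BoundedRowCache:
    """LRU row cache capped at `capacity` entries (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._rows = OrderedDict()  # insertion order tracks recency

    def get(self, key):
        if key not in self._rows:
            return None
        self._rows.move_to_end(key)  # mark as most recently used
        return self._rows[key]

    def put(self, key, row):
        self._rows[key] = row
        self._rows.move_to_end(key)
        if len(self._rows) > self.capacity:
            self._rows.popitem(last=False)  # evict least recently used
```

With a cap, memory use is bounded by capacity times the average row size (plus per-entry overhead), instead of by the write volume.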
Re: Row with many columns
Hm, pity. I have a table where 10 rows have 10 columns of about 200 bytes each. So if I read only these 10 records, only the nodes that hold these rows do any work; the other nodes sit idle. This is bad, and Cassandra doesn't provide any solution for it.
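A common application-level workaround, not a built-in Cassandra feature, is to shard a hot logical row across several physical row keys so the load spreads over more of the ring; reads then fan out over all shards and merge the columns client-side. A hypothetical sketch (key format and helper names are made up):

```python
import random

def write_key(logical_key, n_shards):
    # Writes pick one shard at random, spreading load across the
    # nodes that own the different physical keys.
    return f"{logical_key}:{random.randrange(n_shards)}"

def read_keys(logical_key, n_shards):
    # Reads must fetch every shard and merge the columns client-side.
    return [f"{logical_key}:{i}" for i in range(n_shards)]
```

The trade-off is extra read fan-out and client-side merging, so it only pays off when a few rows genuinely dominate the traffic.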
Re: StackOverflowError on high load
I ran the process again, and after a few hours the same node crashed the same way. Now I can tell for sure this is indeed what Jonathan proposed: the data directory needs to be 2x of what it is. But it looks like a design problem; how large do I need to tell my admin to make it, then? Here's what I see when the server crashes:

$ df -h /outbrain/cassandra/data/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/cassandra-data   97G   46G   47G  50% /outbrain/cassandra/data

The directory is 97G, and when the host crashes it's at 50% use. I'm also monitoring various JMX counters, and I see that COMPACTION-POOL PendingTasks grows for a while on this host (not on the other host, btw, which is fine; just this host) and then stays flat for 3 hours. After 3 hours of flat it crashes. I'm attaching the graph.

When I restart Cassandra on this host (no change to file allocation size, just a restart) it manages to compact the data files pretty fast, so after a minute I get to 12% use. So I wonder what made it crash before that doesn't now? (It could be the load, which isn't running now.)

$ df -h /outbrain/cassandra/data/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/cassandra-data   97G   11G   82G  12% /outbrain/cassandra/data

The question is: what size does the data directory need to be? It's not 2x the size of the data I expect to have (I only have 11G of real data after compaction and the dir is 97G, so that should have been enough). If it's 2x of something dynamic that keeps growing and isn't bounded, then it'll just grow infinitely, right? What's the bound? Alternatively, which JMX counter thresholds are the best indicators of a crash that's about to happen?

Thanks

On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta tsalora...@gmail.com wrote:
> On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com wrote:
>> If it's the data directory, then I have a pretty big one. Maybe it's
>> something else
>>
>> $ df -h /outbrain/cassandra/data/
>> Filesystem                  Size  Used Avail Use% Mounted on
>> /dev/mapper/cassandra-data   97G   11G   82G  12% /outbrain/cassandra/data
>
> Perhaps a temporary file? The JVM defaults to /tmp, which may be on a
> smaller (root) partition?
>
> -+ Tatu +-

attachment: Zenoss_test2.nydc1.outbrain.com.png
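As a rough answer to the threshold question above: since a major compaction can temporarily need free space equal to the live data, one simple guard is to alert once the data volume passes about 50% usage, before compaction can no longer run. A sketch using only the Python standard library (the path is the example from this thread):

```python
import shutil

def compaction_headroom_ok(used_bytes, total_bytes):
    # A major compaction may temporarily need a second copy of the live
    # data, so require free space >= used space (i.e. usage <= 50%).
    return total_bytes - used_bytes >= used_bytes

def check_data_volume(path="/outbrain/cassandra/data"):
    # Returns False once the volume is too full to safely compact.
    usage = shutil.disk_usage(path)
    return compaction_headroom_ok(usage.used, usage.total)
```

Run from cron or a monitoring agent, this flags the danger zone well before "No space left on device" appears; at the 46G-used/97G-total point reported above the check is already on the edge.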