Git Push Summary
Updated Tags: refs/tags/cassandra-1.0.10 [created] 2ad9d19f1
Git Push Summary
Updated Tags: refs/tags/1.0.10-tentative [deleted] b2ca7f821
svn commit: r1335384 - in /cassandra/site: publish/download/index.html src/settings.py
Author: slebresne
Date: Tue May  8 09:02:53 2012
New Revision: 1335384

URL: http://svn.apache.org/viewvc?rev=1335384&view=rev
Log:
Update website for 1.0.10 release

Modified:
    cassandra/site/publish/download/index.html
    cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1335384&r1=1335383&r2=1335384&view=diff
==============================================================================
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Tue May  8 09:02:53 2012
@@ -105,16 +105,16 @@
 <p>
 Previous stable branches of Cassandra continue to see periodic maintenance
 for some time after a new major release is made. The lastest release on the
-1.0 branch is 1.0.9 (released on
-2012-04-06).
+1.0 branch is 1.0.10 (released on
+2012-05-08).
 </p>
 <ul>
 <li>
-<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.0.9/apache-cassandra-1.0.9-bin.tar.gz">apache-cassandra-1.0.9-bin.tar.gz</a>
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-bin.tar.gz.asc">PGP</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-bin.tar.gz.md5">MD5</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-bin.tar.gz.sha1">SHA1</a>]
+<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.0.10/apache-cassandra-1.0.10-bin.tar.gz">apache-cassandra-1.0.10-bin.tar.gz</a>
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-bin.tar.gz.asc">PGP</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-bin.tar.gz.md5">MD5</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-bin.tar.gz.sha1">SHA1</a>]
 </li>
 </ul>
@@ -157,10 +157,10 @@
 </li>
 <li>
-<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.0.9/apache-cassandra-1.0.9-src.tar.gz">apache-cassandra-1.0.9-src.tar.gz</a>
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-src.tar.gz.asc">PGP</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-src.tar.gz.md5">MD5</a>]
-[<a href="http://www.apache.org/dist/cassandra/1.0.9/apache-cassandra-1.0.9-src.tar.gz.sha1">SHA1</a>]
+<a class="filename" href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.0.10/apache-cassandra-1.0.10-src.tar.gz">apache-cassandra-1.0.10-src.tar.gz</a>
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-src.tar.gz.asc">PGP</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-src.tar.gz.md5">MD5</a>]
+[<a href="http://www.apache.org/dist/cassandra/1.0.10/apache-cassandra-1.0.10-src.tar.gz.sha1">SHA1</a>]
 </li>

Modified: cassandra/site/src/settings.py
URL: http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1335384&r1=1335383&r2=1335384&view=diff
==============================================================================
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Tue May  8 09:02:53 2012
@@ -92,8 +92,8 @@
 SITE_POST_PROCESSORS = {
 }

 class CassandraDef(object):
-    oldstable_version = '1.0.9'
-    oldstable_release_date = '2012-04-06'
+    oldstable_version = '1.0.10'
+    oldstable_release_date = '2012-05-08'
     oldstable_exists = True
     veryoldstable_version = '0.8.10'
     veryoldstable_release_date = '2012-02-13'
[jira] [Commented] (CASSANDRA-2598) incremental_backups and snapshot_before_compaction duplicate hard links
[ https://issues.apache.org/jira/browse/CASSANDRA-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270362#comment-13270362 ]

André Cruz commented on CASSANDRA-2598:
---------------------------------------

I think I have a similar issue, but I don't have incremental_backups or snapshot_before_compaction enabled, and I'm using 1.1. Since I upgraded to Cassandra 1.1, I get the following error when trying to delete a CF. After this happens the CF is not accessible anymore, but I cannot create another one with the same name until I restart the server.

INFO [MigrationStage:1] 2012-05-07 18:10:12,682 ColumnFamilyStore.java (line 634) Enqueuing flush of Memtable-schema_columnfamilies@1128094887(978/1222 serialized/live bytes, 21 ops)
INFO [FlushWriter:2] 2012-05-07 18:10:12,682 Memtable.java (line 266) Writing Memtable-schema_columnfamilies@1128094887(978/1222 serialized/live bytes, 21 ops)
INFO [FlushWriter:2] 2012-05-07 18:10:12,720 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-28-Data.db (1041 bytes)
INFO [MigrationStage:1] 2012-05-07 18:10:12,721 ColumnFamilyStore.java (line 634) Enqueuing flush of Memtable-schema_columns@1599271050(392/490 serialized/live bytes, 8 ops)
INFO [FlushWriter:2] 2012-05-07 18:10:12,722 Memtable.java (line 266) Writing Memtable-schema_columns@1599271050(392/490 serialized/live bytes, 8 ops)
INFO [CompactionExecutor:8] 2012-05-07 18:10:12,722 CompactionTask.java (line 114) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-26-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-28-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-27-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-25-Data.db')]
INFO [FlushWriter:2] 2012-05-07 18:10:12,806 Memtable.java (line 307) Completed flushing /var/lib/cassandra/data/system/schema_columns/system-schema_columns-hc-23-Data.db (447 bytes)
INFO [CompactionExecutor:8] 2012-05-07 18:10:12,811 CompactionTask.java (line 225) Compacted to [/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hc-29-Data.db,]. 24,797 to 21,431 (~86% of original) bytes for 2 keys at 0.232252MB/s. Time: 88ms.
ERROR [MigrationStage:1] 2012-05-07 18:10:12,895 CLibrary.java (line 158) Unable to create hard link
com.sun.jna.LastErrorException: errno was 17
    at org.apache.cassandra.utils.CLibrary.link(Native Method)
    at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:150)
    at org.apache.cassandra.db.Directories.snapshotLeveledManifest(Directories.java:343)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1450)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1483)
    at org.apache.cassandra.db.DefsTable.dropColumnFamily(DefsTable.java:512)
    at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:403)
    at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)
    at org.apache.cassandra.service.MigrationManager$1.call(MigrationManager.java:214)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [Thrift:17] 2012-05-07 18:10:12,898 CustomTThreadPoolServer.java (line 204) Error occurred during processing of message.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.io.IOError: java.io.IOException: Unable to create hard link from /var/lib/cassandra/data/Disco/Client/Client.json to /var/lib/cassandra/data/Disco/Client/snapshots/1336410612893-Client/Client.json (errno 17)
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:372)
    at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:191)
    at org.apache.cassandra.service.MigrationManager.announceColumnFamilyDrop(MigrationManager.java:182)
    at org.apache.cassandra.thrift.CassandraServer.system_drop_column_family(CassandraServer.java:948)
    at org.apache.cassandra.thrift.Cassandra$Processor$system_drop_column_family.getResult(Cassandra.java:3348)
    at org.apache.cassandra.thrift.Cassandra$Processor$system_drop_column_family.getResult(Cassandra.java:3336)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at
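On Linux, errno 17 is EEXIST: the hard-link target already exists, here because the snapshot directory already contains a link with that name. A minimal, self-contained sketch of that failure mode, using java.nio rather than the JNA CLibrary.link path shown in the trace above (file names here are illustrative, not Cassandra's):

```java
import java.nio.file.*;

public class HardLinkEexist {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("snapshot-demo");
        Path data = Files.createFile(dir.resolve("Client.json"));
        Path link = dir.resolve("snapshot-Client.json");

        Files.createLink(link, data);      // first snapshot link succeeds
        try {
            Files.createLink(link, data);  // retry while the link still exists
        } catch (FileAlreadyExistsException e) {
            // java.nio surfaces what the native link(2) call reports as errno 17 (EEXIST)
            System.out.println("link already exists: " + e.getFile());
        }
    }
}
```

Under java.nio the condition surfaces as FileAlreadyExistsException; Cassandra's direct link(2) call instead sees the raw errno and throws com.sun.jna.LastErrorException.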
[jira] [Commented] (CASSANDRA-4219) Problem with creating keyspace after drop
[ https://issues.apache.org/jira/browse/CASSANDRA-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270458#comment-13270458 ]

Jeff Williams commented on CASSANDRA-4219:
------------------------------------------

Anything useful in that log? I seem to have replicated the issue somehow.

Firstly, I moved the servers onto public IPs, though the last octet is the same:

nodetool -h meta01 ring
Address        DC   Rack  Status  State   Load      Owns     Token
                                                             113427455640312821154458202477256070485
91.223.192.25  CPH  R1    Up      Normal  11.2 MB   33.33%   0
91.223.192.26  CPH  R1    Up      Normal  15.16 MB  33.33%   56713727820156410577229101238628035242
91.223.192.24  CPH  R1    Up      Normal  20.11 MB  33.33%   113427455640312821154458202477256070485

I created a new keyspace PlayLog2 (PlayLog still does not work) and a column family playlog. This was available on all nodes. I then ran a few test inserts, which worked fine. Then, to test fail-over, I shut down the node 91.223.192.26 during inserts. The inserts completed fine, and a while later I restarted the node 91.223.192.26. Then, when I went to re-run my tests, I saw (Hector client):

5710 [Thread-1] DEBUG me.prettyprint.cassandra.connection.client.HThriftClient - Creating a new thrift connection to meta02.cph.aspiro.com(91.223.192.26):9160
5711 [Thread-0] DEBUG me.prettyprint.cassandra.connection.client.HThriftClient - keyspace reseting from null to PlayLog2
Exception in thread "Thread-1" me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Keyspace PlayLog2 does not exist)

Sure enough, from the command line client on 91.223.192.26, I see no PlayLog2 keyspace, yet it exists on 91.223.192.24 and 91.223.192.25. I have attached the system.log from 91.223.192.26 in the hope that it is useful.
Problem with creating keyspace after drop
-----------------------------------------

Key: CASSANDRA-4219
URL: https://issues.apache.org/jira/browse/CASSANDRA-4219
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.1.0
Environment: Debian 6.0.4 x64
Reporter: Jeff Williams
Fix For: 1.1.1
Attachments: 0001-Add-debug-logs.txt, system-debug.log.gz, system-startup-debug.log.gz, system.log.gz

Hi,

I'm doing testing and wanted to drop a keyspace (with a column family) to re-add it with a different strategy. So I ran in cqlsh:

DROP KEYSPACE PlayLog;
CREATE KEYSPACE PlayLog WITH strategy_class = 'SimpleStrategy'
AND strategy_options:replication_factor = 2;

And everything seemed to be fine. I ran some inserts, which also seemed to go fine, but then selecting them gave me:

cqlsh:PlayLog> select count(*) from playlog;
TSocket read 0 bytes

I wasn't sure what was wrong, so I tried dropping and creating again, and now when I try to create I get:

cqlsh> CREATE KEYSPACE PlayLog WITH strategy_class = 'SimpleStrategy'
  ...  AND strategy_options:replication_factor = 2;
TSocket read 0 bytes

And the keyspace doesn't get created. In the log it shows:

ERROR [Thrift:4] 2012-05-03 18:23:05,124 CustomTThreadPoolServer.java (line 204) Error occurred during processing of message.
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:372)
    at org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:191)
    at org.apache.cassandra.service.MigrationManager.announceNewKeyspace(MigrationManager.java:129)
    at org.apache.cassandra.cql.QueryProcessor.processStatement(QueryProcessor.java:701)
    at org.apache.cassandra.cql.QueryProcessor.process(QueryProcessor.java:875)
    at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1235)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3458)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3446)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError
    at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:368)
    ... 13 more
Caused by: java.lang.AssertionError
    at org.apache.cassandra.db.DefsTable.updateKeyspace(DefsTable.java:441)
    at org.apache.cassandra.db.DefsTable.mergeKeyspaces(DefsTable.java:339)
    at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:269)
    at org.apache.cassandra.service.MigrationManager$1.call(MigrationManager.java:214)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    ... 3 more
ERROR [MigrationStage:1] 2012-05-03 18:23:05,124 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[MigrationStage:1,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.DefsTable.updateKeyspace(DefsTable.java:441)
    at org.apache.cassandra.db.DefsTable.mergeKeyspaces(DefsTable.java:339)
    at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:269)
    at org.apache.cassandra.service.MigrationManager$1.call(MigrationManager.java:214)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any ideas how I can recover from this? I am running version 1.1.0 and have tried nodetool repair, cleanup, compact. I can create other keyspaces, but still can't create a keyspace called PlayLog even though it is not listed anywhere.

Jeff
[jira] [Updated] (CASSANDRA-4219) Problem with creating keyspace after drop
[ https://issues.apache.org/jira/browse/CASSANDRA-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Williams updated CASSANDRA-4219:
-------------------------------------

Attachment: system-91.223.192.26.log.gz
[jira] [Updated] (CASSANDRA-4221) Error while deleting a columnfamily that is being compacted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Patterson updated CASSANDRA-4221:
---------------------------------------

Description:
The following dtest command produces an error:

{code}
export CASSANDRA_VERSION=git:cassandra-1.1; nosetests --nocapture --nologcapture concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.load_test
{code}

Here is the error:

{code}
Error occured during compaction
java.util.concurrent.ExecutionException: java.io.IOError: java.io.FileNotFoundException: /tmp/dtest-6ECMgy/test/node1/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-47-Data.db (No such file or directory)
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
    at java.util.concurrent.FutureTask.get(FutureTask.java:111)
    at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:239)
    at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:1580)
    at org.apache.cassandra.service.StorageService.forceTableCompaction(StorageService.java:1770)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:226)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:251)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:857)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:795)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1450)
    at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:90)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1285)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1383)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:807)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.io.IOError: java.io.FileNotFoundException: /tmp/dtest-6ECMgy/test/node1/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-47-Data.db (No such file or directory)
    at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:61)
    at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:839)
    at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:851)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:142)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:148)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:121)
    at org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:264)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    ...
[jira] [Commented] (CASSANDRA-4219) Problem with creating keyspace after drop
[ https://issues.apache.org/jira/browse/CASSANDRA-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270509#comment-13270509 ]

Jeff Williams commented on CASSANDRA-4219:
------------------------------------------

Ok, I can now reproduce this on my cluster. If I start with all three servers running, and on one of the servers create a keyspace, create a column family, and test, it all works fine. If I then drop the keyspace and re-create it, everything continues to work. However, as soon as one of the nodes is restarted, the keyspace disappears on that node. If I restart every node in the cluster, the keyspace cannot be seen anywhere; however, I still can no longer create a keyspace with that name.
[jira] [Assigned] (CASSANDRA-4219) Problem with creating keyspace after drop
[ https://issues.apache.org/jira/browse/CASSANDRA-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-4219:
-----------------------------------------

Assignee: Pavel Yaskevich
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270545#comment-13270545 ] Jonathan Ellis commented on CASSANDRA-4223: --- ... actually just an AtomicLong counter should be fine. If we have more than 2^63 streaming sessions I'll live w/ the overflow. :) Non Unique Streaming session ID's - Key: CASSANDRA-4223 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 10.04.2 LTS java version 1.6.0_24 Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode) Bare metal servers from https://www.stormondemand.com/servers/baremetal.html The servers run on a custom hypervisor. Reporter: Aaron Morton Assignee: Aaron Morton Labels: datastax_qa Fix For: 1.0.11, 1.1.1 Attachments: NanoTest.java, fmm streaming bug.txt I have observed repair processes failing due to duplicate Streaming session ID's. In this installation it is preventing rebalance from completing. I believe it has also prevented repair from completing in the past. The attached streaming-logs.txt file contains log messages and an explanation of what was happening during a repair operation. it has the evidence for duplicate session ID's. The duplicate session id's were generated on the repairing node and sent to the streaming node. The streaming source replaced the first session with the second which resulted in both sessions failing when the first FILE_COMPLETE message was received. 
The errors were: {code:java} DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED) ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main] java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195) at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) {code} and {code:java} DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED) ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:2,5,main] java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db but is null at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195) at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) {code} I think this 
is because System.nanoTime() is used for the session ID when creating the StreamInSession objects (driven from StorageService.requestRanges()). From the documentation (http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime()) {quote} This method provides nanosecond precision, but not necessarily nanosecond accuracy. No guarantees are made about how frequently values change. {quote} There is also some info on clocks and timers at https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks The hypervisor may be at fault here, but it seems we cannot rely on successive calls to nanoTime() to return different values. To avoid message/interface changes on the StreamHeader it would be good to keep the session ID a long. The simplest approach may be to make successive calls to nanoTime() until the result changes. We could fail if a certain number of milliseconds have passed. Hashing the file names and ranges is also a possibility,
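The two fixes floated in this thread (Jonathan Ellis's AtomicLong counter, and the reporter's idea of retrying nanoTime() until the value changes) can be sketched as follows. This is an illustrative sketch only: the class `StreamSessionIds` and its method names are hypothetical, not Cassandra's actual implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the proposed fix: session IDs drawn from a
// monotonically increasing counter are guaranteed unique within a process,
// unlike raw System.nanoTime(), which may return the same value on
// successive calls (notably under a hypervisor).
public class StreamSessionIds
{
    // Seeded from nanoTime so IDs still tend to differ across restarts.
    private static final AtomicLong counter = new AtomicLong(System.nanoTime());

    public static long next()
    {
        return counter.incrementAndGet();
    }

    public static void main(String[] args)
    {
        long a = next();
        long b = next();
        if (a == b)
            throw new AssertionError("session IDs must be unique");
        System.out.println("distinct: " + (a != b));
    }
}
```

Keeping the ID a long, as the report suggests, would avoid any change to the StreamHeader wire format.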
[jira] [Created] (CASSANDRA-4225) EC2 nodes randomly hard-crash the machine on newest EC2 Linux AMI
Delaney Manders created CASSANDRA-4225: -- Summary: EC2 nodes randomly hard-crash the machine on newest EC2 Linux AMI Key: CASSANDRA-4225 URL: https://issues.apache.org/jira/browse/CASSANDRA-4225 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.0 Environment: Amazon Linux AMI release 2012.03 3.2.12-3.2.4.amzn1.x86_64 m1.xlarge Nodes have: Cassandra built and installed from source. Ant binary (apache-ant-1.8.3-bin.tar.gz), automake(1.11.1), autoconf(2.64), libtool(2.2.10) installed from AWS repository. Sun Java: java -version java version "1.6.0_31" Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Only system changes are: echo "root soft memlock unlimited" | sudo tee -a /etc/security/limits.conf echo "root hard memlock unlimited" | sudo tee -a /etc/security/limits.conf Setup scripts available. Cassandra cluster has two datacenters, with DC1 having 8 nodes and DC2 having 4, DC2 being reserved for Hadoop jobs. DC2 nodes have not had the same frequency of hard crashes, though it has happened. Storage is set up with 4 ephemeral drives raided for commit, 4 EBS drives raided for storage. Usage is exclusively write, with all mutations being done in batch mutations, where each batch mutation has a set of columns added/modified to a single key. There are ~2000 threads streaming batch mutations from a web edge of varying size, distributed across DC1. Client is Hector(1.0-5) w/ DynamicLoadBalancing. In an effort to mitigate this issue, I've removed jna.jar and platform.jar from $CASSANDRA_HOME/lib, and set disk_access_mode: standard in $CASSANDRA_HOME/conf/cassandra.yaml. Neither has seemed to help. Reporter: Delaney Manders At fairly random intervals, about once/day, one of my Cassandra nodes does a hard crash (kernel panic). I can find no system logs (/var/log/*) which have any errors. No Cassandra logs have any errors. 
On one machine I was watching as it went down, and caught the following message: Message from syslogd@domU-12-31-38-00-64-31 at May 3 18:24:17 ... kernel:[252906.019808] Oops: 0002 [#1] SMP An AWS support guy found one entry in the console logs: [30178.298308] Pid: 2238, comm: java Not tainted 3.2.12-3.2.4.amzn1.x86_64 #1 I've replaced two of the nodes with new instances, but all are showing the same behaviour. It's very reproducible on my system, though it takes a little waiting. Leaving it running is no big deal for another day or so; I just need to restart Cassandra every once in a while when I get alerted. I'm open to any additional requested debugging steps before bailing and going back to 1.0.9. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4225) EC2 nodes randomly hard-crash the machine on newest EC2 Linux AMI
[ https://issues.apache.org/jira/browse/CASSANDRA-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270582#comment-13270582 ] Delaney Manders commented on CASSANDRA-4225: Would be happy to provide login credentials to the most recently crashed machine to an active contributor who wants to see the environment first-hand.
[jira] [Commented] (CASSANDRA-4225) EC2 nodes randomly hard-crash the machine on newest EC2 Linux AMI
[ https://issues.apache.org/jira/browse/CASSANDRA-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270605#comment-13270605 ] Brandon Williams commented on CASSANDRA-4225: We shouldn't be able to crash the kernel at all (or we'd be crashing _every_ kernel by doing something stupid like writing to /dev/mem), but running as a non-root user should provide a smoking gun for the upstream kernel devs. Suggested as much on IRC.
[jira] [Commented] (CASSANDRA-4225) EC2 nodes randomly hard-crash the machine on newest EC2 Linux AMI
[ https://issues.apache.org/jira/browse/CASSANDRA-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270606#comment-13270606 ] Delaney Manders commented on CASSANDRA-4225: Deploying that change now; I'll report back with the next crash. Thanks Brandon. :)
[2/15] git commit: remove unnecessary class
remove unnecessary class Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/16f6b0ca Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/16f6b0ca Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/16f6b0ca Branch: refs/heads/trunk Commit: 16f6b0ca10de59a1b6e1eb2eb7c899b700890680 Parents: 587cb58 Author: Yuki Morishita mor.y...@gmail.com Authored: Tue Apr 10 13:23:52 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue May 8 12:51:11 2012 -0500 -- .../cassandra/gms/GossipShutdownMessage.java | 34 --- 1 files changed, 0 insertions(+), 34 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/16f6b0ca/src/java/org/apache/cassandra/gms/GossipShutdownMessage.java -- diff --git a/src/java/org/apache/cassandra/gms/GossipShutdownMessage.java b/src/java/org/apache/cassandra/gms/GossipShutdownMessage.java deleted file mode 100644 index 5bcd646..000 --- a/src/java/org/apache/cassandra/gms/GossipShutdownMessage.java +++ /dev/null @@ -1,34 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * License); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an AS IS BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package org.apache.cassandra.gms; - -import org.apache.cassandra.io.IVersionedSerializer; - -import java.io.DataInput; -import java.io.DataOutput; -import java.io.IOException; - -/** - * This message indicates the gossiper is shutting down - */ -class GossipShutdownMessage -{ -GossipShutdownMessage() -{ -} -}
[4/15] serializer() -> serializer and some clean up
http://git-wip-us.apache.org/repos/asf/cassandra/blob/587cb582/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java -- diff --git a/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java b/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java index 947de9b..ac46c7b 100644 --- a/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java +++ b/src/java/org/apache/cassandra/streaming/StreamingRepairTask.java @@ -242,9 +242,7 @@ public class StreamingRepairTask implements Runnable dos.writeUTF(task.cfName); dos.writeInt(task.ranges.size()); for (Range<Token> range : task.ranges) -{ -AbstractBounds.serializer().serialize(range, dos, version); -} +AbstractBounds.serializer.serialize(range, dos, version); // We don't serialize the callback on purpose } @@ -259,9 +257,7 @@ public class StreamingRepairTask implements Runnable int rangesCount = dis.readInt(); List<Range<Token>> ranges = new ArrayList<Range<Token>>(rangesCount); for (int i = 0; i < rangesCount; ++i) -{ -ranges.add((Range<Token>) AbstractBounds.serializer().deserialize(dis, version).toTokenBounds()); -} +ranges.add((Range<Token>) AbstractBounds.serializer.deserialize(dis, version).toTokenBounds()); return new StreamingRepairTask(id, owner, src, dst, tableName, cfName, ranges, makeReplyingCallback(owner, id)); } @@ -273,7 +269,7 @@ public class StreamingRepairTask implements Runnable size += FBUtilities.serializedUTF8Size(task.cfName); size += DBTypeSizes.NATIVE.sizeof(task.ranges.size()); for (Range<Token> range : task.ranges) -size += AbstractBounds.serializer().serializedSize(range, version); +size += AbstractBounds.serializer.serializedSize(range, version); return size; } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/587cb582/src/java/org/apache/cassandra/utils/MerkleTree.java -- diff --git a/src/java/org/apache/cassandra/utils/MerkleTree.java b/src/java/org/apache/cassandra/utils/MerkleTree.java index d012912..40d90a1 100644 ---
a/src/java/org/apache/cassandra/utils/MerkleTree.java +++ b/src/java/org/apache/cassandra/utils/MerkleTree.java @@ -695,7 +695,7 @@ public class MerkleTree implements Serializable dos.writeInt(inner.hash.length); dos.write(inner.hash); } -Token.serializer().serialize(inner.token, dos); +Token.serializer.serialize(inner.token, dos); Hashable.serializer.serialize(inner.lchild, dos, version); Hashable.serializer.serialize(inner.rchild, dos, version); } @@ -706,7 +706,7 @@ public class MerkleTree implements Serializable byte[] hash = hashLen >= 0 ? new byte[hashLen] : null; if (hash != null) dis.readFully(hash); -Token token = Token.serializer().deserialize(dis); +Token token = Token.serializer.deserialize(dis); Hashable lchild = Hashable.serializer.deserialize(dis, version); Hashable rchild = Hashable.serializer.deserialize(dis, version); return new Inner(token, lchild, rchild); @@ -718,7 +718,7 @@ public class MerkleTree implements Serializable ? DBTypeSizes.NATIVE.sizeof(-1) : DBTypeSizes.NATIVE.sizeof(inner.hash().length) + inner.hash().length; -size += Token.serializer().serializedSize(inner.token, DBTypeSizes.NATIVE) +size += Token.serializer.serializedSize(inner.token, DBTypeSizes.NATIVE) + Hashable.serializer.serializedSize(inner.lchild, version) + Hashable.serializer.serializedSize(inner.rchild, version); return size; http://git-wip-us.apache.org/repos/asf/cassandra/blob/587cb582/test/unit/org/apache/cassandra/Util.java -- diff --git a/test/unit/org/apache/cassandra/Util.java b/test/unit/org/apache/cassandra/Util.java index a55787c..2dd0e36 100644 --- a/test/unit/org/apache/cassandra/Util.java +++ b/test/unit/org/apache/cassandra/Util.java @@ -278,7 +278,7 @@ public class Util { ByteArrayOutputStream baos = new ByteArrayOutputStream(); DataOutputStream dos = new DataOutputStream(baos); -cf.serializer().serializeForSSTable(cf, dos); +cf.serializer.serializeForSSTable(cf, dos); return ByteBuffer.wrap(baos.toByteArray()); } }
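The `serializer()` → `serializer` change in this patch replaces accessor methods with shared `public static final` instances, so call sites read `Foo.serializer.serialize(...)` with no per-call indirection. A minimal sketch of the pattern, with hypothetical names (`LongSerializer` and `SerializerFieldDemo` are illustrative, not Cassandra classes):

```java
// Minimal sketch of the refactor in this patch: a stateless, immutable
// serializer exposed as a singleton field rather than behind an accessor
// method. Safe because the serializer holds no per-call state.
public class SerializerFieldDemo
{
    static class LongSerializer
    {
        public String serialize(long value) { return Long.toString(value); }
    }

    // One shared instance for the whole process.
    public static final LongSerializer serializer = new LongSerializer();

    public static void main(String[] args)
    {
        System.out.println(serializer.serialize(42L)); // prints 42
    }
}
```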
[5/15] git commit: cleanup CallbackInfo and mark deprecated Verbs
cleanup CallbackInfo and mark deprecated Verbs Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dd020e1f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dd020e1f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dd020e1f Branch: refs/heads/trunk Commit: dd020e1fd97019af4a2be66813071540b3d1be28 Parents: f81cc74 Author: Yuki Morishita mor.y...@gmail.com Authored: Wed Apr 4 16:26:21 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue May 8 12:41:01 2012 -0500 -- .../org/apache/cassandra/net/CallbackInfo.java | 15 +-- .../org/apache/cassandra/net/MessagingService.java | 12 ++-- 2 files changed, 15 insertions(+), 12 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/dd020e1f/src/java/org/apache/cassandra/net/CallbackInfo.java -- diff --git a/src/java/org/apache/cassandra/net/CallbackInfo.java b/src/java/org/apache/cassandra/net/CallbackInfo.java index 1def33a..a5fc8ad 100644 --- a/src/java/org/apache/cassandra/net/CallbackInfo.java +++ b/src/java/org/apache/cassandra/net/CallbackInfo.java @@ -1,4 +1,4 @@ -/** +/* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information @@ -15,7 +15,6 @@ * See the License for the specific language governing permissions and * limitations under the License. */ - package org.apache.cassandra.net; import java.net.InetAddress; @@ -35,12 +34,16 @@ public class CallbackInfo protected final MessageOut<?> sentMessage; protected final IVersionedSerializer<?> serializer; +/** + * Create CallbackInfo without sent message + * + * @param target target to send message + * @param callback + * @param serializer serializer to deserialize response message + */ public CallbackInfo(InetAddress target, IMessageCallback callback, IVersionedSerializer<?>
serializer) { -this.target = target; -this.callback = callback; -this.serializer = serializer; -this.sentMessage = null; +this(target, callback, null, serializer); } public CallbackInfo(InetAddress target, IMessageCallback callback, MessageOut<?> sentMessage, IVersionedSerializer<?> serializer) http://git-wip-us.apache.org/repos/asf/cassandra/blob/dd020e1f/src/java/org/apache/cassandra/net/MessagingService.java -- diff --git a/src/java/org/apache/cassandra/net/MessagingService.java b/src/java/org/apache/cassandra/net/MessagingService.java index 4f3b603..5fa4d31 100644 --- a/src/java/org/apache/cassandra/net/MessagingService.java +++ b/src/java/org/apache/cassandra/net/MessagingService.java @@ -87,27 +87,27 @@ public final class MessagingService implements MessagingServiceMBean public enum Verb { MUTATION, -BINARY, // Deprecated +@Deprecated BINARY, READ_REPAIR, READ, REQUEST_RESPONSE, // client-initiated reads and writes -STREAM_INITIATE, // Deprecated -STREAM_INITIATE_DONE, // Deprecated +@Deprecated STREAM_INITIATE, +@Deprecated STREAM_INITIATE_DONE, STREAM_REPLY, STREAM_REQUEST, RANGE_SLICE, BOOTSTRAP_TOKEN, TREE_REQUEST, TREE_RESPONSE, -JOIN, // Deprecated +@Deprecated JOIN, GOSSIP_DIGEST_SYN, GOSSIP_DIGEST_ACK, GOSSIP_DIGEST_ACK2, -DEFINITIONS_ANNOUNCE, // Deprecated +@Deprecated DEFINITIONS_ANNOUNCE, DEFINITIONS_UPDATE, TRUNCATE, SCHEMA_CHECK, -INDEX_SCAN, // Deprecated +@Deprecated INDEX_SCAN, REPLICATION_FINISHED, INTERNAL_RESPONSE, // responses to internal calls COUNTER_MUTATION,
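The Verb change above swaps `// Deprecated` comments for the `@Deprecated` annotation on enum constants, which the compiler and tools can actually see (deprecation warnings at use sites, visibility via reflection). A small standalone sketch; the three-constant `Verb` here is an illustrative subset, not the real enum:

```java
// Sketch: @Deprecated on an enum constant is machine-checkable,
// unlike a trailing comment.
public class VerbDemo
{
    public enum Verb
    {
        MUTATION,
        @Deprecated BINARY, // compiler now warns wherever BINARY is referenced
        READ
    }

    public static void main(String[] args)
    {
        boolean deprecated;
        try
        {
            // Enum constants are public static final fields, so the
            // annotation can be read back at runtime.
            deprecated = Verb.class.getField("BINARY").isAnnotationPresent(Deprecated.class);
        }
        catch (NoSuchFieldException e)
        {
            throw new AssertionError(e);
        }
        System.out.println(deprecated); // prints true
    }
}
```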
[7/15] git commit: serializedSize implementations, part 1 (gossip and streaming packages) patch by jbellis; reviewed by yukim for CASSANDRA-3617
serializedSize implementations, part 1 (gossip and streaming packages) patch by jbellis; reviewed by yukim for CASSANDRA-3617 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5b9fc26c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5b9fc26c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5b9fc26c Branch: refs/heads/trunk Commit: 5b9fc26c51161837f01a9383aad8a2786445a4bd Parents: 9471e8d Author: Jonathan Ellis jbel...@apache.org Authored: Mon Mar 26 17:53:59 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue May 8 12:40:53 2012 -0500 -- .../org/apache/cassandra/dht/AbstractBounds.java | 32 --- .../org/apache/cassandra/gms/EndpointState.java| 13 - .../org/apache/cassandra/gms/GossipDigest.java |9 ++- .../org/apache/cassandra/gms/GossipDigestAck.java | 44 --- .../org/apache/cassandra/gms/GossipDigestAck2.java | 32 +-- .../org/apache/cassandra/gms/GossipDigestSyn.java | 38 +++- .../cassandra/gms/GossipShutdownMessage.java |2 +- .../org/apache/cassandra/gms/HeartBeatState.java |5 +- .../org/apache/cassandra/gms/VersionedValue.java | 21 --- .../apache/cassandra/service/MigrationManager.java |4 +- .../apache/cassandra/streaming/PendingFile.java| 18 +- .../apache/cassandra/streaming/StreamHeader.java | 14 - .../apache/cassandra/streaming/StreamReply.java|6 +- .../apache/cassandra/streaming/StreamRequest.java | 28 +++-- .../cassandra/streaming/StreamingRepairTask.java | 11 +++- .../org/apache/cassandra/utils/FBUtilities.java|7 ++ .../org/apache/cassandra/utils/MerkleTree.java | 20 +-- src/java/org/apache/cassandra/utils/UUIDGen.java |4 +- 18 files changed, 218 insertions(+), 90 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b9fc26c/src/java/org/apache/cassandra/dht/AbstractBounds.java -- diff --git a/src/java/org/apache/cassandra/dht/AbstractBounds.java b/src/java/org/apache/cassandra/dht/AbstractBounds.java index cdca2b2..44344cc 
100644 --- a/src/java/org/apache/cassandra/dht/AbstractBounds.java +++ b/src/java/org/apache/cassandra/dht/AbstractBounds.java @@ -23,6 +23,7 @@ import java.io.IOException; import java.io.Serializable; import java.util.*; +import org.apache.cassandra.db.DBTypeSizes; import org.apache.cassandra.db.RowPosition; import org.apache.cassandra.io.IVersionedSerializer; import org.apache.cassandra.net.MessagingService; @@ -118,12 +119,8 @@ public abstract class AbstractBounds<T extends RingPosition> implements Serializ * The first int tells us if it's a range or bounds (depending on the value) _and_ if it's tokens or keys (depending on the * sign). We use negative kind for keys so as to preserve the serialization of token from older version. */ -boolean isToken = range.left instanceof Token; -int kind = range instanceof Range ? Type.RANGE.ordinal() : Type.BOUNDS.ordinal(); -if (!isToken) -kind = -(kind+1); -out.writeInt(kind); -if (isToken) +out.writeInt(kindInt(range)); +if (range.left instanceof Token) { Token.serializer().serialize((Token)range.left, out); Token.serializer().serialize((Token)range.right, out); @@ -135,6 +132,14 @@ public abstract class AbstractBounds<T extends RingPosition> implements Serializ } } +private int kindInt(AbstractBounds<?> ab) +{ +int kind = ab instanceof Range ? Type.RANGE.ordinal() : Type.BOUNDS.ordinal(); +if (!(ab.left instanceof Token)) +kind = -(kind + 1); +return kind; +} + public AbstractBounds<?> deserialize(DataInput in, int version) throws IOException { int kind = in.readInt(); @@ -159,9 +164,20 @@ public abstract class AbstractBounds<T extends RingPosition> implements Serializ return new Bounds(left, right); } -public long serializedSize(AbstractBounds<?> abstractBounds, int version) +public long serializedSize(AbstractBounds<?>
ab, int version) { -throw new UnsupportedOperationException(); +int size = DBTypeSizes.NATIVE.sizeof(kindInt(ab)); +if (ab.left instanceof Token) +{ +size += Token.serializer().serializedSize((Token) ab.left, DBTypeSizes.NATIVE); +size += Token.serializer().serializedSize((Token) ab.right, DBTypeSizes.NATIVE); +} +else +{ +
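The `serializedSize()` implementations this patch adds must count exactly the bytes the matching `serialize()` writes, field by field, so the two stay in lockstep. A compilable sketch of that invariant, with hypothetical names (`Bounds`, `BoundsSerializerDemo` are not Cassandra's classes):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch: serializedSize() mirrors serialize() term by term
// (one int discriminator + two long endpoints).
public class BoundsSerializerDemo
{
    static class Bounds
    {
        final long left, right;
        Bounds(long l, long r) { left = l; right = r; }
    }

    static void serialize(Bounds b, DataOutput out) throws IOException
    {
        out.writeInt(0);        // "kind" discriminator, as in kindInt() above
        out.writeLong(b.left);
        out.writeLong(b.right);
    }

    static long serializedSize(Bounds b)
    {
        // 4 bytes for the int kind + 8 bytes per long endpoint
        return 4 + 8 + 8;
    }

    public static void main(String[] args) throws IOException
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        serialize(new Bounds(1L, 2L), new DataOutputStream(baos));
        // The declared size must match the bytes actually written.
        System.out.println(baos.size() == serializedSize(new Bounds(1L, 2L))); // prints true
    }
}
```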
[15/15] Introduce MessageOut class, which wraps an object to be sent in the payload field. The old Header class is inlined into the parameters map. patch by jbellis; reviewed by yukim for CASSANDR
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a6f0b85/test/unit/org/apache/cassandra/db/SerializationsTest.java -- diff --git a/test/unit/org/apache/cassandra/db/SerializationsTest.java b/test/unit/org/apache/cassandra/db/SerializationsTest.java index aea609e..a340a94 100644 --- a/test/unit/org/apache/cassandra/db/SerializationsTest.java +++ b/test/unit/org/apache/cassandra/db/SerializationsTest.java @@ -30,6 +30,7 @@ import org.apache.cassandra.dht.IPartitioner; import org.apache.cassandra.dht.Range; import org.apache.cassandra.dht.Token; import org.apache.cassandra.net.Message; +import org.apache.cassandra.net.MessageOut; import org.apache.cassandra.net.MessageSerializer; import org.apache.cassandra.net.MessagingService; import org.apache.cassandra.service.StorageService; @@ -68,22 +69,21 @@ public class SerializationsTest extends AbstractSerializationsTester IPartitioner part = StorageService.getPartitioner(); AbstractBounds<RowPosition> bounds = new Range<Token>(part.getRandomToken(), part.getRandomToken()).toRowBounds(); -Message namesCmd = new RangeSliceCommand(Statics.KS, "Standard1", null, namesPred, bounds, 100).getMessage(MessagingService.current_version); -Message emptyRangeCmd = new RangeSliceCommand(Statics.KS, "Standard1", null, emptyRangePred, bounds, 100).getMessage(MessagingService.current_version); -Message regRangeCmd = new RangeSliceCommand(Statics.KS, "Standard1", null, nonEmptyRangePred, bounds, 100).getMessage(MessagingService.current_version); -Message namesCmdSup = new RangeSliceCommand(Statics.KS, "Super1", Statics.SC, namesPred, bounds, 100).getMessage(MessagingService.current_version); -Message emptyRangeCmdSup = new RangeSliceCommand(Statics.KS, "Super1", Statics.SC, emptyRangePred, bounds, 100).getMessage(MessagingService.current_version); -Message regRangeCmdSup = new RangeSliceCommand(Statics.KS, "Super1", Statics.SC, nonEmptyRangePred, bounds, 100).getMessage(MessagingService.current_version); - -DataOutputStream dout =
getOutput(db.RangeSliceCommand.bin); - -messageSerializer.serialize(namesCmd, dout, getVersion()); -messageSerializer.serialize(emptyRangeCmd, dout, getVersion()); -messageSerializer.serialize(regRangeCmd, dout, getVersion()); -messageSerializer.serialize(namesCmdSup, dout, getVersion()); -messageSerializer.serialize(emptyRangeCmdSup, dout, getVersion()); -messageSerializer.serialize(regRangeCmdSup, dout, getVersion()); -dout.close(); +MessageOutRangeSliceCommand namesCmd = new RangeSliceCommand(Statics.KS, Standard1, null, namesPred, bounds, 100).createMessage(); +MessageOutRangeSliceCommand emptyRangeCmd = new RangeSliceCommand(Statics.KS, Standard1, null, emptyRangePred, bounds, 100).createMessage(); +MessageOutRangeSliceCommand regRangeCmd = new RangeSliceCommand(Statics.KS, Standard1, null, nonEmptyRangePred, bounds, 100).createMessage(); +MessageOutRangeSliceCommand namesCmdSup = new RangeSliceCommand(Statics.KS, Super1, Statics.SC, namesPred, bounds, 100).createMessage(); +MessageOutRangeSliceCommand emptyRangeCmdSup = new RangeSliceCommand(Statics.KS, Super1, Statics.SC, emptyRangePred, bounds, 100).createMessage(); +MessageOutRangeSliceCommand regRangeCmdSup = new RangeSliceCommand(Statics.KS, Super1, Statics.SC, nonEmptyRangePred, bounds, 100).createMessage(); + +DataOutputStream out = getOutput(db.RangeSliceCommand.bin); +namesCmd.serialize(out, getVersion()); +emptyRangeCmd.serialize(out, getVersion()); +regRangeCmd.serialize(out, getVersion()); +namesCmdSup.serialize(out, getVersion()); +emptyRangeCmdSup.serialize(out, getVersion()); +regRangeCmdSup.serialize(out, getVersion()); +out.close(); } @Test @@ -111,8 +111,8 @@ public class SerializationsTest extends AbstractSerializationsTester SliceByNamesReadCommand.serializer().serialize(superCmd, out, getVersion()); ReadCommand.serializer().serialize(standardCmd, out, getVersion()); ReadCommand.serializer().serialize(superCmd, out, getVersion()); 
-messageSerializer.serialize(standardCmd.getMessage(getVersion()), out, getVersion()); -messageSerializer.serialize(superCmd.getMessage(getVersion()), out, getVersion()); +standardCmd.createMessage().serialize(out, getVersion()); +superCmd.createMessage().serialize(out, getVersion()); out.close(); } @@ -141,8 +141,8 @@ public class SerializationsTest extends AbstractSerializationsTester SliceFromReadCommand.serializer().serialize(superCmd, out, getVersion()); ReadCommand.serializer().serialize(standardCmd, out, getVersion()); ReadCommand.serializer().serialize(superCmd, out, getVersion()); -
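The migration in the diff above replaces a shared messageSerializer with a MessageOut wrapper that carries the payload together with its serializer, so callers invoke message.serialize(out, version) directly. A minimal sketch of that pattern, assuming simplified stand-in names (SketchMessageOut, VersionedSerializer) rather than Cassandra's real classes:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical simplified serializer interface; Cassandra's real
// IVersionedSerializer has more methods.
interface VersionedSerializer<T> {
    void serialize(T t, DataOutputStream out, int version) throws IOException;
}

// Sketch of the MessageOut idea: bundle payload + serializer so the
// call site no longer needs a shared message serializer.
class SketchMessageOut<T> {
    final T payload;
    final VersionedSerializer<T> serializer;

    SketchMessageOut(T payload, VersionedSerializer<T> serializer) {
        this.payload = payload;
        this.serializer = serializer;
    }

    void serialize(DataOutputStream out, int version) throws IOException {
        serializer.serialize(payload, out, version);
    }
}

public class MessageOutSketch {
    public static byte[] roundTrip() throws IOException {
        // A trivial payload with a trivial serializer.
        VersionedSerializer<String> ser = (s, out, v) -> out.writeUTF(s);
        SketchMessageOut<String> msg = new SketchMessageOut<>("ping", ser);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            msg.serialize(out, 1);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // writeUTF emits a 2-byte length prefix plus the 4 bytes of "ping".
        if (roundTrip().length != 6) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The benefit visible in the test diff is that six calls threading a serializer, a message, and a stream collapse to `cmd.serialize(out, version)`.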
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270824#comment-13270824 ] Aaron Morton commented on CASSANDRA-4223:

What if a node receives streaming requests from two other nodes? There would be a (small) chance of the nodes generating the same session ID. Adding the gossip generation adds a little entropy to the IDs. Happy to go with the simpler counter idea if you think this is a non-problem.

Non Unique Streaming session ID's
---------------------------------
                 Key: CASSANDRA-4223
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Ubuntu 10.04.2 LTS
                      java version "1.6.0_24"
                      Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
                      Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
                      Bare metal servers from https://www.stormondemand.com/servers/baremetal.html
                      The servers run on a custom hypervisor.
            Reporter: Aaron Morton
            Assignee: Aaron Morton
              Labels: datastax_qa
             Fix For: 1.0.11, 1.1.1
         Attachments: NanoTest.java, fmm streaming bug.txt

I have observed repair processes failing due to duplicate streaming session IDs. In this installation it is preventing rebalance from completing; I believe it has also prevented repair from completing in the past.

The attached streaming-logs.txt file contains log messages and an explanation of what was happening during a repair operation. It contains the evidence for the duplicate session IDs. The duplicate session IDs were generated on the repairing node and sent to the streaming node. The streaming source replaced the first session with the second, which resulted in both sessions failing when the first FILE_COMPLETE message was received.
The errors were:

{code:java}
DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null
	at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
	at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{code}

and

{code:java}
DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:2,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db but is null
	at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
	at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{code}

I think this is because System.nanoTime() is used for the session ID when creating the StreamInSession objects (driven from StorageService.requestRanges()). From the documentation (http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime()):

{quote}
This method provides nanosecond precision, but not necessarily nanosecond accuracy. No guarantees are made about how frequently values change.
{quote}

There is also some info on clocks and timers at https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks

The hypervisor may be at fault here, but it seems we cannot rely on successive calls to nanoTime() returning different values. To avoid message/interface changes on the StreamHeader it would be good to keep the session ID a long. The simplest approach may be to make successive calls to nanoTime until the result changes; we could fail if a certain number of milliseconds have passed. Hashing the file names and ranges is also a possibility, but more involved.
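The two mitigations floated in the report can both be sketched briefly. This is illustrative only (not the committed fix): idea 1 spins on System.nanoTime() until the value advances, idea 2 uses an AtomicLong seeded from the clock, which is unique within one JVM even if the clock appears frozen.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SessionIdSketch {
    // Idea 1: retry nanoTime until the returned value changes.
    public static long nextNanoId() {
        long first = System.nanoTime();
        long id = first;
        // Bounded spin so we fail loudly on a hypothetical frozen clock
        // instead of looping forever.
        for (int i = 0; i < 1_000_000 && id == first; i++)
            id = System.nanoTime();
        if (id == first)
            throw new IllegalStateException("nanoTime not advancing");
        return id;
    }

    // Idea 2: monotonic counter seeded from the clock; unique per JVM
    // regardless of timer resolution or hypervisor behavior.
    private static final AtomicLong counter = new AtomicLong(System.nanoTime());

    public static long nextCounterId() {
        return counter.incrementAndGet();
    }

    public static void main(String[] args) {
        long a = nextCounterId(), b = nextCounterId();
        if (a == b) throw new AssertionError("ids must differ");
        System.out.println("ok");
    }
}
```

The counter variant sidesteps the precision-vs-accuracy problem entirely, which is why (per the follow-up comments) per-node uniqueness plus the sender's address is enough — the receiver tracks sessions by (InetAddress, id) pairs.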
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270838#comment-13270838 ] Jonathan Ellis commented on CASSANDRA-4223:

It doesn't need to be unique-per-cluster because StreamInSession tracks it as a Pair<InetAddress, Long>.
[jira] [Commented] (CASSANDRA-2598) incremental_backups and snapshot_before_compaction duplicate hard links
[ https://issues.apache.org/jira/browse/CASSANDRA-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270842#comment-13270842 ] Jonathan Ellis commented on CASSANDRA-2598:

Please create a new issue with steps to reproduce. Thanks!

incremental_backups and snapshot_before_compaction duplicate hard links
-----------------------------------------------------------------------
                 Key: CASSANDRA-2598
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2598
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.8 beta 1
         Environment: linux jna
            Reporter: Mck SembWever
            Assignee: Jonathan Ellis
            Priority: Minor
             Fix For: 0.8.0
         Attachments: 2598.txt

See discussion @ http://thread.gmane.org/gmane.comp.db.cassandra.user/15933/

Enabling both incremental_backups and snapshot_before_compaction leads to the same hard links trying to be created. This gives stack traces like:

java.io.IOError: java.io.IOException: Unable to create hard link from /cassandra-data/keyspace/cf-f-3875-Data.db to /cassandra-data/keyspace/snapshots/compact-cf/cf-f-3875-Data.db (errno 17)
	at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1629)
	at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1654)
	at org.apache.cassandra.db.Table.snapshot(Table.java:198)
	at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:504)
	at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:146)
	at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:112)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Unable to create hard link from /cassandra-data/keyspace/cf-f-3875-Data.db to /cassandra-data/keyspace/snapshots/compact-cf/cf-f-3875-Data.db (errno 17)
	at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:155)
	at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:713)
	at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1622)
	... 10 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270843#comment-13270843 ] Yuki Morishita commented on CASSANDRA-4223:

I see a possible collision here, since StreamOutSession is identified by destination host + timestamp (for now), and StreamInSession in the destination node uses that ID when it is received from the remote. Would adding one more key (source IP?) make the ID unique?
git commit: add TypeSizes.sizeof(String)
Updated Branches: refs/heads/trunk 2ae527218 - 70554b2a3 add TypeSizes.sizeof(String) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/70554b2a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/70554b2a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/70554b2a Branch: refs/heads/trunk Commit: 70554b2a31706eb272eab0245c8ef25cbfdf6bf5 Parents: 2ae5272 Author: Jonathan Ellis jbel...@apache.org Authored: Tue May 8 16:41:00 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue May 8 16:41:00 2012 -0500 -- .../org/apache/cassandra/db/CounterMutation.java |4 +-- .../org/apache/cassandra/db/RangeSliceCommand.java | 26 ++ src/java/org/apache/cassandra/db/RowMutation.java | 11 +++--- .../cassandra/db/SliceByNamesReadCommand.java | 15 .../apache/cassandra/db/SliceFromReadCommand.java | 19 +- .../org/apache/cassandra/db/SnapshotCommand.java |8 ++--- .../org/apache/cassandra/db/TruncateResponse.java |6 +-- src/java/org/apache/cassandra/db/Truncation.java |5 +-- src/java/org/apache/cassandra/db/TypeSizes.java| 27 +++ .../org/apache/cassandra/db/WriteResponse.java |9 ++--- .../org/apache/cassandra/db/filter/QueryPath.java |8 + .../org/apache/cassandra/dht/BootStrapper.java |5 +-- .../org/apache/cassandra/gms/GossipDigestSyn.java |3 +- .../org/apache/cassandra/gms/VersionedValue.java |3 +- .../cassandra/service/AntiEntropyService.java | 12 ++ .../apache/cassandra/streaming/PendingFile.java|7 ++-- .../apache/cassandra/streaming/StreamHeader.java |2 +- .../apache/cassandra/streaming/StreamReply.java|3 +- .../apache/cassandra/streaming/StreamRequest.java |5 +-- .../cassandra/streaming/StreamingRepairTask.java |4 +- .../org/apache/cassandra/utils/FBUtilities.java| 23 21 files changed, 89 insertions(+), 116 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/70554b2a/src/java/org/apache/cassandra/db/CounterMutation.java -- diff --git 
a/src/java/org/apache/cassandra/db/CounterMutation.java b/src/java/org/apache/cassandra/db/CounterMutation.java
index 9a1c117..c3256cc 100644
--- a/src/java/org/apache/cassandra/db/CounterMutation.java
+++ b/src/java/org/apache/cassandra/db/CounterMutation.java
@@ -34,7 +34,6 @@
 import org.apache.cassandra.net.MessageOut;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.thrift.ConsistencyLevel;
 import org.apache.cassandra.utils.ByteBufferUtil;
-import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.HeapAllocator;

 public class CounterMutation implements IMutation
@@ -182,8 +181,7 @@ class CounterMutationSerializer implements IVersionedSerializer<CounterMutation>
     public long serializedSize(CounterMutation cm, int version)
     {
-        int tableSize = FBUtilities.encodedUTF8Length(cm.consistency().name());
         return RowMutation.serializer.serializedSize(cm.rowMutation(), version)
-               + TypeSizes.NATIVE.sizeof((short) tableSize) + tableSize;
+               + TypeSizes.NATIVE.sizeof(cm.consistency().name());
     }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/70554b2a/src/java/org/apache/cassandra/db/RangeSliceCommand.java
--
diff --git a/src/java/org/apache/cassandra/db/RangeSliceCommand.java b/src/java/org/apache/cassandra/db/RangeSliceCommand.java
index 2ad4b5d..8516e06 100644
--- a/src/java/org/apache/cassandra/db/RangeSliceCommand.java
+++ b/src/java/org/apache/cassandra/db/RangeSliceCommand.java
@@ -217,14 +217,12 @@ class RangeSliceCommandSerializer implements IVersionedSerializer<RangeSliceCommand>
         return new RangeSliceCommand(keyspace, columnFamily, superColumn, pred, range, rowFilter, maxResults, maxIsColumns, isPaging);
     }

-    public long serializedSize(RangeSliceCommand rangeSliceCommand, int version)
+    public long serializedSize(RangeSliceCommand rsc, int version)
     {
-        int ksLength = FBUtilities.encodedUTF8Length(rangeSliceCommand.keyspace);
-        long size = TypeSizes.NATIVE.sizeof(ksLength) + ksLength;
-        int cfLength = FBUtilities.encodedUTF8Length(rangeSliceCommand.column_family);
-        size += TypeSizes.NATIVE.sizeof(cfLength) + cfLength;
+        long size = TypeSizes.NATIVE.sizeof(rsc.keyspace);
+        size += TypeSizes.NATIVE.sizeof(rsc.column_family);

-        ByteBuffer sc = rangeSliceCommand.super_column;
+        ByteBuffer sc = rsc.super_column;
         if (sc != null) { size +=
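What the new sizeof(String) helper must compute is visible in the old call sites it replaces: the on-wire size of a string written with DataOutputStream.writeUTF is a 2-byte unsigned-short length prefix plus the modified-UTF-8 encoding. A sketch that assumes ASCII input (where standard UTF-8 length equals the modified-UTF-8 length) and is not Cassandra's actual TypeSizes code, cross-checked against what writeUTF really emits:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class SizeofStringSketch {
    // Serialized size of an ASCII string under the writeUTF wire format:
    // 2-byte length prefix + one byte per character.
    static long sizeofAscii(String s) {
        int encoded = s.getBytes(StandardCharsets.UTF_8).length;
        return 2 + encoded;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF("Keyspace1");
        }
        // Verify the computed size matches writeUTF's actual output:
        // 9 ASCII chars + 2-byte prefix = 11 bytes.
        if (bytes.size() != sizeofAscii("Keyspace1"))
            throw new AssertionError(bytes.size() + " != " + sizeofAscii("Keyspace1"));
        System.out.println("ok");
    }
}
```

This is why the diff can collapse the two-step `encodedUTF8Length` + `sizeof(length) + length` pattern into a single `TypeSizes.NATIVE.sizeof(string)` call.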
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270862#comment-13270862 ] Jonathan Ellis commented on CASSANDRA-4139:

Can you rebase? Finally got CASSANDRA-3617 committed, which conflicts. (Probably useful: I added TypeSizes.sizeof(String) to supplement the raw encodedUTF8Length.)

Add varint encoding to Messaging service
----------------------------------------
                 Key: CASSANDRA-4139
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4139
             Project: Cassandra
          Issue Type: Sub-task
          Components: Core
            Reporter: Vijay
            Assignee: Vijay
             Fix For: 1.2
         Attachments: 0001-CASSANDRA-4139-v1.patch, 0002-add-bytes-written-metric.patch, 4139-Test.rtf
[1/5] git commit: Merge branch 'cassandra-1.1' into trunk
Updated Branches: refs/heads/cassandra-1.1 8b81c8f2f - 641346b0d refs/heads/trunk 70554b2a3 - 4357676f3 Merge branch 'cassandra-1.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4357676f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4357676f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4357676f Branch: refs/heads/trunk Commit: 4357676f30b30291be9fb6f8d0de79cca767efbb Parents: 70554b2 641346b Author: Jonathan Ellis jbel...@apache.org Authored: Tue May 8 16:54:11 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue May 8 16:54:11 2012 -0500 -- CHANGES.txt|1 + src/java/org/apache/cassandra/cql3/Cql.g |7 + src/java/org/apache/cassandra/cql3/Relation.java | 13 +- src/java/org/apache/cassandra/cql3/Term.java | 70 --- .../cassandra/cql3/statements/SelectStatement.java | 157 +-- .../org/apache/cassandra/db/marshal/DateType.java |1 - src/java/org/apache/cassandra/tools/NodeCmd.java | 23 ++- 7 files changed, 174 insertions(+), 98 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/src/java/org/apache/cassandra/cql3/Relation.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/src/java/org/apache/cassandra/cql3/Term.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/src/java/org/apache/cassandra/db/marshal/DateType.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4357676f/src/java/org/apache/cassandra/tools/NodeCmd.java --
[2/5] git commit: more user-friendly error messages for unknown host and connection failures patch by Noa Resare; reviewed by jbellis for CASSANDRA-4224
more user-friendly error messages for unknown host and connection failures
patch by Noa Resare; reviewed by jbellis for CASSANDRA-4224

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/641346b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/641346b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/641346b0
Branch: refs/heads/cassandra-1.1
Commit: 641346b0dc85a723f1bd755007c334b310c71e67
Parents: 8b81c8f
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Tue May 8 16:53:49 2012 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Tue May 8 16:54:01 2012 -0500
--
 src/java/org/apache/cassandra/tools/NodeCmd.java | 23 -
 1 files changed, 22 insertions(+), 1 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/641346b0/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 07f560d..e33f698 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -24,6 +24,7 @@ package org.apache.cassandra.tools;
 import java.io.IOException;
 import java.io.PrintStream;
 import java.lang.management.MemoryUsage;
+import java.net.ConnectException;
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 import java.text.DecimalFormat;
@@ -660,7 +661,21 @@ public class NodeCmd
         }
         catch (IOException ioe)
         {
-            err(ioe, "Error connection to remote JMX agent!");
+            Throwable inner = findInnermostThrowable(ioe);
+            if (inner instanceof ConnectException)
+            {
+                System.err.printf("Failed to connect to '%s:%d': %s\n", host, port, inner.getMessage());
+                System.exit(1);
+            }
+            else if (inner instanceof UnknownHostException)
+            {
+                System.err.printf("Cannot resolve '%s': unknown host\n", host);
+                System.exit(1);
+            }
+            else
+            {
+                err(ioe, "Error connecting to remote JMX agent!");
+            }
         }
         try
         {
@@ -856,6 +871,12 @@ public class NodeCmd
         System.exit(0);
     }

+    private static Throwable findInnermostThrowable(Throwable ex)
+    {
+        Throwable inner = ex.getCause();
+        return inner == null ? ex : findInnermostThrowable(inner);
+    }
+
     private void printDescribeRing(String keyspaceName, PrintStream out)
     {
         out.println("Schema Version: " + probe.getSchemaVersion());
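The findInnermostThrowable helper this patch adds walks the cause chain to its root, so nodetool can recognize a ConnectException or UnknownHostException buried inside the IOException that JMX surfaces. A self-contained illustration using the patch's own helper (the wrapped-exception chain here is a fabricated example, not a real JMX failure):

```java
import java.io.IOException;
import java.net.ConnectException;

public class InnermostSketch {
    // Same logic as the helper added in this patch: recurse down getCause()
    // until there is no further cause, then return that root throwable.
    static Throwable findInnermostThrowable(Throwable ex) {
        Throwable inner = ex.getCause();
        return inner == null ? ex : findInnermostThrowable(inner);
    }

    public static void main(String[] args) {
        // Simulate the nesting a failed JMX connect produces: an IOException
        // wrapping an intermediate layer wrapping the real ConnectException.
        Throwable wrapped = new IOException("jmx failure",
                new RuntimeException("rmi layer",
                        new ConnectException("Connection refused")));

        Throwable inner = findInnermostThrowable(wrapped);
        if (!(inner instanceof ConnectException)) throw new AssertionError();
        System.out.println("ok: " + inner.getMessage());
    }
}
```

One caveat worth noting: the recursion assumes the cause chain is acyclic, which holds for exceptions produced by the JDK's networking and JMX code.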
[3/5] git commit: more user-friendly error messages for unknown host and connection failures patch by Noa Resare; reviewed by jbellis for CASSANDRA-4224
more user-friendly error messages for unknown host and connection failures
patch by Noa Resare; reviewed by jbellis for CASSANDRA-4224

Branch: refs/heads/trunk
Commit: 641346b0dc85a723f1bd755007c334b310c71e67
[5/5] git commit: Fix RoundTripTest
Fix RoundTripTest

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1ab4ec17
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1ab4ec17
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1ab4ec17
Branch: refs/heads/trunk
Commit: 1ab4ec1748617e7d2bd0f7d0037fecba0226e7b2
Parents: 4e2e547
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Sat May 5 12:43:26 2012 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Sat May 5 12:44:04 2012 +0200
--
 .../org/apache/cassandra/db/marshal/DateType.java | 1 -
 1 files changed, 0 insertions(+), 1 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/1ab4ec17/src/java/org/apache/cassandra/db/marshal/DateType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/DateType.java b/src/java/org/apache/cassandra/db/marshal/DateType.java
index 671d31c..09f0ecd 100644
--- a/src/java/org/apache/cassandra/db/marshal/DateType.java
+++ b/src/java/org/apache/cassandra/db/marshal/DateType.java
@@ -88,7 +88,6 @@ public class DateType extends AbstractType<Date>
     public static long dateStringToTimestamp(String source) throws MarshalException
     {
         long millis;
-        source = source.toLowerCase();
         if (source.toLowerCase().equals("now"))
         {
[jira] [Commented] (CASSANDRA-2864) Alternative Row Cache Implementation
[ https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270873#comment-13270873 ]

Jonathan Ellis commented on CASSANDRA-2864:
-------------------------------------------

Okay, so this is complementary wrt CASSANDRA-1956 -- 1956 addresses caching different kinds of queries, and this is strictly about not throwing away a [serialized] cached row in the face of updates.

+1 from me in theory. Can you rebase to 1.2? Will need Sylvain's input on counters, though.

Alternative Row Cache Implementation
------------------------------------

                Key: CASSANDRA-2864
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2864
            Project: Cassandra
         Issue Type: Improvement
         Components: Core
           Reporter: Daniel Doubleday
           Assignee: Daniel Doubleday
           Priority: Minor
             Labels: cache
            Fix For: 1.2

We have been working on an alternative implementation to the existing row cache(s). We have 2 main goals:

- Decrease memory -- get more rows in the cache without suffering a huge performance penalty
- Reduce GC pressure

This sounds a lot like we should be using the new serializing cache in 0.8. Unfortunately our workload consists of loads of updates, which would invalidate the cache all the time.

*Note: Updated Patch Description (Please check history if you're interested where this was coming from)*

h3. Rough Idea

- Keep a serialized row (ByteBuffer) in mem which represents unfiltered but collated columns of all ssts but not memtable columns
- Writes don't affect the cache at all. They go only to the memtables
- Reads collect columns from memtables and row cache
- The serialized row is re-written (merged) with memtables when flushed

h3. Some Implementation Details

h4. Reads

- Basically the read logic differs from regular uncached reads only in a special CollationController which deserializes columns from the in-memory bytes
- In the first version of this cache the serialized in-memory format was the same as the fs format, but tests showed that performance suffered because a lot of unnecessary deserialization takes place, and column seeks are O( n ) within one block
- To improve on that a different in-memory format was used. It splits length meta info and data of columns so that the names can be binary searched.

{noformat}
=== Header (24) ===
MaxTimestamp:        long
LocalDeletionTime:   int
MarkedForDeleteAt:   long
NumColumns:          int
=== Column Index (num cols * 12) ===
NameOffset:          int
ValueOffset:         int
ValueLength:         int
=== Column Data ===
Name:                byte[]
Value:               byte[]
SerializationFlags:  byte
Misc:                ?
Timestamp:           long
--- Misc Counter Column ---
TSOfLastDelete:      long
--- Misc Expiring Column ---
TimeToLive:          int
LocalDeletionTime:   int
===
{noformat}

- These rows are read by 2 new column iterators which correspond to SSTableNamesIterator and SSTableSliceIterator. During filtering only columns that actually match are constructed. The searching / skipping is performed on the raw ByteBuffer and does not create any objects.
- A special CollationController is used to access and collate via cache and said new iterators. It also supports skipping the cached row by max update timestamp.

h4. Writes

- Writes don't update or invalidate the cache.
- In CFS.replaceFlushed memtables are merged before the data view is switched. I fear that this is killing counters because they would be overcounted, but my understanding of counters is somewhere between weak and non-existing. I guess that counters, if one wants to support them here, would need an additional unique local identifier in memory and in the serialized cache to be able to filter duplicates or something like that.

{noformat}
void replaceFlushed(Memtable memtable, SSTableReader sstable)
{
    if (sstCache.getCapacity() > 0)
    {
        mergeSSTCache(memtable);
    }
    data.replaceFlushed(memtable, sstable);
    CompactionManager.instance.submitBackground(this);
}
{noformat}

Test Results: See comments below

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
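The binary-searchable column index described above can be illustrated with a small self-contained model. This is our own simplified sketch (a flat `(offset, length)` index over a raw ByteBuffer), not the patch's actual format, which additionally carries value offsets and per-column metadata:

```java
import java.nio.ByteBuffer;

// Illustrative sketch (ours, not the patch) of the key idea: keep a
// fixed-width index of (nameOffset, nameLength) pairs alongside the raw
// column bytes, so names can be binary searched directly on the ByteBuffer
// without deserializing any column objects.
public class ColumnIndexSearch
{
    // index holds (offset, length) int pairs for the column names, which are
    // stored in sorted byte order inside `data`. Returns the slot or -1.
    static int findColumn(ByteBuffer data, int[] index, byte[] probe)
    {
        int lo = 0, hi = index.length / 2 - 1;
        while (lo <= hi)
        {
            int mid = (lo + hi) >>> 1;
            int cmp = compare(data, index[2 * mid], index[2 * mid + 1], probe);
            if (cmp == 0) return mid;
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    // Unsigned lexicographic compare of a stored name against the probe,
    // done in place on the buffer: no allocation per comparison.
    static int compare(ByteBuffer data, int off, int len, byte[] probe)
    {
        int n = Math.min(len, probe.length);
        for (int i = 0; i < n; i++)
        {
            int c = (data.get(off + i) & 0xff) - (probe[i] & 0xff);
            if (c != 0) return c;
        }
        return len - probe.length;
    }
}
```

This is what buys the O(log n) name seek the description contrasts with the O(n) scan of the on-disk block format.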
[jira] [Updated] (CASSANDRA-2864) Alternative Row Cache Implementation
[ https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2864:
--------------------------------------
     Reviewer: slebresne
     Priority: Major  (was: Minor)
Fix Version/s: 1.2
       Labels: cache  (was: )

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270886#comment-13270886 ]

Jonathan Ellis edited comment on CASSANDRA-4223 at 5/8/12 10:12 PM:
--------------------------------------------------------------------

Ugh, so the problem is that sometimes session IDs are generated by the target, and sometimes by the source? That's broken...

I think there's several possible solutions:

# always generate session IDs on the source, so our Pair really is unique [the way I thought it worked :)]
# try to make the 64bit session IDs unique enough across the cluster [aaron's timestamp + counter]
# guarantee the session IDs are unique-per-host, and make the session context (source, id-generated-by-ip, id) instead of just (source, id) [yuki's suggestion]
# just switch to a UUID

#4 is probably simplest, but only #2 and #3 will be backwards-compatible. Of those two I feel more confident about #3...

The only unique-id-in-64-bit schemas I know of require some coordination up front among participating nodes (e.g., http://engineering.twitter.com/2010/06/announcing-snowflake.html)

was (Author: jbellis):
Ugh, so the problem is that sometimes session IDs are generated by the target, and sometimes by the source? That's broken...

I think there's three possible solutions:

# always generate session IDs on the source, so our Pair really is unique [the way I thought it worked :)]
# try to make the 64bit session IDs unique enough across the cluster [aaron's timestamp + counter]
# guarantee the session IDs are unique-per-host, and make the session context (source, id-generated-by-ip, id) instead of just (source, id) [yuki's suggestion]
# just switch to a UUID

#4 is probably simplest, but only #2 and #3 will be backwards-compatible. Of those two I feel more confident about #3...

The only unique-id-in-64-bit schemas I know of require some coordination up front among participating nodes (e.g., http://engineering.twitter.com/2010/06/announcing-snowflake.html)

Non Unique Streaming session ID's
---------------------------------

                Key: CASSANDRA-4223
                URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
            Project: Cassandra
         Issue Type: Bug
         Components: Core
        Environment: Ubuntu 10.04.2 LTS
                     java version "1.6.0_24"
                     Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
                     Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
                     Bare metal servers from https://www.stormondemand.com/servers/baremetal.html
                     The servers run on a custom hypervisor.
           Reporter: Aaron Morton
           Assignee: Aaron Morton
             Labels: datastax_qa
            Fix For: 1.0.11, 1.1.1
        Attachments: NanoTest.java, fmm streaming bug.txt

I have observed repair processes failing due to duplicate Streaming session ID's. In this installation it is preventing rebalance from completing. I believe it has also prevented repair from completing in the past.

The attached streaming-logs.txt file contains log messages and an explanation of what was happening during a repair operation. It has the evidence for duplicate session ID's.

The duplicate session id's were generated on the repairing node and sent to the streaming node. The streaming source replaced the first session with the second, which resulted in both sessions failing when the first FILE_COMPLETE message was received.

The errors were:

{code:java}
DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null
    at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
    at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
{code}

and

{code:java}
DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270886#comment-13270886 ]

Jonathan Ellis commented on CASSANDRA-4223:
-------------------------------------------

Ugh, so the problem is that sometimes session IDs are generated by the target, and sometimes by the source? That's broken...

I think there's three possible solutions:

# always generate session IDs on the source, so our Pair really is unique [the way I thought it worked :)]
# try to make the 64bit session IDs unique enough across the cluster [aaron's timestamp + counter]
# guarantee the session IDs are unique-per-host, and make the session context (source, id-generated-by-ip, id) instead of just (source, id) [yuki's suggestion]
# just switch to a UUID

#4 is probably simplest, but only #2 and #3 will be backwards-compatible. Of those two I feel more confident about #3...

The only unique-id-in-64-bit schemas I know of require some coordination up front among participating nodes (e.g., http://engineering.twitter.com/2010/06/announcing-snowflake.html)
[jira] [Updated] (CASSANDRA-4150) Looks like Maximum amount of cache available in 1.1 is 2 GB
[ https://issues.apache.org/jira/browse/CASSANDRA-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vijay updated CASSANDRA-4150:
-----------------------------
    Attachment: 0002-Use-EntryWeigher-for-HeapCache.patch
                0001-CASSANDRA-4150.patch
                concurrentlinkedhashmap-lru-1.3.jar

CLHM (Ben) fixed the issue:
{quote}
Fixed in v1.3. I plan on releasing this tonight. Also introduced EntryWeigher<K, V> to allow key/value weighing. We fixed this oversight in Guava's CacheBuilder from the get-go. I believe Cassandra wanted entry weighers too, but it wasn't high priority (no bug filed). Please consider adopting it when you upgrade the library.
{quote}

Looks like Maximum amount of cache available in 1.1 is 2 GB
-----------------------------------------------------------

                Key: CASSANDRA-4150
                URL: https://issues.apache.org/jira/browse/CASSANDRA-4150
            Project: Cassandra
         Issue Type: Bug
  Affects Versions: 1.1.0
           Reporter: Vijay
           Assignee: Vijay
       Attachments: 0001-CASSANDRA-4150.patch, 0002-Use-EntryWeigher-for-HeapCache.patch, concurrentlinkedhashmap-lru-1.3.jar

The problem is that capacity is an Integer, which can hold at most 2 GB. I will post a fix to CLHM; in the meantime we might want to remove the maximumWeightedCapacity code path (at least for the Serializing cache) and implement it in our code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
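The 2 GB ceiling comes from tracking weighted capacity in an `int`. A toy LRU cache (entirely ours, not CLHM's or Cassandra's API) showing the underlying fix: a `long` weighted capacity plus a per-entry weigher:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch (ours) of a weighted LRU cache whose capacity is a long, so
// totals above Integer.MAX_VALUE bytes (2 GB) are representable. Evicts
// least-recently-used entries once the total weight exceeds capacity.
public class LongWeightedCache<K, V>
{
    interface Weigher<K, V> { long weightOf(K key, V value); }

    private final long capacity;   // long, not int: > 2 GB works
    private long weightedSize;
    private final Weigher<K, V> weigher;
    // access-order map: iteration starts at the least-recently-used entry
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>(16, 0.75f, true);

    LongWeightedCache(long capacity, Weigher<K, V> weigher)
    {
        this.capacity = capacity;
        this.weigher = weigher;
    }

    synchronized void put(K key, V value)
    {
        V old = map.put(key, value);
        if (old != null)
            weightedSize -= weigher.weightOf(key, old);
        weightedSize += weigher.weightOf(key, value);
        // evict LRU entries until we are back under the weight limit
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (weightedSize > capacity && it.hasNext())
        {
            Map.Entry<K, V> e = it.next();
            it.remove();
            weightedSize -= weigher.weightOf(e.getKey(), e.getValue());
        }
    }

    synchronized V get(K key) { return map.get(key); }
    synchronized long weightedSize() { return weightedSize; }
}
```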
[jira] [Commented] (CASSANDRA-4223) Non Unique Streaming session ID's
[ https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271010#comment-13271010 ]

Aaron Morton commented on CASSANDRA-4223:
-----------------------------------------

I'll take another look at how unique the session id needs to be.

Non Unique Streaming session ID's
---------------------------------

                Key: CASSANDRA-4223
                URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
            Project: Cassandra
         Issue Type: Bug
         Components: Core
        Environment: Ubuntu 10.04.2 LTS
                     java version "1.6.0_24"
                     Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
                     Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
                     Bare metal servers from https://www.stormondemand.com/servers/baremetal.html
                     The servers run on a custom hypervisor.
           Reporter: Aaron Morton
           Assignee: Aaron Morton
             Labels: datastax_qa
            Fix For: 1.0.11, 1.1.1
        Attachments: NanoTest.java, fmm streaming bug.txt

I have observed repair processes failing due to duplicate Streaming session ID's. In this installation it is preventing rebalance from completing. I believe it has also prevented repair from completing in the past.

The attached streaming-logs.txt file contains log messages and an explanation of what was happening during a repair operation. It has the evidence for duplicate session ID's.

The duplicate session id's were generated on the repairing node and sent to the streaming node. The streaming source replaced the first session with the second, which resulted in both sessions failing when the first FILE_COMPLETE message was received.
The errors were:

{code:java}
DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null
    at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
    at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
{code}

and

{code:java}
DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED)
ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:2,5,main]
java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db but is null
    at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
    at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
{code}

I think this is because System.nanoTime() is used for the session ID when creating the StreamInSession objects (driven from StorageService.requestRanges()). From the documentation (http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime())

{quote}
This method provides nanosecond precision, but not necessarily nanosecond accuracy. No guarantees are made about how frequently values change.
{quote}

Also some info here on clocks and timers: https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks

The hypervisor may be at fault here. But it seems like we cannot rely on successive calls to nanoTime() to return different values.

To avoid message/interface changes on the StreamHeader it would be good to keep the session ID a long. The simplest approach may be to make successive calls to nanoTime until the result changes. We could fail if a certain number of milliseconds have passed. Hashing the file names and ranges is also a possibility, but more involved.

(We may also want to drop latency times that are 0
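The "timestamp + counter" option discussed in this thread (and the "call nanoTime until the result changes" fallback) can both be collapsed into a single CAS loop. This generator is our own illustration, not the committed fix: per-host IDs stay a long and can never repeat, because a tie with the previous ID is broken by incrementing past it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch (ours) of a per-host unique 64-bit session id: nanoTime provides
// the timestamp component, and the CAS against the last issued id plays the
// role of the counter, so two calls in the same clock tick still differ.
public final class SessionIdGenerator
{
    private static final AtomicLong lastId = new AtomicLong();

    public static long nextId()
    {
        while (true)
        {
            long prev = lastId.get();
            // candidate: current nanoTime, bumped past the previous id on a tie
            // (covers clocks that don't advance between successive calls)
            long candidate = Math.max(System.nanoTime(), prev + 1);
            if (lastId.compareAndSet(prev, candidate))
                return candidate;
        }
    }
}
```

Note this only guarantees uniqueness *per host*; cluster-wide uniqueness still needs the (host, id) pair or a UUID, which is exactly the trade-off debated above.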
[jira] [Commented] (CASSANDRA-1991) CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may block, under flusher lock write lock
[ https://issues.apache.org/jira/browse/CASSANDRA-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271030#comment-13271030 ]

Jonathan Ellis commented on CASSANDRA-1991:
-------------------------------------------

Let me see if I can summarize the problem here:

We grab the writelock to (1) make sure only one thread flushes a given memtable and (2) make sure that we can mark the replay position in the sstable we flush accurately, that is, that we won't replay mutations on restart that are already in an sstable, which as Sylvain notes is critical for counters.

The mechanism behind (2) is submitting a getContext task to the commitlog, while our writelock blocks out any other writes from happening -- Table.apply grabs the readlock before attempting to submit a mutation to the commitlog. Basically, getContext within the writelock drains the commitlog queue temporarily.

The reason we need to do this is, Table.apply throws the mutation at the commitlog and then continues on to update memtables, indexes, etc. This decoupling is normally a Good Thing; otherwise (if you wait for the commitlog append before continuing), you serialize all your writer threads on the commitlog one. But it's a problem if the commitlog gets behind and then you have to wait for it to drain -- for all the mutations entered in your memtable prior to the flush, to actually get appended to the commitlog -- before continuing.

I can think of a few ways to ameliorate this:

# Change switchLock to a fair locking policy. Otherwise readers can continue to grab the lock while the flushing thread is trying to grab the write, letting the commitlog queue continue to grow. (We tried a fair policy early on in Cassandra and found the performance hit too great; maybe we've gained enough in other areas to make up for this.)
# Wait for the commitlog write to succeed, before continuing on with the rest of Table.apply, so there is never any large backlog to worry about. Basically, some form of CASSANDRA-3578.
# Don't block for the context result while holding the writeLock. We need the context when we actually write out the sstables, and when we discard/recycle completed segments post-flush, but the actual memtable switch doesn't care. I think we could just replace the context with a Future. So the flush thread will wait for everything it is writing to be appended to the commitlog, but that wouldn't need to block new commitlog activity, since it has a handle to the CL progress in the future we've created.

#3 is pretty non-invasive and should be performant. Anyone see any problems with that?

CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may block, under flusher lock write lock
---------------------------------------------------------------------------------------------------------------

                Key: CASSANDRA-1991
                URL: https://issues.apache.org/jira/browse/CASSANDRA-1991
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Peter Schuller
           Assignee: Peter Schuller
        Attachments: 1991-checkpointing-flush.txt, 1991-logchanges.txt, 1991-trunk-v2.txt, 1991-trunk.txt, 1991-v3.txt, 1991-v4.txt, 1991-v5.txt, 1991-v6.txt, 1991-v7.txt, 1991-v8.txt, 1991-v9.txt, trigger.py

While investigating CASSANDRA-1955 I realized I was seeing very poor latencies for reasons that had nothing to do with flush_writers, even when using periodic commit log mode (and flush writers set ridiculously high, 500). It turns out blocked writes were slow because Table.apply() was spending lots of time (I can easily trigger seconds on moderate work-load) trying to acquire a flusher lock read lock ("flush lock millis" log printout in the logging patch I'll attach).

That in turn is caused by CFS.maybeSwitchMemtable(), which acquires the flusher lock write lock. Bisecting further revealed that the offending line of code that blocked was:

    final CommitLogSegment.CommitLogContext ctx = writeCommitLog ? CommitLog.instance.getContext() : null;

Indeed, CommitLog.getContext() simply returns currentSegment().getContext(), but does so by submitting a callable on the service executor.
So independently of flush writers, this can block all (global, for all cf:s) writes very easily, and does. I'll attach a file that is an independent Python script that triggers it on my macos laptop (with an intel SSD, which is why I was particularly surprised) (it assumes CPython, out-of-the-box-or-almost Cassandra on localhost that isn't in a cluster, and it will drop/recreate a keyspace called '1955'). I'm also attaching, just FYI, the patch with log entries that I used while tracking it down. Finally, I'll attach a patch with a suggested solution of keeping track of the latest commit log with an AtomicReference (as an alternative to synchronizing
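Option #3 from the comment above can be sketched in miniature. Everything here is our illustration: the class, method names, and the plain `long` standing in for the commitlog context are invented; only the shape (submit under the lock, resolve the Future outside it) follows the proposal.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch (ours) of option #3: while holding the flusher write lock we
// only *submit* the get-context task and keep the Future; the flush thread
// resolves it after releasing the lock, so holding the lock never waits for
// the commitlog queue to drain.
public class DeferredContextDemo
{
    private final ExecutorService commitLogExecutor = Executors.newSingleThreadExecutor();
    private final AtomicLong appendedPosition = new AtomicLong();

    // Simulated commitlog append (normally queued by Table.apply).
    Future<Long> append()
    {
        return commitLogExecutor.submit(appendedPosition::incrementAndGet);
    }

    // Called under the write lock: a cheap submit, no blocking get().
    // Because the executor is single-threaded, this task runs only after
    // every append already queued -- the ordering the replay position needs.
    Future<Long> submitGetContext()
    {
        return commitLogExecutor.submit(appendedPosition::get);
    }

    // Called by the flush thread once the lock is released.
    long resolveContext(Future<Long> ctx)
    {
        try { return ctx.get(); }
        catch (Exception e) { throw new RuntimeException(e); }
    }

    void shutdown() { commitLogExecutor.shutdown(); }
}
```

The single-threaded executor is what preserves correctness: the Future's value still reflects every mutation queued before the memtable switch, which is the property the original blocking getContext() was providing.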
[Cassandra Wiki] Update of ClientOptions by JonathanEllis
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ClientOptions page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/ClientOptions?action=diff&rev1=154&rev2=155

Comment:
add erlang client

   * Haskell
    * cassy: https://github.com/ozataman/cassy
    * HackageDB Page: http://hackage.haskell.org/package/cassy
+  * Erlang
+   * erlcassa: https://github.com/ostinelli/erlcassa
  
  == Older clients ==
[jira] [Created] (CASSANDRA-4226) make sure MessageIn skips the entire payload size
Dave Brosius created CASSANDRA-4226:
---------------------------------------

             Summary: make sure MessageIn skips the entire payload size
                 Key: CASSANDRA-4226
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4226
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Dave Brosius
            Priority: Trivial
             Fix For: 1.2
         Attachments: message_skip_fully.diff

DataInput.skipBytes isn't guaranteed to actually skip the number of bytes requested; use FileUtils.skipBytesFully instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
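The issue rests on a real contract: `DataInput.skipBytes(n)` may skip fewer than n bytes and simply returns the count it actually skipped. A skip-fully helper therefore loops; this is our own sketch of that pattern, not the attached patch:

```java
import java.io.DataInput;
import java.io.EOFException;
import java.io.IOException;

// Sketch (ours) of the fix's idea: DataInput.skipBytes may skip fewer bytes
// than requested, so keep calling it until the full payload is consumed,
// and fail loudly if the stream ends early instead of silently under-skipping.
public class SkipFully
{
    public static void skipBytesFully(DataInput in, int bytes) throws IOException
    {
        int n = 0;
        while (n < bytes)
        {
            int skipped = in.skipBytes(bytes - n);
            if (skipped == 0)
                throw new EOFException("EOF after " + n + " bytes out of " + bytes);
            n += skipped;
        }
    }
}
```

Without the loop, a short skip would leave the stream positioned inside the payload, and every subsequent message read would be misaligned.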
[jira] [Created] (CASSANDRA-4227) StorageProxy throws NPEs when there's no hostid for a target
Dave Brosius created CASSANDRA-4227:
---------------------------------------

             Summary: StorageProxy throws NPEs when there's no hostid for a target
                 Key: CASSANDRA-4227
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4227
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Dave Brosius
            Priority: Trivial

On trunk... if there is no host id due to an old node, an info log is generated, but the code continues to use the null host id, causing NPEs in decompose... Should this bypass this code, or perhaps can the plain ip address be used in this case? don't know.

As follows...

{code:java}
UUID hostId = StorageService.instance.getTokenMetadata().getHostId(target);
if ((hostId == null) && (Gossiper.instance.getVersion(target) < MessagingService.VERSION_12))
    logger.info("Unable to store hint for host with missing ID, {} (old node?)", target.toString());
RowMutation hintedMutation = RowMutation.hintFor(mutation, ByteBuffer.wrap(UUIDGen.decompose(hostId)));
hintedMutation.apply();
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
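The bug is that the code logs the missing-ID case but then falls through into `UUIDGen.decompose(hostId)` with a null anyway. A sketch (entirely ours, with an invented helper name; the 16-byte big-endian layout matches what decomposing a UUID produces) of the early-return guard the report implies:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Sketch (ours) of the suggested guard: return early when no host ID is
// known, instead of passing null into the UUID-serialization call.
public class HintGuard
{
    // Returns the serialized host id, or null when the hint must be skipped.
    static ByteBuffer hostIdTokenFor(UUID hostId)
    {
        if (hostId == null)
            return null; // caller logs "missing ID (old node?)" and skips the hint
        // 16 bytes, most-significant long first
        ByteBuffer b = ByteBuffer.allocate(16);
        b.putLong(hostId.getMostSignificantBits());
        b.putLong(hostId.getLeastSignificantBits());
        b.flip();
        return b;
    }
}
```

With this shape, the log-and-skip path and the serialize path can no longer diverge: either there is an ID and a hint is written, or there is neither.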