[jira] [Updated] (CASSANDRA-3936) Gossip should have a 'goodbye' command to indicate shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3936:
    Reviewer: scode

Gossip should have a 'goodbye' command to indicate shutdown
-----------------------------------------------------------

Key: CASSANDRA-3936
URL: https://issues.apache.org/jira/browse/CASSANDRA-3936
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Fix For: 1.2
Attachments: 3936.txt

Cassandra is crash-only, however there are times when you _know_ you are taking the node down (rolling restarts, for instance) where it would be advantageous to instantly have the node marked down rather than wait on the FD. We could also improve the efficacy of the 'disablegossip' command this way as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes
[ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
    Description:

Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n1 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts. Example ("blocked for" is my debug log added around submit()):

{code}
INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727). calculation took 28273ms for 1320245 columns
WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not, because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

was:

Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d smf1-amv-01-sr1 -n1 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts. Example ("blocked for" is my debug log added around submit()):

{code}
INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727). calculation took 28273ms for 1320245 columns
WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not, because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.

memtable.updateLiveRatio() is blocking, causing insane latencies for writes
---------------------------------------------------------------------------

Key: CASSANDRA-4032
URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0

Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n1 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts. Example ("blocked for" is my debug log added around submit()):

{code}
INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727). calculation took 28273ms for 1320245 columns
WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not, because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.
[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes
[ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
    Attachment: CASSANDRA-4032-1.1.0-v1.txt

Attaching a patch that allows us to create both blocking and non-blocking {{DebuggableThreadPoolExecutor}}s; use the non-blocking variant for this particular case.

memtable.updateLiveRatio() is blocking, causing insane latencies for writes
---------------------------------------------------------------------------

Key: CASSANDRA-4032
URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0
Attachments: CASSANDRA-4032-1.1.0-v1.txt

Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n1 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts. Example ("blocked for" is my debug log added around submit()):

{code}
INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727). calculation took 28273ms for 1320245 columns
WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not, because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.
[jira] [Updated] (CASSANDRA-4032) memtable.updateLiveRatio() is blocking, causing insane latencies for writes
[ https://issues.apache.org/jira/browse/CASSANDRA-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4032:
    Attachment: CASSANDRA-4032-1.1.0-v2.txt

Attaching {{v2}}. Keeps an NBHM mapping CFS -> AtomicBoolean. Mappings are never removed (assuming a reasonably bounded number of unique CFSs during a lifetime). The queue used for the DTPE is unbounded.

memtable.updateLiveRatio() is blocking, causing insane latencies for writes
---------------------------------------------------------------------------

Key: CASSANDRA-4032
URL: https://issues.apache.org/jira/browse/CASSANDRA-4032
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0
Attachments: CASSANDRA-4032-1.1.0-v1.txt, CASSANDRA-4032-1.1.0-v2.txt

Reproduce by just starting a fresh cassandra with a heap large enough for live ratio calculation (which is {{O(n)}}) to be insanely slow, and then running {{./bin/stress -d host -n1 -t10}}. With a large enough heap and default flushing behavior this is bad enough that stress gets timeouts. Example ("blocked for" is my debug log added around submit()):

{code}
INFO [MemoryMeter:1] 2012-03-09 15:07:30,857 Memtable.java (line 198) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 8.89014894083727 (just-counted was 8.89014894083727). calculation took 28273ms for 1320245 columns
WARN [MutationStage:8] 2012-03-09 15:07:30,857 Memtable.java (line 209) submit() blocked for: 231135
{code}

The calling code was written assuming a RejectedExecutionException is thrown, but it's not, because {{DebuggableThreadPoolExecutor}} installs a blocking rejection handler.
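The v2 approach described above (at most one pending live-ratio calculation per column family, with a write path that never blocks on the executor) can be sketched as follows. This is an illustrative sketch only: the class and method names are invented, and {{ConcurrentHashMap}} stands in for the NonBlockingHashMap ("NBHM") mentioned in the comment; it is not Cassandra's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch: one "pending" flag per column family; a mutation
// thread only enqueues a recalculation if it wins the compareAndSet, so
// submit() never blocks and at most one calculation per CFS is queued.
public class LiveRatioScheduler {
    // Entries are never removed, mirroring the "mappings are never removed"
    // note in the ticket (assumes a bounded number of unique CFSs).
    private final Map<String, AtomicBoolean> pending = new ConcurrentHashMap<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Returns true if a recalculation was scheduled, false if one was already pending. */
    public boolean requestRecalculation(String cfName, Runnable calculation) {
        AtomicBoolean flag = pending.computeIfAbsent(cfName, k -> new AtomicBoolean(false));
        if (!flag.compareAndSet(false, true))
            return false; // already queued: don't block, don't duplicate work
        executor.execute(() -> {
            try {
                calculation.run();
            } finally {
                flag.set(false); // allow the next request to schedule again
            }
        });
        return true;
    }

    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

The key property is that the caller gets an immediate answer either way, unlike a bounded queue with a blocking rejection handler.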
[jira] [Updated] (CASSANDRA-4035) post-effective ownership nodetool ring returns invalid information in some circumstances
[ https://issues.apache.org/jira/browse/CASSANDRA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4035:
    Description:

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  rackael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  rackaoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  rackajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
-10.35.86.129   rackane  Up  Normal  31.67 KB   9.09%  51557934381960373252026455671480032030
+10.35.108.128  rackaoa  Up  Normal  24.42 KB  12.12%  41246347505568298601621164537184025624
+10.34.244.104  rackajk  Up  Normal  27.11 KB  12.12%  46402140943764335926823810104332028827
+10.35.86.129   rackane  Up  Normal  31.67 KB  12.12%  51557934381960373252026455671480032030
{code}

The node that caused this was being replaced (with replace token, not regular bootstrap) into the ring either during this or in relation to this in time. The node was never removed, and if a mistake was made to do regular bootstrap it should be showing up as joining. Hypothesis without looking at code: Somehow nodes in HIBERNATE state are incorrectly considered? (Marked fix for 1.1.1 because that's the fix-for of effective-ownership.)

was:

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  smf1ael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  smf1aoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  smf1ajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
-10.35.86.129   smf1ane  Up  Normal  31.67 KB   9.09%  51557934381960373252026455671480032030
+10.35.108.128  smf1aoa  Up  Normal  24.42 KB  12.12%  41246347505568298601621164537184025624
+10.34.244.104  smf1ajk  Up  Normal  27.11 KB  12.12%  46402140943764335926823810104332028827
+10.35.86.129   smf1ane  Up  Normal  31.67 KB  12.12%  51557934381960373252026455671480032030
{code}

The node that caused this was being replaced (with replace token, not regular bootstrap) into the ring either during this or in relation to this in time. The node was never removed, and if a mistake was made to do regular bootstrap it should be showing up as joining. Hypothesis without looking at code: Somehow nodes in HIBERNATE state are incorrectly considered? (Marked fix for 1.1.1 because that's the fix-for of effective-ownership.)

post-effective ownership nodetool ring returns invalid information in some circumstances
-----------------------------------------------------------------------------------------

Key: CASSANDRA-4035
URL: https://issues.apache.org/jira/browse/CASSANDRA-4035
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Fix For: 1.1.1

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  rackael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  rackaoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  rackajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
-10.35.86.129   rackane  Up  Normal  31.67 KB   9.09%  51557934381960373252026455671480032030
+10.35.108.128  rackaoa  Up  Normal  24.42 KB  12.12%  41246347505568298601621164537184025624
{code}
[jira] [Updated] (CASSANDRA-4035) post-effective ownership nodetool ring returns invalid information in some circumstances
[ https://issues.apache.org/jira/browse/CASSANDRA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4035:
    Description:

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  dcael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  dcaoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  dcajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
-10.35.86.129   dcane  Up  Normal  31.67 KB   9.09%  51557934381960373252026455671480032030
+10.35.108.128  dcaoa  Up  Normal  24.42 KB  12.12%  41246347505568298601621164537184025624
+10.34.244.104  dcajk  Up  Normal  27.11 KB  12.12%  46402140943764335926823810104332028827
+10.35.86.129   dcane  Up  Normal  31.67 KB  12.12%  51557934381960373252026455671480032030
{code}

The node that caused this was being replaced (with replace token, not regular bootstrap) into the ring either during this or in relation to this in time. The node was never removed, and if a mistake was made to do regular bootstrap it should be showing up as joining. Hypothesis without looking at code: Somehow nodes in HIBERNATE state are incorrectly considered? (Marked fix for 1.1.1 because that's the fix-for of effective-ownership.)

was:

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  rackael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  rackaoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  rackajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
-10.35.86.129   rackane  Up  Normal  31.67 KB   9.09%  51557934381960373252026455671480032030
+10.35.108.128  rackaoa  Up  Normal  24.42 KB  12.12%  41246347505568298601621164537184025624
+10.34.244.104  rackajk  Up  Normal  27.11 KB  12.12%  46402140943764335926823810104332028827
+10.35.86.129   rackane  Up  Normal  31.67 KB  12.12%  51557934381960373252026455671480032030
{code}

The node that caused this was being replaced (with replace token, not regular bootstrap) into the ring either during this or in relation to this in time. The node was never removed, and if a mistake was made to do regular bootstrap it should be showing up as joining. Hypothesis without looking at code: Somehow nodes in HIBERNATE state are incorrectly considered? (Marked fix for 1.1.1 because that's the fix-for of effective-ownership.)

post-effective ownership nodetool ring returns invalid information in some circumstances
-----------------------------------------------------------------------------------------

Key: CASSANDRA-4035
URL: https://issues.apache.org/jira/browse/CASSANDRA-4035
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Fix For: 1.1.1

CASSANDRA-3412 broke something. We had a test cluster that I observed was unbalanced (unexpected because it wasn't supposed to be). Later, it wasn't. We realized a node was being replaced at the time it showed as unbalanced. The diff shows:

{code}
-10.34.115.115  dcael  Up  Normal  26.32 KB   9.09%  36090554067372261276418518970036022421
-10.35.108.128  dcaoa  Up  Normal  24.42 KB   9.09%  41246347505568298601621164537184025624
-10.34.244.104  dcajk  Up  Normal  27.11 KB   9.09%  46402140943764335926823810104332028827
{code}
[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest
[ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3952:
    Fix Version/s: 1.1.0 (was: 1.1.1)

avoid quadratic startup time in LeveledManifest
-----------------------------------------------

Key: CASSANDRA-3952
URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Dave Brosius
Priority: Minor
Labels: lhf
Fix For: 1.1.0
Attachments: speed_up_level_of.diff

Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:

{code}
// ensure all SSTables are in the manifest
for (SSTableReader ssTableReader : cfs.getSSTables())
{
    if (manifest.levelOf(ssTableReader) < 0)
        manifest.add(ssTableReader);
}
{code}

{code}
private int levelOf(SSTableReader sstable)
{
    for (int level = 0; level < generations.length; level++)
    {
        if (generations[level].contains(sstable))
            return level;
    }
    return -1;
}
{code}

Note that the contains call is a linear List.contains. We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.
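The quadratic cost comes from calling the linear levelOf() once per sstable. A minimal sketch of the fix's idea follows; the class and names here are illustrative, not Cassandra's actual code, and where the ticket suggests a sorted list with bsearch or a tree, this sketch uses a hash index for brevity (same effect: the per-lookup cost stops being linear in the number of sstables).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: keep an sstable -> level index alongside the
// per-level lists, so levelOf() is an O(1) map lookup instead of a
// linear scan over every generation's List.contains().
public class ManifestIndex {
    public static class SSTable {
        final String name;
        public SSTable(String name) { this.name = name; }
    }

    private final Map<SSTable, Integer> levels = new HashMap<>();

    // Keep the index in sync whenever an sstable is added to a level.
    public void add(SSTable sstable, int level) {
        levels.put(sstable, level);
    }

    /** Returns the sstable's level, or -1 if it is not in the manifest. */
    public int levelOf(SSTable sstable) {
        return levels.getOrDefault(sstable, -1);
    }
}
```

With this, the startup check over N sstables is O(N) total rather than O(N**2).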
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
    Fix Version/s: 1.1.1 (was: 1.2)

provide JMX counters for unavailables/timeouts for reads and writes
--------------------------------------------------------------------

Key: CASSANDRA-3671
URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
Fix For: 1.1.1
Attachments: CASSANDRA-3671-trunk-coda-metrics-203-withjar.txt, CASSANDRA-3671-trunk-coda-metrics-v1.txt, CASSANDRA-3671-trunk-coda-metrics-v2.txt, CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt, v1-0001-CASSANDRA-3671-trunk-coda-metrics-v2.txt.txt

Attaching patch against trunk.
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
    Fix Version/s: 1.1.0 (was: 1.1.1)

provide JMX counters for unavailables/timeouts for reads and writes
--------------------------------------------------------------------

Key: CASSANDRA-3671
URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
Fix For: 1.1.0
Attachments: CASSANDRA-3671-trunk-coda-metrics-203-withjar.txt, CASSANDRA-3671-trunk-coda-metrics-v1.txt, CASSANDRA-3671-trunk-coda-metrics-v2.txt, CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt, v1-0001-CASSANDRA-3671-trunk-coda-metrics-v2.txt.txt

Attaching patch against trunk.
[jira] [Updated] (CASSANDRA-3948) SequentialWriter doesn't fsync() before posix_fadvise()
[ https://issues.apache.org/jira/browse/CASSANDRA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3948:
    Attachment: CASSANDRA-3948-trunk.txt

Suggesting the attached patch to rename the variable to reflect this (trunk only, since there is no functional change).

SequentialWriter doesn't fsync() before posix_fadvise()
-------------------------------------------------------

Key: CASSANDRA-3948
URL: https://issues.apache.org/jira/browse/CASSANDRA-3948
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Pavel Yaskevich
Fix For: 1.1.0
Attachments: CASSANDRA-3948-trunk.txt

This should make the fadvising useless (mostly). See CASSANDRA-1470 for why, including links to kernel source. I have not investigated the history of when this broke or whether it was like this from the beginning. For the record, I have not confirmed this by testing, only by code inspection. I happened to notice it while working on other things, so there is some chance that I'm just mis-reading the code.
[jira] [Updated] (CASSANDRA-3948) rename RandomAccessReader.MAX_BYTES_IN_PAGE_CACHE
[ https://issues.apache.org/jira/browse/CASSANDRA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3948:
    Summary: rename RandomAccessReader.MAX_BYTES_IN_PAGE_CACHE (was: SequentialWriter doesn't fsync() before posix_fadvise())

rename RandomAccessReader.MAX_BYTES_IN_PAGE_CACHE
-------------------------------------------------

Key: CASSANDRA-3948
URL: https://issues.apache.org/jira/browse/CASSANDRA-3948
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Pavel Yaskevich
Fix For: 1.1.0
Attachments: CASSANDRA-3948-trunk.txt

This should make the fadvising useless (mostly). See CASSANDRA-1470 for why, including links to kernel source. I have not investigated the history of when this broke or whether it was like this from the beginning. For the record, I have not confirmed this by testing, only by code inspection. I happened to notice it while working on other things, so there is some chance that I'm just mis-reading the code.
[jira] [Updated] (CASSANDRA-3948) rename RandomAccessReader.MAX_BYTES_IN_PAGE_CACHE
[ https://issues.apache.org/jira/browse/CASSANDRA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3948:
    Fix Version/s: 1.1.1 (was: 1.1.0)

rename RandomAccessReader.MAX_BYTES_IN_PAGE_CACHE
-------------------------------------------------

Key: CASSANDRA-3948
URL: https://issues.apache.org/jira/browse/CASSANDRA-3948
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Pavel Yaskevich
Fix For: 1.1.1
Attachments: CASSANDRA-3948-trunk.txt

This should make the fadvising useless (mostly). See CASSANDRA-1470 for why, including links to kernel source. I have not investigated the history of when this broke or whether it was like this from the beginning. For the record, I have not confirmed this by testing, only by code inspection. I happened to notice it while working on other things, so there is some chance that I'm just mis-reading the code.
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
    Attachment: CASSANDRA-3671-trunk-coda-metrics-203-withjar.txt

Attaching full patch with 2.0.3 and including the .jar (produced with --binary). I've run {{ant artifacts}} and made sure the resulting .pom file includes the dependency on metrics-core. I am not very familiar with the build+dist process though, so would appreciate a +1 that this part seems sufficient.

For the record, here's how to download the dependency with maven:

{code}
mvn dependency:get -DartifactId=metrics-core -DgroupId=com.yammer.metrics -Dversion=2.0.3 -Dpackaging=jar -DrepoUrl=
{code}

provide JMX counters for unavailables/timeouts for reads and writes
--------------------------------------------------------------------

Key: CASSANDRA-3671
URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
Fix For: 1.2
Attachments: CASSANDRA-3671-trunk-coda-metrics-203-withjar.txt, CASSANDRA-3671-trunk-coda-metrics-v1.txt, CASSANDRA-3671-trunk-coda-metrics-v2.txt, CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt

Attaching patch against trunk.
[jira] [Updated] (CASSANDRA-3950) support trickling fsync() on writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3950:
    Attachment: CASSANDRA-3950-1.1-v3.txt

{{v3}} attached. Third time's the charm. I made it {{in_kb}} instead too, because I realized that megabyte resolution is not necessarily enough for really low-latency cases on a modern SSD (1 MB for a 90 MB/sec seq. writing SSD is 11 milliseconds).

support trickling fsync() on writes
-----------------------------------

Key: CASSANDRA-3950
URL: https://issues.apache.org/jira/browse/CASSANDRA-3950
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0
Attachments: CASSANDRA-3950-1.1-v2.txt, CASSANDRA-3950-1.1-v3.txt, CASSANDRA-3950-1.1.txt

Attaching a patch to support fsync():ing every N megabytes of data written using sequential writers. The motivation is to avoid the kernel flushing out pages in bulk. It makes sense for both platters and SSD:s, but it's particularly good for SSD:s because the negative consequences of fsync():ing more often are much more limited than with platters, and the *need* is to some extent greater because of the fact that with SSD:s you're much more likely to be e.g. streaming data quickly or compacting quickly, since you're not having to throttle everything as extremely as with platters, and you easily write fast enough for this to be a problem if you're targeting good latency at the outliers. I'm nominating it for 1.1.0 because, if disabled, the probability of this being a regression seems very low.
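The trickle-fsync idea discussed above can be sketched as a writer that forces accumulated dirty pages to disk every N kilobytes written, matching the {{in_kb}} resolution of v3. This is an illustrative, self-contained sketch (the class and the interval parameter name are invented), not the actual patch:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Illustrative sketch: fsync() every N kilobytes of sequential writes so the
// kernel never accumulates a large burst of dirty pages to flush at once.
public class TricklingWriter implements AutoCloseable {
    private final RandomAccessFile file;
    private final long intervalBytes;
    private long bytesSinceSync = 0;

    public TricklingWriter(Path path, int trickleFsyncIntervalKb) throws IOException {
        this.file = new RandomAccessFile(path.toFile(), "rw");
        this.intervalBytes = trickleFsyncIntervalKb * 1024L;
    }

    public void write(byte[] data) throws IOException {
        file.write(data);
        bytesSinceSync += data.length;
        if (bytesSinceSync >= intervalBytes) {
            file.getFD().sync(); // force the accumulated writes to disk now
            bytesSinceSync = 0;
        }
    }

    @Override
    public void close() throws IOException {
        file.getFD().sync(); // final sync on close
        file.close();
    }
}
```

The kilobyte resolution matters for the latency arithmetic in the comment: at 90 MB/sec sequential write speed, a 1 MB interval already represents about 11 ms between syncs.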
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
    Attachment: CASSANDRA-3671-trunk-coda-metrics-v2.txt

v2 rebased against current trunk.

provide JMX counters for unavailables/timeouts for reads and writes
--------------------------------------------------------------------

Key: CASSANDRA-3671
URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
Fix For: 1.2
Attachments: CASSANDRA-3671-trunk-coda-metrics-v1.txt, CASSANDRA-3671-trunk-coda-metrics-v2.txt, CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt

Attaching patch against trunk.
[jira] [Updated] (CASSANDRA-3950) support trickling fsync() on writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3950:
    Attachment: CASSANDRA-3950-1.1-v2.txt

Apologies. Attaching fixed version.

support trickling fsync() on writes
-----------------------------------

Key: CASSANDRA-3950
URL: https://issues.apache.org/jira/browse/CASSANDRA-3950
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0
Attachments: CASSANDRA-3950-1.1-v2.txt, CASSANDRA-3950-1.1.txt

Attaching a patch to support fsync():ing every N megabytes of data written using sequential writers. The motivation is to avoid the kernel flushing out pages in bulk. It makes sense for both platters and SSD:s, but it's particularly good for SSD:s because the negative consequences of fsync():ing more often are much more limited than with platters, and the *need* is to some extent greater because of the fact that with SSD:s you're much more likely to be e.g. streaming data quickly or compacting quickly, since you're not having to throttle everything as extremely as with platters, and you easily write fast enough for this to be a problem if you're targeting good latency at the outliers. I'm nominating it for 1.1.0 because, if disabled, the probability of this being a regression seems very low.
[jira] [Updated] (CASSANDRA-3950) support trickling fsync() on writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3950:
    Attachment: CASSANDRA-3950-1.1.txt

support trickling fsync() on writes
-----------------------------------

Key: CASSANDRA-3950
URL: https://issues.apache.org/jira/browse/CASSANDRA-3950
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.1.0
Attachments: CASSANDRA-3950-1.1.txt

Attaching a patch to support fsync():ing every N megabytes of data written using sequential writers. The motivation is to avoid the kernel flushing out pages in bulk. It makes sense for both platters and SSD:s, but it's particularly good for SSD:s because the negative consequences of fsync():ing more often are much more limited than with platters, and the *need* is to some extent greater because of the fact that with SSD:s you're much more likely to be e.g. streaming data quickly or compacting quickly, since you're not having to throttle everything as extremely as with platters, and you easily write fast enough for this to be a problem if you're targeting good latency at the outliers. I'm nominating it for 1.1.0 because, if disabled, the probability of this being a regression seems very low.
[jira] [Updated] (CASSANDRA-3922) streaming from all (not one) neighbors during rebuild/bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3922: -- Attachment: CASSANDRA-3922-1.1.txt streaming from all (not one) neighbors during rebuild/bootstrap --- Key: CASSANDRA-3922 URL: https://issues.apache.org/jira/browse/CASSANDRA-3922 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Blocker Fix For: 1.1.0 Attachments: CASSANDRA-3922-1.1.txt The last round of changes that happened in CASSANDRA-3483 before it went in actually changed behavior - we now stream from *ALL* neighbors that have a range, rather than just one. This leads to data size explosion. Attaching patch to revert to intended behavior.
[jira] [Updated] (CASSANDRA-3922) streaming from all (not one) neighbors during rebuild/bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3922: -- Fix Version/s: 1.1.0 streaming from all (not one) neighbors during rebuild/bootstrap --- Key: CASSANDRA-3922 URL: https://issues.apache.org/jira/browse/CASSANDRA-3922 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Blocker Fix For: 1.1.0 Attachments: CASSANDRA-3922-1.1.txt The last round of changes that happened in CASSANDRA-3483 before it went in actually changed behavior - we now stream from *ALL* neighbors that have a range, rather than just one. This leads to data size explosion. Attaching patch to revert to intended behavior.
[jira] [Updated] (CASSANDRA-3912) support incremental repair controlled by external agent
[ https://issues.apache.org/jira/browse/CASSANDRA-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3912: -- Attachment: CASSANDRA-3912-trunk-v1.txt support incremental repair controlled by external agent --- Key: CASSANDRA-3912 URL: https://issues.apache.org/jira/browse/CASSANDRA-3912 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3912-trunk-v1.txt As a poor man's precursor to CASSANDRA-2699, exposing the ability to repair small parts of a range is extremely useful because it allows you (with external scripting logic) to slowly repair a node's content over time. Besides avoiding the bulkiness of complete repairs, it means that you can safely do repairs even if you absolutely cannot afford e.g. disk space spikes (see CASSANDRA-2699 for what the issues are). Attaching a patch that exposes a repairincremental command to nodetool, where you specify a step and the number of total steps. Incrementally performing a repair in 100 steps, for example, would be done by: {code} nodetool repairincremental 0 100 nodetool repairincremental 1 100 ... nodetool repairincremental 99 100 {code} An external script can be used to keep track of what has been repaired and when. This should (1) allow incremental repair to happen now/soon, and (2) allow experimentation and evaluation for an implementation of CASSANDRA-2699, which I still think is a good idea. This patch does nothing to help the average deployment, but at least makes incremental repair possible given sufficient effort spent on external scripting. The big no-no about the patch is that it is entirely specific to RandomPartitioner and BigIntegerToken. If someone can suggest a way to implement this command generically using the Range/Token abstractions, I'd be happy to hear suggestions.
An alternative would be to provide a nodetool command that lets you specify the exact token ranges on the command line. That makes it a bit harder to use, but it would work for any partitioner and token type. Unless someone can suggest a better way, I'll probably provide a patch that does this, though I'm still leaning towards supporting the simple step N out of M form.
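For RandomPartitioner, whose tokens lie in [0, 2^127), step N out of M maps naturally onto one of M equal token slices. A rough sketch of the arithmetic (hypothetical illustration, not the patch's actual code):

```java
import java.math.BigInteger;

// Sketch of how "step N out of M" could be translated into a token range
// under RandomPartitioner / BigIntegerToken, whose token space is [0, 2^127).
public class IncrementalRepairRange {
    static final BigInteger MAX = BigInteger.valueOf(2).pow(127);

    // Returns {start, end} (end exclusive) of the step'th slice out of totalSteps.
    // Multiplying before dividing avoids rounding gaps between adjacent slices.
    static BigInteger[] slice(int step, int totalSteps) {
        BigInteger n = BigInteger.valueOf(totalSteps);
        BigInteger start = MAX.multiply(BigInteger.valueOf(step)).divide(n);
        BigInteger end = MAX.multiply(BigInteger.valueOf(step + 1)).divide(n);
        return new BigInteger[] { start, end };
    }

    public static void main(String[] args) {
        BigInteger[] first = slice(0, 100);
        BigInteger[] last = slice(99, 100);
        System.out.println(first[0]);            // 0 - first slice starts at token 0
        System.out.println(last[1].equals(MAX)); // true - last slice ends at 2^127
    }
}
```

This hard dependency on the BigInteger token space is exactly why the command doesn't generalize to arbitrary partitioners without a Range/Token-level abstraction.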
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Attachment: CASSANDRA-3417-tokenmap-1.0-v1.txt Attaching {{CASSANDRA-3417-tokenmap-1.0-v1.txt}}, which is for 1.0. Apologies for the confusion; I only ever triggered and tested this on 1.1/trunk since that's what I was testing, despite this bug originally being against 1.0. I haven't done real testing with this patch for 1.0, and right now I can't use the cluster I was testing with to easily go to 1.0 to test either. But the fix seems correct to me regardless of branch, given that the iteration is clearly over a map that is getting modified. The biggest risk is a typo or similar mistake, which is more easily spotted by review anyway. InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Peter Schuller Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, CASSANDRA-3417-tokenmap-1.0-v1.txt, CASSANDRA-3417-tokenmap-v2.txt, CASSANDRA-3417-tokenmap-v3.txt, CASSANDRA-3417-tokenmap.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup.
{CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. 
INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead.
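The ConcurrentModificationException behind this ticket is the standard fail-fast behavior of iterating a HashMap that another code path mutates mid-iteration. A minimal standalone illustration (not Cassandra code) of the bug class and of the kind of snapshot-based fix involved:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ConcurrentModificationException;

// Demonstrates the fail-fast iterator: mutating a HashMap while iterating
// its key set throws ConcurrentModificationException, while iterating a
// defensive copy (taken before, e.g. under the owning lock) does not.
public class TokenMapIterationDemo {
    public static void main(String[] args) {
        Map<String, String> tokenToEndpoint = new HashMap<>();
        tokenToEndpoint.put("t1", "10.0.0.1");
        tokenToEndpoint.put("t2", "10.0.0.2");

        boolean failed = false;
        try {
            for (String token : tokenToEndpoint.keySet())
                tokenToEndpoint.put("t3", "10.0.0.3"); // structural modification mid-iteration
        } catch (ConcurrentModificationException e) {
            failed = true;
        }
        System.out.println(failed); // true

        // Fix: snapshot first, then iterate the snapshot freely.
        Map<String, String> snapshot = new HashMap<>(tokenToEndpoint);
        for (String token : snapshot.keySet())
            tokenToEndpoint.put("t4", "10.0.0.4"); // safe: we iterate the copy
        System.out.println(tokenToEndpoint.size()); // t1..t4
    }
}
```

The copy costs O(n) per caller, but callers that need a consistent view of ring state generally want a point-in-time snapshot anyway.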
[jira] [Updated] (CASSANDRA-3904) do not generate NPE on aborted stream-out sessions
[ https://issues.apache.org/jira/browse/CASSANDRA-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3904: -- Attachment: CASSANDRA-3904-1.1.txt Attaching patch against 1.1. It replaces the NPE with a friendlier message, and also augments the original stream-out session message to clarify that streams may still be going in the background. do not generate NPE on aborted stream-out sessions -- Key: CASSANDRA-3904 URL: https://issues.apache.org/jira/browse/CASSANDRA-3904 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Fix For: 1.1.0 Attachments: CASSANDRA-3904-1.1.txt https://issues.apache.org/jira/browse/CASSANDRA-3569?focusedCommentId=13207189&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13207189 Attaching patch to make this a friendlier log entry.
[jira] [Updated] (CASSANDRA-3905) fix typo in nodetool help for repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3905: -- Attachment: CASSANDRA-3905.txt fix typo in nodetool help for repair Key: CASSANDRA-3905 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Trivial Fix For: 1.1.0 Attachments: CASSANDRA-3905.txt It says to use {{-rp}} instead of {{-pr}}.
[jira] [Updated] (CASSANDRA-3905) fix typo in nodetool help for repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3905: -- Fix Version/s: 1.1.0 fix typo in nodetool help for repair Key: CASSANDRA-3905 URL: https://issues.apache.org/jira/browse/CASSANDRA-3905 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Trivial Fix For: 1.1.0 Attachments: CASSANDRA-3905.txt It says to use {{-rp}} instead of {{-pr}}.
[jira] [Updated] (CASSANDRA-3887) NPE on start-up due to missing stage
[ https://issues.apache.org/jira/browse/CASSANDRA-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3887: -- Attachment: CASSANDRA-3887-improve-assert-1.1.txt NPE on start-up due to missing stage - Key: CASSANDRA-3887 URL: https://issues.apache.org/jira/browse/CASSANDRA-3887 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3887-improve-assert-1.1.txt On 1.1 (with our patches, but fairly sure they aren't involved): {code} INFO [main] 2012-02-10 17:57:26,220 StorageService.java (line 768) JOINING: waiting for ring and schema information ERROR [Thread-6] 2012-02-10 17:57:26,333 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-6,5,main] java.lang.NullPointerException at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:564) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:160) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:96) ERROR [Thread-8] 2012-02-10 17:57:26,334 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-8,5,main] java.lang.NullPointerException at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:564) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:160) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:96) {code} That NPE is after an assertion (not triggered due to lack of -ea). Race on start-up - getting messages before stages set up? (not investigating further right now) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
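The "improve assert" direction suggested by the attachment name could look roughly like the following. This is a hypothetical sketch (the method and map names are made up, not MessagingService's actual API): instead of dereferencing a missing stage and throwing a bare NPE deep in receive(), fail with a message that names the race:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: replace an NPE on a missing stage with a descriptive failure,
// so a message that races with start-up produces an actionable log line.
public class StageLookup {
    static Runnable stageFor(Map<String, Runnable> stages, String verb) {
        Runnable stage = stages.get(verb);
        if (stage == null)
            throw new IllegalStateException(
                "No stage registered for verb " + verb
                + "; message received before start-up completed?");
        return stage;
    }

    public static void main(String[] args) {
        Map<String, Runnable> stages = new HashMap<>(); // stages not yet set up
        try {
            stageFor(stages, "MUTATION");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage().startsWith("No stage")); // true
        }
    }
}
```

Unlike an `assert` (inert without -ea, as the ticket notes), an explicit check fires in every deployment.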
[jira] [Updated] (CASSANDRA-3589) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3589: -- Comment: was deleted (was: I just realized something. I wasn't looking into this but had a separate realization investigating streaming performance - ever since Cassandra moved to single-pass streaming (CASSANDRA-2677), streaming easily becomes CPU bound. If the mbps numbers earlier in this ticket (35, 19) are mega*bytes*, the numbers are well within what you might reasonably expect from being CPU bound. I have yet to look into how the bulk loader stuff works, but is it possible it's going through the same path, such that it's spending CPU time re-creating the sstable on reception? (This may be obvious to folks; I have actually never looked at the bulk loader support before.)) Degraded performance of sstable-generator api and sstable-loader utility in cassandra 1.0.x --- Key: CASSANDRA-3589 URL: https://issues.apache.org/jira/browse/CASSANDRA-3589 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 1.0.0 Reporter: Samarth Gahire Priority: Minor We are using the Sstable-Generation API and Sstable-Loader utility. As soon as a newer version of cassandra releases, I test them for sstable generation and loading, for the time taken by both processes. Up to cassandra 0.8.7 there was no significant change in time taken, but in all cassandra-1.0.x releases I have seen 3-4 times degraded performance in generation and 2 times degraded performance in loading. Because of this we are not upgrading cassandra to the latest version; as we are processing some terabytes of data every day, time taken is very important.
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-split-006-do-not-forget-nuking-in-moving-when-removing-endpoint.txt CASSANDRA-3892-split-005-add-assertion-when-declaring-moving-endpoint.txt CASSANDRA-3892-split-004-add-assertion-when-declaring-leaving-endpoint.txt CASSANDRA-3892-split-003-remove-redundant-comment.txt CASSANDRA-3892-split-002-make-moving-endpoints-a-map.txt CASSANDRA-3892-split-001-change-naming.txt Attaching 6 incremental patches. They must be applied/considered in order, as they won't apply out of order. There is a seventh patch coming containing the various changes and additions of comments/documentation. improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft-v2.txt, CASSANDRA-3892-draft-v3.txt, CASSANDRA-3892-draft-v4.txt, CASSANDRA-3892-draft.txt, CASSANDRA-3892-split-001-change-naming.txt, CASSANDRA-3892-split-002-make-moving-endpoints-a-map.txt, CASSANDRA-3892-split-003-remove-redundant-comment.txt, CASSANDRA-3892-split-004-add-assertion-when-declaring-leaving-endpoint.txt, CASSANDRA-3892-split-005-add-assertion-when-declaring-moving-endpoint.txt, CASSANDRA-3892-split-006-do-not-forget-nuking-in-moving-when-removing-endpoint.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. 
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-split-009-storageservice-comments.txt CASSANDRA-3892-split-008-make-firsttokenindex-private.txt CASSANDRA-3892-split-007-tokenmetadata-comments.txt Comments + making a method private now also attached. improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft-v2.txt, CASSANDRA-3892-draft-v3.txt, CASSANDRA-3892-draft-v4.txt, CASSANDRA-3892-draft.txt, CASSANDRA-3892-split-001-change-naming.txt, CASSANDRA-3892-split-002-make-moving-endpoints-a-map.txt, CASSANDRA-3892-split-003-remove-redundant-comment.txt, CASSANDRA-3892-split-004-add-assertion-when-declaring-leaving-endpoint.txt, CASSANDRA-3892-split-005-add-assertion-when-declaring-moving-endpoint.txt, CASSANDRA-3892-split-006-do-not-forget-nuking-in-moving-when-removing-endpoint.txt, CASSANDRA-3892-split-007-tokenmetadata-comments.txt, CASSANDRA-3892-split-008-make-firsttokenindex-private.txt, CASSANDRA-3892-split-009-storageservice-comments.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-draft.txt Attaching {{CASSANDRA-3892-draft.txt}}, which is a draft/work in progress. Mainly I'm asking for a "stop right there" if these types of changes seem like something that will never be accepted (they're semi-significant even though most of it constitutes non-functional changes). I'm not asking for careful review yet, as it's better that I submit a more finished patch before that happens. Any requests for patch splitting strategies, or overall "don't do this/don't do that", would be helpful though, if someone has them. Beyond what's in the current version, I want to move pending range calculation into TokenMetadata (it will need to be given a strategy), and I want to get rid of things like {{StorageService.handleStateNormal()}} being responsible for keeping TokenMetadata's internal state (removing from moving) up to date. I've begun making naming and concepts a bit more consistent; TokenMetadata now more consistently (but not fully yet) talks about endpoints as the main abstraction rather than mixing endpoints and tokens, and we have joining endpoints instead of bootstrap tokens. Moving endpoints is now also a map with O(1) access, kept up to date in removeEndpoint() (there may be other places that need fixing). I adjusted the comments for {{calculatePendingRanges}} to be clearer; for example, the old comments made it sound like we were sending writes to places for good measure because we're in doubt, rather than because it is strictly necessary. Unless I hear objections I'll likely continue this on Sunday and submit another patch.
improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-draft-v2.txt Attaching {{CASSANDRA-3892-draft-v2.txt}} with some more changes. I still consider it a draft because I have not yet done any testing, but it's more ripe for review now. A few of the sub-tasks I created are IMO serious as well. improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft-v2.txt, CASSANDRA-3892-draft.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-draft-v3.txt v3 draft has a few minor typos/style fixes. I'll try to come up with some better naming for the real eligible stuff. improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft-v2.txt, CASSANDRA-3892-draft-v3.txt, CASSANDRA-3892-draft.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3892) improve TokenMetadata abstraction, naming - audit current uses
[ https://issues.apache.org/jira/browse/CASSANDRA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3892: -- Attachment: CASSANDRA-3892-draft-v4.txt {{v4}} uses the terminology joined endpoints for what was previously the un-named things in the tokenToEndpointMap (they did not quite correspond to normal, because a node could simultaneously be moving and be in the tokenToEndpointMap - same for leaving). We now have: * Joined endpoints - fully joined, taking reads (previously unnamed, was in tokenToEndpointMap) * Moving endpoints - joined endpoints that are also currently moving. * Joining endpoints - not yet joined, in the process of joining (previously bootstrap tokens). * Leaving endpoints Some of the methods have a bit less obnoxious names now than in {{v3}}. improve TokenMetadata abstraction, naming - audit current uses -- Key: CASSANDRA-3892 URL: https://issues.apache.org/jira/browse/CASSANDRA-3892 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3892-draft-v2.txt, CASSANDRA-3892-draft-v3.txt, CASSANDRA-3892-draft-v4.txt, CASSANDRA-3892-draft.txt CASSANDRA-3417 has some background. I want to make the distinction more clear between looking at the ring from different perspectives (reads, writes, others) and adjust naming to be more clear on this. I also want to go through each use case and try to spot any subtle pre-existing bugs that I almost introduced in CASSANDRA-3417, had not Jonathan caught me. I will submit a patch soonish. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
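The four states in the {{v4}} terminology, and the invariant that moving is a subset of joined, can be sketched as follows. This is an illustrative toy, not the actual TokenMetadata API (all names are hypothetical):

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the v4 terminology: joined (taking reads), moving (subset of
// joined), joining (bootstrapping, not yet joined), and leaving endpoints.
public class EndpointStates {
    final Set<String> joined  = new HashSet<>();
    final Set<String> moving  = new HashSet<>(); // invariant: moving ⊆ joined
    final Set<String> joining = new HashSet<>();
    final Set<String> leaving = new HashSet<>();

    void startMove(String endpoint) {
        if (!joined.contains(endpoint))
            throw new IllegalStateException(endpoint + " is not joined, cannot move");
        moving.add(endpoint);
    }

    // The split-006 patch fixes exactly this kind of forgotten cleanup:
    // removing an endpoint must clear it from *every* state set.
    void removeEndpoint(String endpoint) {
        joined.remove(endpoint);
        moving.remove(endpoint);
        joining.remove(endpoint);
        leaving.remove(endpoint);
    }

    public static void main(String[] args) {
        EndpointStates s = new EndpointStates();
        s.joined.add("10.0.0.1");
        s.startMove("10.0.0.1");
        s.removeEndpoint("10.0.0.1");
        System.out.println(s.moving.isEmpty()); // true - no stale moving entry
    }
}
```

Modeling moving as "joined plus a flag" rather than a fifth disjoint state is what lets a node simultaneously be in the tokenToEndpointMap and be moving, as the comment describes.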
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Attachment: CASSANDRA-3417-tokenmap-v3.txt {{v3}} slightly adjusted to not use joiningEndpoints.size() when constructing the copy w/o joining endpoints. InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Peter Schuller Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, CASSANDRA-3417-tokenmap-v2.txt, CASSANDRA-3417-tokenmap-v3.txt, CASSANDRA-3417-tokenmap.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup. {CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: 
/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. 
INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead. INFO 09:24:43,394 InetAddress /10.194.22.191 is now dead. INFO 09:24:43,395 InetAddress /10.34.74.58 is now dead. INFO 09:24:43,395 Node /10.34.33.16 is now part of the cluster INFO 09:24:43,396 InetAddress /10.34.33.16 is now UP INFO 09:24:43,397 Enqueuing flush of Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,398 Writing Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,417 Completed flushing
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Attachment: CASSANDRA-3417-tokenmap-v2.txt Attaching {{v2}}. I decided I would submit a minimal patch to just fix this one problem. I will shortly open a separate ticket for further work on TokenMap. I realized I have to carefully audit every use-case of TokenMap to detect any issues of a similar type (the first thing I looked at showed a problem - the duplicate token check during bootstrap is not considering all tokens, but only normal+bootstrapping). I apologize for not being more thorough in my original patch submission and confirming what the method was actually doing. I remember looking closely at the method name and vaguely noticing the bootstrap tokens but it never sank in. InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Peter Schuller Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, CASSANDRA-3417-tokenmap-v2.txt, CASSANDRA-3417-tokenmap.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup. 
{CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. 
INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead. INFO 09:24:43,394
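The failure class behind this ticket is the standard fail-fast iterator contract: iterating a collection while it is structurally modified throws ConcurrentModificationException, which is why the tokenmap patches construct a copy without the joining endpoints rather than mutating a shared structure in place. A minimal, self-contained sketch of both the bug pattern and the defensive-copy fix (class and method names here are illustrative only, not Cassandra code):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class CmeDemo {
    // Bug pattern: structurally modifying a list while a for-each iterates it
    // trips the iterator's fail-fast modCount check.
    static boolean iterateWhileMutating(List<String> endpoints) {
        try {
            for (String e : endpoints) {
                endpoints.remove(e); // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException cme) {
            return true; // the iterator detected the concurrent change
        }
    }

    // Fix pattern: iterate a defensive copy, so mutating the original is safe.
    static int iterateOverCopy(List<String> endpoints) {
        int seen = 0;
        for (String e : new ArrayList<>(endpoints)) {
            endpoints.remove(e); // mutates the original, not the copy
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> ring = new ArrayList<>(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        System.out.println(iterateWhileMutating(ring)); // true

        List<String> ring2 = new ArrayList<>(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        System.out.println(iterateOverCopy(ring2)); // 3
    }
}
```

Note that the copy must be taken while no writer is racing (e.g. under the same lock the writers hold); a plain copy only removes the single-threaded fail-fast case shown here.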
[jira] [Updated] (CASSANDRA-3833) support arbitrary topology transitions
[ https://issues.apache.org/jira/browse/CASSANDRA-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3833: -- Summary: support arbitrary topology transitions (was: support locator support arbitrary topology transitions ) support arbitrary topology transitions --- Key: CASSANDRA-3833 URL: https://issues.apache.org/jira/browse/CASSANDRA-3833 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Once we have the locator abstracted (with the gossiper being a particular concrete implementation), we want to change the locator abstraction to not express changes in ring topology on a per-node basis; rather we want to use an abstraction which communicates two arbitrary ring states; one state for the read set, and one for the write set. Once this abstraction is in place, the (pluggable) locator will be able to make bulk changes to a ring at once. Main points: * Must be careful in handling consistency level during ring transitions, such that a given node in the read set corresponds to a specific node in the write set. This will impose some restrictions on completion of transitions, to avoid code complexity, so it is an important point. * All code outside of gossip (and any other locator that works similarly) will be agnostic about individual changes to nodes, and will instead only be notified when new ring states are available (in aggregate). This makes the change non-trivial because all code that currently is oriented around individual node changes always producing a valid ring, will have to be changed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
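The proposed abstraction can be pictured as a transition between two complete ring states handed to consumers in aggregate, instead of a stream of per-node up/down events. The sketch below is purely hypothetical (RingTransition and writePeerOf are invented names, not anything in Cassandra); it only illustrates the consistency-level point above, that a given node in the read set must correspond to a specific node in the write set:

```java
import java.util.List;

// Hypothetical sketch: a topology transition expressed as two whole ring
// states, one for reads and one for writes, rather than per-node changes.
public class RingTransition {
    final List<String> readSet;   // replicas that must serve reads
    final List<String> writeSet;  // replicas that must receive writes

    RingTransition(List<String> readSet, List<String> writeSet) {
        // The restriction discussed above: the sets must pair up 1:1 so that
        // consistency-level accounting stays well-defined mid-transition.
        if (readSet.size() != writeSet.size())
            throw new IllegalArgumentException("read/write sets must pair up 1:1");
        this.readSet = readSet;
        this.writeSet = writeSet;
    }

    // A node in the read set corresponds positionally to a node in the write
    // set, so acks can be paired across the transition.
    String writePeerOf(String readReplica) {
        int i = readSet.indexOf(readReplica);
        return i < 0 ? null : writeSet.get(i);
    }

    public static void main(String[] args) {
        RingTransition t = new RingTransition(
                List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"),
                List.of("10.0.0.1", "10.0.0.9", "10.0.0.3")); // .2 replaced by .9
        System.out.println(t.writePeerOf("10.0.0.2")); // 10.0.0.9
    }
}
```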
[jira] [Updated] (CASSANDRA-3888) CliClient assertion error on cf update with keys_cached
[ https://issues.apache.org/jira/browse/CASSANDRA-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3888: -- Attachment: CASSANDRA-3888-1.1.txt CliClient assertion error on cf update with keys_cached --- Key: CASSANDRA-3888 URL: https://issues.apache.org/jira/browse/CASSANDRA-3888 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3888-1.1.txt {code} [default@churnkeyspace] update column family churncf with keys_cached=100; Exception in thread main java.lang.AssertionError at org.apache.cassandra.cli.CliClient.updateCfDefAttributes(CliClient.java:1244) at org.apache.cassandra.cli.CliClient.executeUpdateColumnFamily(CliClient.java:1091) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:234) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:219) at org.apache.cassandra.cli.CliMain.main(CliMain.java:346) {code}
[jira] [Updated] (CASSANDRA-3888) remove no-longer-valid values from ColumnFamilyArgument enum
[ https://issues.apache.org/jira/browse/CASSANDRA-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3888: -- Fix Version/s: 1.1.0 remove no-longer-valid values from ColumnFamilyArgument enum Key: CASSANDRA-3888 URL: https://issues.apache.org/jira/browse/CASSANDRA-3888 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Fix For: 1.1.0 Attachments: CASSANDRA-3888-1.1.txt {code} [default@churnkeyspace] update column family churncf with keys_cached=100; Exception in thread main java.lang.AssertionError at org.apache.cassandra.cli.CliClient.updateCfDefAttributes(CliClient.java:1244) at org.apache.cassandra.cli.CliClient.executeUpdateColumnFamily(CliClient.java:1091) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:234) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:219) at org.apache.cassandra.cli.CliMain.main(CliMain.java:346) {code}
[jira] [Updated] (CASSANDRA-3888) remove no-longer-valid values from ColumnFamilyArgument enum
[ https://issues.apache.org/jira/browse/CASSANDRA-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3888: -- Summary: remove no-longer-valid values from ColumnFamilyArgument enum (was: CliClient assertion error on cf update with keys_cached) remove no-longer-valid values from ColumnFamilyArgument enum Key: CASSANDRA-3888 URL: https://issues.apache.org/jira/browse/CASSANDRA-3888 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Fix For: 1.1.0 Attachments: CASSANDRA-3888-1.1.txt {code} [default@churnkeyspace] update column family churncf with keys_cached=100; Exception in thread main java.lang.AssertionError at org.apache.cassandra.cli.CliClient.updateCfDefAttributes(CliClient.java:1244) at org.apache.cassandra.cli.CliClient.executeUpdateColumnFamily(CliClient.java:1091) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:234) at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:219) at org.apache.cassandra.cli.CliMain.main(CliMain.java:346) {code}
[jira] [Updated] (CASSANDRA-3832) gossip stage backed up due to migration manager future de-ref
[ https://issues.apache.org/jira/browse/CASSANDRA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3832: -- Attachment: CASSANDRA-3832-trunk-dontwaitonfuture.txt Attaching simple patch to just not wait on the future. Given that we have no special code path to handle timeouts anyway, this does not introduce any actual lack of failure handling beyond what is already there, so as far as I can tell it should not cause any failure to reach schema agreement that we would not already be vulnerable to. Also upping priority since this bug causes clusters to refuse to start up even with full cluster re-starts by the operator. gossip stage backed up due to migration manager future de-ref -- Key: CASSANDRA-3832 URL: https://issues.apache.org/jira/browse/CASSANDRA-3832 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1 Reporter: Peter Schuller Assignee: Peter Schuller Priority: Critical Fix For: 1.1 Attachments: CASSANDRA-3832-trunk-dontwaitonfuture.txt This is just bootstrapping a ~ 180 trunk cluster. After a while, a node I was on was stuck with thinking all nodes are down, because gossip stage was backed up, because it was spending a long time (multiple seconds or more, I suppose RPC timeout maybe) doing the following. Cluster-wide restart - back to normal. I have not investigated further. 
{code} GossipStage:1 daemon prio=10 tid=0x7f9d5847a800 nid=0xa6fc waiting on condition [0x4345f000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 0x0005029ad1c0 (a java.util.concurrent.FutureTask$Sync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281) at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:364) at org.apache.cassandra.service.MigrationManager.rectifySchema(MigrationManager.java:132) at org.apache.cassandra.service.MigrationManager.onAlive(MigrationManager.java:75) at org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:802) at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:918) at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:68) {code}
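The difference between the old and patched behavior comes down to whether the event thread dereferences the future. A hedged, illustrative sketch of that contrast (MIGRATION_STAGE and the method names are stand-ins, not the actual Cassandra code paths):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class NonBlockingSubmitDemo {
    static final ExecutorService MIGRATION_STAGE = Executors.newSingleThreadExecutor();

    // Anti-pattern from the stack trace above: the gossip thread parks in
    // get() until the (possibly slow) schema pull finishes, so the whole
    // gossip stage backs up behind one node's migration.
    static String blockingOnAlive(Callable<String> schemaPull) throws Exception {
        Future<String> f = MIGRATION_STAGE.submit(schemaPull);
        return f.get(); // caller thread is stuck here
    }

    // The patch's approach: submit and return immediately. Failures are
    // handled (or not) exactly as they already were for timeouts.
    static Future<String> nonBlockingOnAlive(Callable<String> schemaPull) {
        return MIGRATION_STAGE.submit(schemaPull); // caller does not wait
    }

    public static void main(String[] args) throws Exception {
        Callable<String> slowPull = () -> { Thread.sleep(100); return "schema"; };
        long t0 = System.nanoTime();
        nonBlockingOnAlive(slowPull);
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println(elapsedMs < 100); // submit returned before the task ran
        MIGRATION_STAGE.shutdown();
        MIGRATION_STAGE.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```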
[jira] [Updated] (CASSANDRA-3832) gossip stage backed up due to migration manager future de-ref
[ https://issues.apache.org/jira/browse/CASSANDRA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3832: -- Priority: Blocker (was: Critical) gossip stage backed up due to migration manager future de-ref -- Key: CASSANDRA-3832 URL: https://issues.apache.org/jira/browse/CASSANDRA-3832 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1 Reporter: Peter Schuller Assignee: Peter Schuller Priority: Blocker Fix For: 1.1 Attachments: CASSANDRA-3832-trunk-dontwaitonfuture.txt This is just bootstrapping a ~ 180 trunk cluster. After a while, a node I was on was stuck with thinking all nodes are down, because gossip stage was backed up, because it was spending a long time (multiple seconds or more, I suppose RPC timeout maybe) doing the following. Cluster-wide restart - back to normal. I have not investigated further. {code} GossipStage:1 daemon prio=10 tid=0x7f9d5847a800 nid=0xa6fc waiting on condition [0x4345f000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 0x0005029ad1c0 (a java.util.concurrent.FutureTask$Sync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281) at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:364) at org.apache.cassandra.service.MigrationManager.rectifySchema(MigrationManager.java:132) at org.apache.cassandra.service.MigrationManager.onAlive(MigrationManager.java:75) at 
org.apache.cassandra.gms.Gossiper.markAlive(Gossiper.java:802) at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:918) at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:68) {code}
[jira] [Updated] (CASSANDRA-3831) scaling to large clusters in GossipStage impossible due to calculatePendingRanges
[ https://issues.apache.org/jira/browse/CASSANDRA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3831: -- Attachment: CASSANDRA-3831-trunk-group-add-dc-tokens.txt Attaching {{CASSANDRA-3831-trunk-group-add-dc-tokens.txt}}, which adds a group-update interface to TokenMetadata that allows NTS to use it when constructing its local per-DC metadata. This is not even close to a complete fix for this issue, but I do think it is a clean change because it makes sense in terms of the TokenMetadata API to provide a group-update method given the expense involved. And given its existence, it makes sense for NTS to use it. This change mitigates the problem significantly on the ~180 node test cluster since it takes away an {{n}} from the complexity, and should significantly raise the bar of how many nodes in a cluster are realistic without other changes. I think this might be a fix worth committing because it feels safe and is maybe a candidate for the 1.1 release, assuming review doesn't yield anything obvious. But, leaving the JIRA open for a more overarching fix (I'm not sure what that is at the moment; I'm mulling it over).
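The complexity argument for a group-update method can be seen with plain sorted lists: inserting tokens one at a time into a sorted structure pays a linear shift per insert, while a bulk append followed by one sort pays {{O(n log n)}} total. A generic sketch of this contrast (not the actual TokenMetadata API; method names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BulkTokenUpdateDemo {
    // One-at-a-time: each insert into a sorted ArrayList shifts O(n) elements
    // via arraycopy, so n inserts cost O(n^2) overall.
    static List<Long> addOneByOne(List<Long> sorted, List<Long> newTokens) {
        List<Long> out = new ArrayList<>(sorted);
        for (Long t : newTokens) {
            int i = Collections.binarySearch(out, t);
            out.add(i < 0 ? -i - 1 : i, t); // O(n) shift per insert
        }
        return out;
    }

    // Group update: append everything, then sort once - O(n log n) total.
    static List<Long> addAll(List<Long> sorted, List<Long> newTokens) {
        List<Long> out = new ArrayList<>(sorted);
        out.addAll(newTokens);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        List<Long> ring = List.of(10L, 30L, 50L);
        List<Long> joining = List.of(20L, 40L);
        System.out.println(addOneByOne(ring, joining)); // [10, 20, 30, 40, 50]
        System.out.println(addAll(ring, joining));      // [10, 20, 30, 40, 50]
    }
}
```

Both methods produce the same sorted ring; only the asymptotics differ, which is the {{n}} the comment above says the patch takes away.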
scaling to large clusters in GossipStage impossible due to calculatePendingRanges -- Key: CASSANDRA-3831 URL: https://issues.apache.org/jira/browse/CASSANDRA-3831 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Critical Attachments: CASSANDRA-3831-memoization-not-for-inclusion.txt, CASSANDRA-3831-trunk-group-add-dc-tokens.txt (most observations below are from 0.8, but I just now tested on trunk and I can trigger this problem *just* by bootstrapping a ~180 node cluster concurrently, presumably due to the number of nodes that are simultaneously in bootstrap state) It turns out that: * (1) calculatePendingRanges is not just expensive, it's computationally complex - cubic or worse * (2) it gets called *NOT* just once per node being bootstrapped/leaving etc, but is called repeatedly *while* nodes are in these states As a result, clusters start exploding when you start reaching 100-300 nodes. The GossipStage will get backed up because a single calculatePendingRanges takes seconds, and depending on what the average heartbeat interval is in relation to this, this can lead to *massive* cluster-wide flapping. This all started because we hit this in production; several nodes would start flapping several other nodes as down, with many nodes seeing the entire cluster, or a large portion of it, as down. Logging in to some of these nodes you would see that they would be constantly flapping up/down for minutes at a time until one became lucky and it stabilized. In the end we had to perform an emergency full-cluster restart with gossip patched to force-forget certain nodes in bootstrapping state. I can't go into all details here from the post-mortem (just the write-up would take a day), but in short: * We graphed the number of hosts in the cluster that had more than 5 Down (in a cluster that should have 0 down) on a minutely timeline. * We also graphed the number of hosts in the cluster that had GossipStage backed up.
* The two graphs correlated *extremely* well * jstack sampling showed it being CPU bound doing mostly sorting under calculatePendingRanges * We were never able to exactly reproduce it with normal RING_DELAY and gossip intervals, even on a 184 node cluster (the production cluster is around 180). * Dropping RING_DELAY and in particular dropping gossip interval to 10 ms instead of 1000 ms, we were able to observe all of the behavior we saw in production. So our steps to reproduce are: * Launch 184 node cluster w/ gossip interval at 10ms and RING_DELAY at 1 second. * Do something like: {{while [ 1 ] ; do date ; echo decom ; nodetool decommission ; date ; echo done leaving decommed for a while ; sleep 3 ; date ; echo done restarting; sudo rm -rf /data/disk1/commitlog/* ; sudo rm -rf /data/diskarray/tables/* ; sudo monit restart cassandra ;date ; echo restarted waiting for a while ; sleep 40; done}} (or just do a manual decom/bootstrap once, it triggers every time) * Watch all nodes flap massively and not recover at all, or maybe after a *long* time. I observed the flapping using a python script that every 5 seconds (randomly spread out) asked for unreachable nodes from *all* nodes in the cluster, and printed any nodes and their counts when they had unreachables > 5. The cluster can be observed instantly going into massive flapping when leaving/bootstrap is initiated. Script needs Cassandra running with Jolokia enabled for
[jira] [Updated] (CASSANDRA-3735) Fix Unable to create hard link SSTableReaderTest error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3735: -- Attachment: 0002-remove-incremental-backups-before-reloading-sstables-v2.patch Attaching new version of 0002* that works post CASSANDRA-2794. Fix Unable to create hard link SSTableReaderTest error messages - Key: CASSANDRA-3735 URL: https://issues.apache.org/jira/browse/CASSANDRA-3735 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Attachments: 0001-fix-generation-update-in-loadNewSSTables.patch, 0002-remove-incremental-backups-before-reloading-sstables-v2.patch, 0002-remove-incremental-backups-before-reloading-sstables.patch Sample failure (on Windows): {noformat} [junit] java.io.IOException: Exception while executing the command: cmd /c mklink /H C:\Users\Jonathan\projects\cassandra\git\build\test\cassandra\data\Keyspace1\backups\Standard1-hc-1-Index.db c:\Users\Jonathan\projects\cassandra\git\build\test\cassandra\data\Keyspace1\Standard1-hc-1-Index.db,command error Code: 1, command output: Cannot create a file when that file already exists.
[junit] [junit] at org.apache.cassandra.utils.CLibrary.exec(CLibrary.java:213) [junit] at org.apache.cassandra.utils.CLibrary.createHardLinkWithExec(CLibrary.java:188) [junit] at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:151) [junit] at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:833) [junit] at org.apache.cassandra.db.DataTracker$1.runMayThrow(DataTracker.java:161) [junit] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) [junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) [junit] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [junit] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) [junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [junit] at java.lang.Thread.run(Thread.java:662) [junit] ERROR 17:10:17,111 Fatal exception in thread Thread[NonPeriodicTasks:1,5,main] {noformat}
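The root failure is that hard-link creation is not idempotent: creating a link at a path that already exists (a stale incremental backup left over from a previous test run) fails. One generic way to sidestep that, shown below with java.nio rather than Cassandra's CLibrary, is to delete any pre-existing link first. This is an illustrative sketch only; the attached patches instead remove the stale incremental backups before reloading sstables, and the file names here are invented:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkDemo {
    // Files.createLink throws FileAlreadyExistsException when the link path
    // already exists - the situation the test trips over. Deleting any stale
    // link first makes the operation safe to repeat.
    static void relink(Path link, Path existing) throws IOException {
        Files.deleteIfExists(link);        // drop a stale link from a prior run
        Files.createLink(link, existing);  // hard link: same inode, new name
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hardlink-demo");
        Path data = Files.writeString(dir.resolve("Standard1-hc-1-Index.db"), "index");
        Path backup = dir.resolve("backup-Index.db");
        relink(backup, data);
        relink(backup, data); // second call succeeds instead of throwing
        System.out.println(Files.readString(backup)); // index
    }
}
```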
[jira] [Updated] (CASSANDRA-3735) Fix Unable to create hard link SSTableReaderTest error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3735: -- Attachment: 0003-reset-file-index-generator-on-reset.patch Fix Unable to create hard link SSTableReaderTest error messages - Key: CASSANDRA-3735 URL: https://issues.apache.org/jira/browse/CASSANDRA-3735 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Attachments: 0001-fix-generation-update-in-loadNewSSTables.patch, 0002-remove-incremental-backups-before-reloading-sstables-v2.patch, 0002-remove-incremental-backups-before-reloading-sstables.patch, 0003-reset-file-index-generator-on-reset.patch Sample failure (on Windows): {noformat} [junit] java.io.IOException: Exception while executing the command: cmd /c mklink /H C:\Users\Jonathan\projects\cassandra\git\build\test\cassandra\data\Keyspace1\backups\Standard1-hc-1-Index.db c:\Users\Jonathan\projects\cassandra\git\build\test\cassandra\data\Keyspace1\Standard1-hc-1-Index.db,command error Code: 1, command output: Cannot create a file when that file already exists. 
[junit] [junit] at org.apache.cassandra.utils.CLibrary.exec(CLibrary.java:213) [junit] at org.apache.cassandra.utils.CLibrary.createHardLinkWithExec(CLibrary.java:188) [junit] at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:151) [junit] at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:833) [junit] at org.apache.cassandra.db.DataTracker$1.runMayThrow(DataTracker.java:161) [junit] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) [junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) [junit] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [junit] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98) [junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [junit] at java.lang.Thread.run(Thread.java:662) [junit] ERROR 17:10:17,111 Fatal exception in thread Thread[NonPeriodicTasks:1,5,main] {noformat}
[jira] [Updated] (CASSANDRA-3831) scaling to large clusters in GossipStage impossible due to calculatePendingRanges
[ https://issues.apache.org/jira/browse/CASSANDRA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3831: -- Attachment: CASSANDRA-3831-memoization-not-for-inclusion.txt I am attaching {{CASSANDRA-3831-memoization-not-for-inclusion.txt}} as an FYI and in case it helps others. It's against 0.8, and implements memoization of calculatePendingRanges. The correct/clean fix is probably to change behavior so that it doesn't get called unnecessarily to begin with. This patch was made specifically to address the production issue we are having in a minimally dangerous fashion, and is not to be taken as a suggested fix. scaling to large clusters in GossipStage impossible due to calculatePendingRanges -- Key: CASSANDRA-3831 URL: https://issues.apache.org/jira/browse/CASSANDRA-3831 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Critical Attachments: CASSANDRA-3831-memoization-not-for-inclusion.txt (most observations below are from 0.8, but I just now tested on trunk and I can trigger this problem *just* by bootstrapping a ~180 node cluster concurrently, presumably due to the number of nodes that are simultaneously in bootstrap state) It turns out that: * (1) calculatePendingRanges is not just expensive, it's computationally complex - cubic or worse * (2) it gets called *NOT* just once per node being bootstrapped/leaving etc, but is called repeatedly *while* nodes are in these states As a result, clusters start exploding when you start reaching 100-300 nodes. The GossipStage will get backed up because a single calculatePendingRanges takes seconds, and depending on what the average heartbeat interval is in relation to this, this can lead to *massive* cluster-wide flapping.
This all started because we hit this in production; several nodes would start flapping several other nodes as down, with many nodes seeing the entire cluster, or a large portion of it, as down. Logging in to some of these nodes you would see that they would be constantly flapping up/down for minutes at a time until one became lucky and it stabilized. In the end we had to perform an emergency full-cluster restart with gossip patched to force-forget certain nodes in bootstrapping state. I can't go into all details here from the post-mortem (just the write-up would take a day), but in short: * We graphed the number of hosts in the cluster that had more than 5 Down (in a cluster that should have 0 down) on a minutely timeline. * We also graphed the number of hosts in the cluster that had GossipStage backed up. * The two graphs correlated *extremely* well * jstack sampling showed it being CPU bound doing mostly sorting under calculatePendingRanges * We were never able to exactly reproduce it with normal RING_DELAY and gossip intervals, even on a 184 node cluster (the production cluster is around 180). * Dropping RING_DELAY and in particular dropping gossip interval to 10 ms instead of 1000 ms, we were able to observe all of the behavior we saw in production. So our steps to reproduce are: * Launch 184 node cluster w/ gossip interval at 10ms and RING_DELAY at 1 second. * Do something like: {{while [ 1 ] ; do date ; echo decom ; nodetool decommission ; date ; echo done leaving decommed for a while ; sleep 3 ; date ; echo done restarting; sudo rm -rf /data/disk1/commitlog/* ; sudo rm -rf /data/diskarray/tables/* ; sudo monit restart cassandra ;date ; echo restarted waiting for a while ; sleep 40; done}} (or just do a manual decom/bootstrap once, it triggers every time) * Watch all nodes flap massively and not recover at all, or maybe after a *long* time. 
I observed the flapping using a python script that every 5 seconds (randomly spread out) asked for unreachable nodes from *all* nodes in the cluster, and printed any nodes and their counts when they had more than 5 unreachables. The cluster can be observed instantly going into massive flapping when leaving/bootstrap is initiated. The script needs Cassandra running with Jolokia enabled for http/json access to JMX. Can provide the script if needed after cleanup. The phi conviction, based on logging I added, was legitimate. Using the 10 ms interval the average heartbeat interval ends up being like 25 ms or something like that. As a result, a single ~ 2 second delay in gossip stage is huge in comparison to those 25 ms, and so we go past the phi conviction threshold. This is much more sensitive than in production, but it's the *same* effect, even if it triggers less easily for real. The best work-around we currently have internally is to memoize calculatePendingRanges.
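The memoization work-around can be sketched roughly as follows (a minimal sketch, not the attached patch; the {{Computation}} interface and the caller-maintained version counter are assumptions standing in for Cassandra's token metadata state):

```java
// Minimal sketch of memoizing an expensive ring computation: recompute only
// when the ring state it depends on has changed, tracked here as a version
// counter. The Computation interface stands in for calculatePendingRanges.
public class MemoizedPendingRanges<T> {
    public interface Computation<R> {
        R compute();
    }

    private long cachedVersion = -1;
    private T cachedResult;

    // compute is the cubic-or-worse calculation; ringVersion must be bumped
    // by the caller whenever token/bootstrap/leaving state changes, so a
    // stale result is never served across a real ring change.
    public synchronized T get(long ringVersion, Computation<T> compute) {
        if (ringVersion != cachedVersion) {
            cachedResult = compute.compute();
            cachedVersion = ringVersion;
        }
        return cachedResult;
    }
}
```

With something like this, GossipStage pays the full cost once per actual ring change instead of once per gossip round, at the price of having to get the invalidation exactly right.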
[jira] [Updated] (CASSANDRA-3829) make seeds *only* be seeds, not special in gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3829: -- Issue Type: Improvement (was: Bug) make seeds *only* be seeds, not special in gossip -- Key: CASSANDRA-3829 URL: https://issues.apache.org/jira/browse/CASSANDRA-3829 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor First, a little bit of framing on how seeds work: The concept of seed hosts makes fundamental sense; you need to seed a new node with some information required in order to join a cluster. Seed hosts is the information Cassandra uses for this purpose. But seed hosts play a role even after the initial start-up of a new node in a ring. Specifically, seed hosts continue to be gossiped to separately by the Gossiper throughout the life of a node and the cluster. Generally, operators must be careful to ensure that all nodes in a cluster are appropriately configured to refer to an overlapping set of seed hosts. Strictly speaking this should not be necessary (see further down though), but is the general recommendation. An unfortunate side-effect of this is that whenever you are doing ring management, such as replacing nodes, removing nodes, etc, you have to keep in mind which nodes are seeds. For example, if you bring a new node into the cluster and it considers itself a seed, then even doing everything right with token assignment and auto_bootstrap=true, it will just enter the cluster without bootstrapping - causing inconsistent reads. This is dangerous. And worse - changing the notion of which nodes are seeds across a cluster requires a *rolling restart*. It can be argued that it should actually be okay for nodes other than the one being fiddled with to incorrectly treat the fiddled-with node as a seed node, but this fact is highly opaque to most users that are not intimately familiar with Cassandra internals. 
This adds additional complexity to operations, as it introduces a reason why you cannot view the ring as completely homogeneous, despite the fundamental idea of Cassandra that all nodes should be equal. Now, fast forward a bit to what we are doing over here to avoid this problem: We have a zookeeper based system for keeping track of hosts in a cluster, which is used by our Cassandra client to discover nodes to talk to. This works well. In order to avoid the need to manually keep track of seeds, we wanted to make seeds automatically discoverable in order to eliminate them as an operational concern. We have implemented a seed provider that does this for us, based on the data we keep in zookeeper. We could see essentially three ways of plugging this in: * (1) We could simply rely on not needing overlapping seeds and grab whatever we have when a node starts. * (2) We could do something like continually treating all other nodes as seeds by dynamically changing the seed list (this involves some other changes, like having the Gossiper update its notion of seeds). * (3) We could completely eliminate the use of seeds *except* for the very specific purpose of initial start-up of an unbootstrapped node, and keep using a static (for the duration of the node's uptime) seed list. (3) was attractive because it felt like this was the original intent of seeds; that they be used for *seeding*, and not be constantly required during cluster operation once nodes are already joined. Now before I make the suggestion, let me explain how we are currently (though not yet in production) handling seeds and start-up. 
First, we have the following relevant cases to consider during a normal start-up: * (a) we are starting up a cluster for the very first time * (b) we are starting up a new clean node in order to join it to a pre-existing cluster * (c) we are starting up a pre-existing already joined node in a pre-existing cluster First, we proceeded on the assumption that we wanted to remove the use of seeds during regular gossip (other than on initial startup). This means that for the (c) case, we can *completely* ignore seeds. We never even have to discover the seed list, or if we do, we don't have to use them. This leaves (a) and (b). In both cases, the critical invariant we want to achieve is that we must have one or more *valid* seeds (valid means for (b) that the seed is in the cluster, and for (a) that it is one of the nodes that are part of the initial cluster setup). In the (c) case the problem is trivial - ignore seeds. In the (a) case, the algorithm is: * Register with zookeeper as a seed * Wait until we see *at least one* seed *other than ourselves* in zookeeper * Continue regular start-up process with the seed list (with 1 or more seeds) In the
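A minimal sketch of option (3)'s start-up handling, with the zookeeper lookup stubbed out as a plain host list (class and method names here are assumptions; Cassandra's actual extension point is the SeedProvider interface):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: resolve seeds once at start-up and keep the list static for the
// node's uptime. In the (c) case (already-joined node) the list is simply
// never consulted again after joining, so its contents stop mattering.
public class StaticStartupSeeds {
    private final List<InetAddress> seeds;

    // discoveredAtStartup stands in for the hosts read from zookeeper after
    // registering ourselves and waiting for at least one seed other than us.
    public StaticStartupSeeds(List<String> discoveredAtStartup) {
        List<InetAddress> resolved = new ArrayList<InetAddress>();
        for (String host : discoveredAtStartup) {
            try {
                resolved.add(InetAddress.getByName(host));
            } catch (UnknownHostException e) {
                throw new RuntimeException("invalid seed: " + host, e);
            }
        }
        this.seeds = Collections.unmodifiableList(resolved);
    }

    // Used only while joining; gossip no longer treats these as special.
    public List<InetAddress> getSeeds() {
        return seeds;
    }
}
```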
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Attachment: CASSANDRA-3417-tokenmap.txt {{CASSANDRA\-3417\-tokenmap.txt}} attached. InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Peter Schuller Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, CASSANDRA-3417-tokenmap.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup. {CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: 
/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. 
INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead. INFO 09:24:43,394 InetAddress /10.194.22.191 is now dead. INFO 09:24:43,395 InetAddress /10.34.74.58 is now dead. INFO 09:24:43,395 Node /10.34.33.16 is now part of the cluster INFO 09:24:43,396 InetAddress /10.34.33.16 is now UP INFO 09:24:43,397 Enqueuing flush of Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,398 Writing Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,417 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-2-Data.db (74 bytes) INFO 09:24:43,418 InetAddress /10.202.67.43 is now dead. INFO 09:24:43,419 InetAddress /10.116.215.81 is now
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Reviewer: jbellis (was: scode) InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Peter Schuller Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt, CASSANDRA-3417-tokenmap.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup. {CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cas
sandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead. INFO 09:24:43,394 InetAddress /10.194.22.191 is now dead. 
INFO 09:24:43,395 InetAddress /10.34.74.58 is now dead. INFO 09:24:43,395 Node /10.34.33.16 is now part of the cluster INFO 09:24:43,396 InetAddress /10.34.33.16 is now UP INFO 09:24:43,397 Enqueuing flush of Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,398 Writing Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,417 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-2-Data.db (74 bytes) INFO 09:24:43,418 InetAddress /10.202.67.43 is now dead. INFO 09:24:43,419 InetAddress /10.116.215.81 is now dead. INFO 09:24:43,420 InetAddress /10.99.39.242 is now dead.
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3483: -- Attachment: CASSANDRA-3483-trunk-refactored-v2.txt {{CASSANDRA\-3483\-trunk\-refactored\-v2.txt}} I believe addresses the concerns, plus makes other improvements. I'm much happier with this one. It addresses CASSANDRA-3807 by supporting fetch consistency levels (though only ONE is currently usable without patching), and the filtering of hosts is abstracted out. There is still some duplication between {{Bootstrapper.bootstrap()}} and {{StorageService.rebuild()}} in that both do the dance of iterating over tables to construct the final map. I don't really feel that abstracting that away is a good fit for this ticket, though I think it's worthwhile doing at some point separately. The unit test is fixed; my adjustment of it was wrong because I wasn't picking up pending ranges (in the test). I've tested both rebuild and bootstrap in a 3 node cluster. I've added somewhat more logging than is typical; there have been several cases where I wished streaming were logged in more detail at INFO, particularly when bootstrapping or rebuilding. I think it's worthwhile to get that in while we're at it. Support bringing up a new datacenter to existing cluster without repair --- Key: CASSANDRA-3483 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.2 Reporter: Chris Goffinet Assignee: Peter Schuller Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, CASSANDRA-3483-trunk-noredesign.txt, CASSANDRA-3483-trunk-rebase2.txt, CASSANDRA-3483-trunk-refactored-v1.txt, CASSANDRA-3483-trunk-refactored-v2.txt Was talking to Brandon in irc, and we ran into a case where we want to bring up a new DC to an existing cluster. He suggested from jbellis the way to do it currently was set strategy options of dc2:0, then add the nodes. 
After the nodes are up, change the RF of dc2, and run repair. I'd like to avoid a repair as it runs AES and is a bit more intense than how bootstrap works currently by just streaming ranges from the SSTables. Would it be possible to improve this functionality (adding a new DC to existing cluster) than the proposed method? We'd be happy to do a patch if we got some input on the best way to go about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3417) InvocationTargetException ConcurrentModificationException at startup
[ https://issues.apache.org/jira/browse/CASSANDRA-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3417: -- Attachment: 3417-3.txt Attaching v3 which also adds synchronization in TokenMap.getTokenToEndpointMap() - both for bootstrapTokens and tokenToEndpointMap. (x.putAll(y) is not an atomic observation from the perspective of y, even if it is from the perspective of x) InvocationTargetException ConcurrentModificationException at startup Key: CASSANDRA-3417 URL: https://issues.apache.org/jira/browse/CASSANDRA-3417 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Reporter: Joaquin Casares Assignee: Jonathan Ellis Priority: Minor Fix For: 1.0.8 Attachments: 3417-2.txt, 3417-3.txt, 3417.txt I was starting up the new DataStax AMI where the seed starts first and 34 nodes would latch on together. So far things have been working decently for launching, but right now I just got this during startup. {CODE} ubuntu@ip-10-40-190-143:~$ sudo cat /var/log/cassandra/output.log INFO 09:24:38,453 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 09:24:38,456 Heap size: 1936719872/1937768448 INFO 09:24:38,457 Classpath: 
/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.3.jar:/usr/share/cassandra/apache-cassandra-1.0.0.jar:/usr/share/cassandra/apache-cassandra-thrift-1.0.0.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar INFO 09:24:39,891 JNA mlockall successful INFO 09:24:39,901 Loading settings from file:/etc/cassandra/cassandra.yaml INFO 09:24:40,057 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 09:24:40,069 Global memtable threshold is enabled at 616MB INFO 09:24:40,159 EC2Snitch using region: us-east, zone: 1d. INFO 09:24:40,475 Creating new commitlog segment /raid0/cassandra/commitlog/CommitLog-1319793880475.log INFO 09:24:40,486 Couldn't detect any schema definitions in local storage. INFO 09:24:40,486 Found table data in data directories. Consider using the CLI to define your schema. 
INFO 09:24:40,497 No commitlog files found; skipping replay INFO 09:24:40,501 Cassandra version: 1.0.0 INFO 09:24:40,502 Thrift API version: 19.18.0 INFO 09:24:40,502 Loading persisted ring state INFO 09:24:40,506 Starting up server gossip INFO 09:24:40,529 Enqueuing flush of Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,530 Writing Memtable-LocationInfo@1388314661(190/237 serialized/live bytes, 4 ops) INFO 09:24:40,600 Completed flushing /raid0/cassandra/data/system/LocationInfo-h-1-Data.db (298 bytes) INFO 09:24:40,613 Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1d INFO 09:24:40,621 Starting Messaging Service on /10.40.190.143:7000 INFO 09:24:40,628 Joining: waiting for ring and schema information INFO 09:24:43,389 InetAddress /10.194.29.156 is now dead. INFO 09:24:43,391 InetAddress /10.85.11.38 is now dead. INFO 09:24:43,392 InetAddress /10.34.42.28 is now dead. INFO 09:24:43,393 InetAddress /10.77.63.49 is now dead. INFO 09:24:43,394 InetAddress /10.194.22.191 is now dead. INFO 09:24:43,395 InetAddress /10.34.74.58 is now dead. INFO 09:24:43,395 Node /10.34.33.16 is now part of the cluster INFO 09:24:43,396 InetAddress /10.34.33.16 is now UP INFO 09:24:43,397 Enqueuing flush of Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,398 Writing Memtable-LocationInfo@1629818866(20/25 serialized/live bytes, 1 ops) INFO 09:24:43,417 Completed flushing
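The {{putAll}} observation from the v3 comment above can be illustrated with a simplified stand-in for the token map (names are simplified; the real guarded maps are {{bootstrapTokens}} and {{tokenToEndpointMap}}):

```java
import java.util.HashMap;
import java.util.Map;

// putAll (and the HashMap copy constructor) iterate the source map, so a
// snapshot taken while another thread mutates that map can throw
// ConcurrentModificationException even though the destination is private:
// the copy is atomic from the destination's perspective but not the source's.
// Taking the copy under the same monitor that guards writes fixes this.
public class GuardedTokenMap {
    private final Map<String, String> tokenToEndpoint = new HashMap<String, String>();

    public synchronized void put(String token, String endpoint) {
        tokenToEndpoint.put(token, endpoint);
    }

    public synchronized Map<String, String> getTokenToEndpointMap() {
        Map<String, String> copy = new HashMap<String, String>();
        copy.putAll(tokenToEndpoint); // safe: writers hold the same lock
        return copy;
    }
}
```

Callers get a stable private snapshot they can iterate without any further locking.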
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3483: -- Attachment: CASSANDRA-3483-trunk-rebase2.txt I do. I'm sorry for the delay, this has been nagging me for quite some time. It's not forgotten, I have just been inundated with urgent stuff to do. I'm attaching a fresh rebase against current trunk and I hope to submit an improved version later tonight (keyword being hope). Support bringing up a new datacenter to existing cluster without repair --- Key: CASSANDRA-3483 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.2 Reporter: Chris Goffinet Assignee: Peter Schuller Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, CASSANDRA-3483-trunk-noredesign.txt, CASSANDRA-3483-trunk-rebase2.txt Was talking to Brandon in irc, and we ran into a case where we want to bring up a new DC to an existing cluster. He suggested from jbellis the way to do it currently was set strategy options of dc2:0, then add the nodes. After the nodes are up, change the RF of dc2, and run repair. I'd like to avoid a repair as it runs AES and is a bit more intense than how bootstrap works currently by just streaming ranges from the SSTables. Would it be possible to improve this functionality (adding a new DC to existing cluster) than the proposed method? We'd be happy to do a patch if we got some input on the best way to go about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3806) merge from 1.0 (aa20c7206cdc1efc1983466de05c224eccac1084) breaks build
[ https://issues.apache.org/jira/browse/CASSANDRA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3806: -- Attachment: CASSANDRA-3806.txt Trivial patch attached. merge from 1.0 (aa20c7206cdc1efc1983466de05c224eccac1084) breaks build -- Key: CASSANDRA-3806 URL: https://issues.apache.org/jira/browse/CASSANDRA-3806 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Peter Schuller Attachments: CASSANDRA-3806.txt {code} build-project: [echo] apache-cassandra: /tmp/cas/cassandra/build.xml [javac] Compiling 40 source files to /tmp/cas/cassandra/build/classes/thrift [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] Compiling 296 source files to /tmp/cas/cassandra/build/classes/main [javac] StorageService.java:1343: illegal start of expression [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] StorageService.java:1343: ';' expected [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] StorageService.java:1343: ';' expected [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] StorageService.java:1343: illegal start of expression [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] StorageService.java:1343: illegal start of expression [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] StorageService.java:1343: ';' expected [javac] private Multimap<InetAddress, Range<Token>> getNewSourceRanges(String table, Set<Range<Token>> ranges) [javac] ^ [javac] 6 errors {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3483: -- Attachment: CASSANDRA-3483-trunk-refactored-v1.txt {{CASSANDRA\-3483\-trunk\-refactored\-v1.txt}} addresses the duplication between BootStrapper and RangeStreamer. Next patch will address rebuild/getworkmap duplication. Support bringing up a new datacenter to existing cluster without repair --- Key: CASSANDRA-3483 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.2 Reporter: Chris Goffinet Assignee: Peter Schuller Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, CASSANDRA-3483-trunk-noredesign.txt, CASSANDRA-3483-trunk-rebase2.txt, CASSANDRA-3483-trunk-refactored-v1.txt Was talking to Brandon in irc, and we ran into a case where we want to bring up a new DC to an existing cluster. He suggested from jbellis the way to do it currently was set strategy options of dc2:0, then add the nodes. After the nodes are up, change the RF of dc2, and run repair. I'd like to avoid a repair as it runs AES and is a bit more intense than how bootstrap works currently by just streaming ranges from the SSTables. Would it be possible to improve this functionality (adding a new DC to existing cluster) than the proposed method? We'd be happy to do a patch if we got some input on the best way to go about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3796) post-2392 trunk does not build with java 7
[ https://issues.apache.org/jira/browse/CASSANDRA-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3796: -- Attachment: CASSANDRA-3796-trunk-v1.txt Attaching patch. It would be great if someone from CASSANDRA-2392 could review and make sure I am not introducing a subtle bug by implementing the Comparator at the RowPosition level. post-2392 trunk does not build with java 7 -- Key: CASSANDRA-3796 URL: https://issues.apache.org/jira/browse/CASSANDRA-3796 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3796-trunk-v1.txt See below, on a fresh clone. Builds w/ java 6. {code} [javac] /tmp/c2/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableReader.java:419: error: no suitable method found for binarySearch(List<DecoratedKey>,RowPosition) [javac] int index = Collections.binarySearch(indexSummary.getKeys(), key); [javac] ^ [javac] method Collections.<T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) is not applicable [javac] (cannot instantiate from arguments because actual and formal argument lists differ in length) [javac] method Collections.<T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) is not applicable [javac] (no instance(s) of type variable(s) T#2 exist so that argument type List<DecoratedKey> conforms to formal parameter type List<? extends Comparable<? super T#2>>) [javac] where T#1,T#2 are type-variables: [javac] T#1 extends Object declared in method <T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) [javac] T#2 extends Object declared in method <T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) [javac] /tmp/c2/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableReader.java:509: error: no suitable method found for binarySearch(List<DecoratedKey>,RowPosition) [javac] int left = Collections.binarySearch(samples, leftPosition); [javac] ^ [javac] method Collections.<T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) is not applicable [javac] (cannot instantiate from arguments because actual and formal argument lists differ in length) [javac] method Collections.<T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) is not applicable [javac] (no instance(s) of type variable(s) T#2 exist so that argument type List<DecoratedKey> conforms to formal parameter type List<? extends Comparable<? super T#2>>) [javac] where T#1,T#2 are type-variables: [javac] T#1 extends Object declared in method <T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) [javac] T#2 extends Object declared in method <T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) [javac] /tmp/c2/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableReader.java:521: error: no suitable method found for binarySearch(List<DecoratedKey>,RowPosition) [javac] : Collections.binarySearch(samples, rightPosition); [javac] ^ [javac] method Collections.<T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) is not applicable [javac] (cannot instantiate from arguments because actual and formal argument lists differ in length) [javac] method Collections.<T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) is not applicable [javac] (no instance(s) of type variable(s) T#2 exist so that argument type List<DecoratedKey> conforms to formal parameter type List<? extends Comparable<? super T#2>>) [javac] where T#1,T#2 are type-variables: [javac] T#1 extends Object declared in method <T#1>binarySearch(List<? extends T#1>,T#1,Comparator<? super T#1>) [javac] T#2 extends Object declared in method <T#2>binarySearch(List<? extends Comparable<? super T#2>>,T#2) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
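The inference failure and the comparator-based fix can be reproduced with a self-contained analog (Position/Key/Bound are hypothetical stand-ins for RowPosition/DecoratedKey; whether this matches the attached patch exactly is what the requested review should confirm):

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class BinarySearchFix {
    // Stand-ins for the RowPosition hierarchy.
    static abstract class Position {
        final int value;
        Position(int value) { this.value = value; }
    }
    static class Key extends Position { Key(int v) { super(v); } }
    static class Bound extends Position { Bound(int v) { super(v); } }

    // Defining the Comparator once at the base-class level mirrors the
    // patch's approach of implementing the Comparator at the RowPosition level.
    static final Comparator<Position> COMPARATOR = new Comparator<Position>() {
        public int compare(Position a, Position b) {
            return a.value < b.value ? -1 : (a.value == b.value ? 0 : 1);
        }
    };

    // The two-argument Collections.binarySearch(keys, target) does not
    // compile here under javac 7's stricter inference (Key is not
    // Comparable<Bound>); the Comparator overload infers T = Position and
    // builds under both javac 6 and 7.
    static int indexOf(List<Key> keys, Bound target) {
        return Collections.binarySearch(keys, target, COMPARATOR);
    }
}
```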
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3671: -- Attachment: CASSANDRA-3671-v1.txt provide JMX counters for unavailables/timeouts for reads and writes --- Key: CASSANDRA-3671 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt Attaching patch against trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3671: -- Attachment: (was: CASSANDRA-3671-v1.txt) provide JMX counters for unavailables/timeouts for reads and writes --- Key: CASSANDRA-3671 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt Attaching patch against trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3671: -- Attachment: CASSANDRA-3671-trunk-coda-metrics-v1.txt (Marking not intended for inclusion because I don't want to create legal hassles due to the inclusion of the .jar file. I will resubmit when it's time to commit if needed.) provide JMX counters for unavailables/timeouts for reads and writes --- Key: CASSANDRA-3671 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3671-trunk-coda-metrics-v1.txt, CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt Attaching patch against trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3797) StorageProxy static initialization not triggered until thrift requests come in
[ https://issues.apache.org/jira/browse/CASSANDRA-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3797: -- Attachment: CASSANDRA-3797-trunk-v1.txt Attaching patch that has StorageService call a NOOP method on StorageProxy during start-up if not in client mode. This feels unclean to me, but short of a bigger change that properly avoids the subtle static initialization order problem, it was the easiest/cleanest fix I could think of. StorageProxy static initialization not triggered until thrift requests come in -- Key: CASSANDRA-3797 URL: https://issues.apache.org/jira/browse/CASSANDRA-3797 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3797-trunk-v1.txt While plugging in the metrics library for CASSANDRA-3671 I realized (because the metrics library was trying to add a shutdown hook on metric creation) that starting cassandra and simply shutting it down causes StorageProxy not to be initialized until the drain shutdown hook. Effects: * StorageProxy mbean missing in visualvm/jconsole after initial startup (seriously, I thought I was going nuts ;)) * And in general anything that makes assumptions about running early, or at least not during JVM shutdown, such as the metrics library, will be problematic -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
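The fix pattern described (a NOOP call whose only purpose is to trigger static initialization) relies on the JVM running a class's static initializer on first active use. A sketch of the idea, with class names that are simplified stand-ins rather than the actual Cassandra code:

```java
public class EagerInitSketch {
    // Stand-in for StorageProxy: heavy setup (mbean registration, metrics) lives in
    // the static initializer, which the JVM only runs on first active use of the class.
    static class ProxyLike {
        static volatile boolean initialized = false;
        static {
            initialized = true; // mbean/metrics registration would happen here
        }
        // Deliberately empty: merely calling it forces <clinit> to run.
        static void initIfNeeded() {}
    }

    public static void main(String[] args) {
        // Without this call, ProxyLike's static block would not run until the
        // first real request touches the class, possibly as late as JVM shutdown.
        ProxyLike.initIfNeeded();
        System.out.println(ProxyLike.initialized);
    }
}
```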
[jira] [Updated] (CASSANDRA-3773) make default schema migration wait time RING_DELAY
[ https://issues.apache.org/jira/browse/CASSANDRA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3773: -- Description: The default of 10 seconds feels strange to me since we have established that RING_DELAY is the appropriate wait time for gossip, which should include schema migrations. Patch makes it RING_DELAY by default. Alternate patch makes it 30 seconds; I don't know how much we care about not referencing non-CLI code from the CLI. was: The default of 10 seconds feels strange to me since we have established that RING_DELAY is the appropriate wait time for gossip, which should include schema migrations. Patch makes it RING_DELAY by default. Alternate patch makes it 120 seconds; I don't know how much we care about not referencing non-CLI code from the CLI. make default schema migration wait time RING_DELAY -- Key: CASSANDRA-3773 URL: https://issues.apache.org/jira/browse/CASSANDRA-3773 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3773-30s.txt, CASSANDRA-3773-RING_DELAY.txt The default of 10 seconds feels strange to me since we have established that RING_DELAY is the appropriate wait time for gossip, which should include schema migrations. Patch makes it RING_DELAY by default. Alternate patch makes it 30 seconds; I don't know how much we care about not referencing non-CLI code from the CLI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3773) make default schema migration wait time RING_DELAY
[ https://issues.apache.org/jira/browse/CASSANDRA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3773: -- Attachment: CASSANDRA-3773-30s.txt CASSANDRA-3773-RING_DELAY.txt make default schema migration wait time RING_DELAY -- Key: CASSANDRA-3773 URL: https://issues.apache.org/jira/browse/CASSANDRA-3773 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3773-30s.txt, CASSANDRA-3773-RING_DELAY.txt The default of 10 seconds feels strange to me since we have established that RING_DELAY is the appropriate wait time for gossip, which should include schema migrations. Patch makes it RING_DELAY by default. Alternate patch makes it 30 seconds; I don't know how much we care about not referencing non-CLI code from the CLI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3773) make default schema migration wait time RING_DELAY
[ https://issues.apache.org/jira/browse/CASSANDRA-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3773: -- Attachment: CASSANDRA-3773-30s-v2.txt v2 of 30s version removes spurious import. make default schema migration wait time RING_DELAY -- Key: CASSANDRA-3773 URL: https://issues.apache.org/jira/browse/CASSANDRA-3773 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3773-30s-v2.txt, CASSANDRA-3773-30s.txt, CASSANDRA-3773-RING_DELAY.txt The default of 10 seconds feels strange to me since we have established that RING_DELAY is the appropriate wait time for gossip, which should include schema migrations. Patch makes it RING_DELAY by default. Alternate patch makes it 30 seconds; I don't know how much we care about not referencing non-CLI code from the CLI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3729) support counter debug mode on thrift interface
[ https://issues.apache.org/jira/browse/CASSANDRA-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3729: -- Attachment: trunk-3729.txt support counter debug mode on thrift interface -- Key: CASSANDRA-3729 URL: https://issues.apache.org/jira/browse/CASSANDRA-3729 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: trunk-3729.txt Attaching a patch against trunk to add a counter debug mode on the thrift interface, allowing clients to decode and inspect counter contexts. This is all Stu's code, except that I generated the thrift stuff so any mistakes there are mine. This was extremely useful internally on an 0.8. The patch is not yet tested on trunk, but if you think this can go in I will spend effort to test it soonish. It's not very invasive (other than the generated thrift code), so it feels okay to have it if we maybe document that it is not a supported interface (clearly in the thrift spec). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3674) add nodetool explicitgc
[ https://issues.apache.org/jira/browse/CASSANDRA-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3674: -- Attachment: CASSANDRA-3674-trunk.txt add nodetool explicitgc --- Key: CASSANDRA-3674 URL: https://issues.apache.org/jira/browse/CASSANDRA-3674 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3674-trunk.txt So that you can easily ask people to run nodetool explicitgc and paste the results. I'll file a separate JIRA suggesting that we ship with -XX:+ExplicitGCInvokesConcurrent by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3675) ship with -XX:+ExplicitGCInvokesConcurrent by default
[ https://issues.apache.org/jira/browse/CASSANDRA-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3675: -- Attachment: CASSANDRA-3675-trunk.txt ship with -XX:+ExplicitGCInvokesConcurrent by default - Key: CASSANDRA-3675 URL: https://issues.apache.org/jira/browse/CASSANDRA-3675 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3675-trunk.txt It's so much easier if you can safely tell people to trigger a full GC to discover their live set (see CASSANDRA-3574), instead of explaining the behavior of CMS and what the memory usage graph looks like etc etc. Shipping with {{-XX:+ExplicitGCInvokesConcurrent}} means this is by default safe. For people that have special needs like some kind of rolling compacting GC with disablegossip, they are special enough that they can just change the VM options. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3641) inconsistent/corrupt counters w/ broken shards never converge
[ https://issues.apache.org/jira/browse/CASSANDRA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3641: -- Attachment: CASSANDRA-3641-trunk-nojmx.txt New version attached. Rebased to current trunk, and no JMX. Otherwise identical. inconsistent/corrupt counters w/ broken shards never converge - Key: CASSANDRA-3641 URL: https://issues.apache.org/jira/browse/CASSANDRA-3641 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Attachments: 3641-0.8-internal-not-for-inclusion.txt, 3641-trunk.txt, CASSANDRA-3641-trunk-nojmx.txt We ran into a case (which MIGHT be related to CASSANDRA-3070) whereby we had counters that were corrupt (hopefully due to CASSANDRA-3178). The corruption was that there would exist shards with the *same* node_id, *same* clock id, but *different* counts. The counter column diffing and reconciliation code assumes that this never happens, and ignores the count. The problem with this is that if there is an inconsistency, the result of a reconciliation will depend on the order of the shards. In our case for example, we would see the value of the counter randomly fluctuating on a CL.ALL read, but we would get a consistent value (whatever the node had) on CL.ONE (submitted to one of the nodes in the replica set for the key). In addition, read repair would not work despite digest mismatches because the diffing algorithm also did not care about the counts when determining the differences to send. I'm attaching patches that fix this. The first patch is against our 0.8 branch, which is not terribly useful to people, but I include it because it is the well-tested version that we have used on the production cluster which was subject to this corruption. The other patch is against trunk, and contains the same change. What the patch does is: * On diffing, treat as DISJOINT if there is a count discrepancy. 
* On reconciliation, look at the count and *deterministically* pick the higher one, and: ** log the fact that we detected a corrupt counter ** increment a JMX observable counter for monitoring purposes A cluster which is subject to such corruption and has this patch will fix itself with an AES + compact (or just repeated compactions, assuming replicate-on-compact is able to deliver correctly). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
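The reconciliation rule in the patch description — detect the count discrepancy, then deterministically keep the higher count — can be sketched as follows. This is a simplified model of a single shard pair, not the actual CounterContext code:

```java
public class ShardReconcileSketch {
    // Simplified shard pair with a shared node_id. Corruption here means the two
    // shards also share a clock but disagree on count.
    static long reconcileCounts(long clockA, long countA, long clockB, long countB) {
        if (clockA != clockB)
            return clockA > clockB ? countA : countB; // normal rule: higher clock wins
        if (countA != countB) {
            // Corrupt pair: same clock, different counts. Keeping the higher count is
            // arbitrary but *deterministic*, so every replica converges on the same
            // value regardless of shard order (the patch also logs and bumps a JMX
            // counter at this point).
            return Math.max(countA, countB);
        }
        return countA;
    }

    public static void main(String[] args) {
        // Same clock (7), conflicting counts: result is 2 in either argument order.
        System.out.println(reconcileCounts(7, 1, 7, 2));
        System.out.println(reconcileCounts(7, 2, 7, 1));
    }
}
```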
[jira] [Updated] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3670: -- Reviewer: slebresne provide red flags JMX instrumentation --- Key: CASSANDRA-3670 URL: https://issues.apache.org/jira/browse/CASSANDRA-3670 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor As discussed in CASSANDRA-3641, it would be nice to expose through JMX certain information which is almost without exception indicative of something being wrong with the node or cluster. In the CASSANDRA-3641 case, it was the detection of corrupt counter shards. Other examples include: * Number of times the selection of files to compact was adjusted due to disk space heuristics * Number of times compaction has failed * Any I/O error reading from or writing to disk (the work here is collecting, not exposing, so maybe not in an initial version) * Any data skipped due to checksum mismatches (when checksumming is being used); e.g., number of skips. * Any arbitrary exception at least in certain code paths (compaction, scrub, cleanup for starters) Probably other things. The motivation is that if we have clear and obvious indications that something truly is wrong, it seems suboptimal to just leave that information in the log somewhere, for someone to discover later when something else broke as a result and a human investigates. You might argue that one should use non-trivial log analysis to detect these things, but I highly doubt a lot of people do this and it seems very wasteful to require that in comparison to just providing the MBean. It is important to note that the *lack* of a certain problem being advertised in this MBean is not supposed to be indicative of a *lack* of a problem. 
Rather, the point is that to the extent we can easily do so, it is nice to have a clear method of communicating to monitoring systems where there *is* a clear indication of something being wrong. The main part of this ticket is not to cover everything under the sun, but rather to reach agreement on adding an MBean where these types of indicators can be collected. Individual counters can then be added over time as one thinks of them. I propose: * Create an org.apache.cassandra.db.RedFlags MBean * Populate with a few things to begin with. I'll submit the patch if there is agreement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
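The proposed MBean could take roughly the following shape. A hedged sketch only: the interface name follows the ticket's org.apache.cassandra.db.RedFlags proposal, but the individual counters shown here are illustrative, since the ticket deliberately leaves the exact set open:

```java
import java.util.concurrent.atomic.AtomicLong;

public class RedFlagsSketch {
    // Hypothetical shape of the proposed RedFlags MBean: read-only monotonic counters.
    public interface RedFlagsMBean {
        long getCorruptCounterShards();
        long getCompactionFailures();
    }

    public static class RedFlags implements RedFlagsMBean {
        final AtomicLong corruptCounterShards = new AtomicLong();
        final AtomicLong compactionFailures = new AtomicLong();
        public long getCorruptCounterShards() { return corruptCounterShards.get(); }
        public long getCompactionFailures() { return compactionFailures.get(); }
        // Code paths that detect a red-flag condition call these and carry on;
        // the MBean only records, it never changes behavior.
        public void markCorruptCounterShard() { corruptCounterShards.incrementAndGet(); }
        public void markCompactionFailure() { compactionFailures.incrementAndGet(); }
    }

    public static void main(String[] args) {
        RedFlags flags = new RedFlags();
        flags.markCorruptCounterShard();
        System.out.println(flags.getCorruptCounterShards());
    }
}
```

In the real thing the RedFlags instance would additionally be registered with the platform MBeanServer under a name like org.apache.cassandra.db:type=RedFlags so monitoring systems can poll it.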
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3483: -- Attachment: CASSANDRA-3483-trunk-noredesign.txt Attaching version rebased to trunk but not yet re-factored. Support bringing up a new datacenter to existing cluster without repair --- Key: CASSANDRA-3483 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.2 Reporter: Chris Goffinet Assignee: Peter Schuller Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, CASSANDRA-3483-trunk-noredesign.txt Was talking to Brandon in irc, and we ran into a case where we want to bring up a new DC in an existing cluster. He suggested that per jbellis the way to do it currently is to set strategy options of dc2:0, then add the nodes. After the nodes are up, change the RF of dc2 and run repair. I'd like to avoid a repair, as it runs AES and is a bit more intense than how bootstrap currently works by just streaming ranges from the SSTables. Would it be possible to improve this functionality (adding a new DC to an existing cluster) beyond the proposed method? We'd be happy to do a patch if we got some input on the best way to go about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3671: -- Attachment: CASSANDRA-3671-trunk.txt provide JMX counters for unavailables/timeouts for reads and writes --- Key: CASSANDRA-3671 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3671-trunk.txt Attaching patch against trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3671: -- Attachment: CASSANDRA-3671-trunk-v2.txt Accidentally attached old version of patch. v2 attached which doesn't fail to re-throw in one case. provide JMX counters for unavailables/timeouts for reads and writes --- Key: CASSANDRA-3671 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt Attaching patch against trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
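The v2 fix ("doesn't fail to re-throw in one case") reflects the standard pattern for this kind of instrumentation: count the failure, then propagate the exception unchanged. A hedged sketch — exception type and counter name here are illustrative, not the patch's actual identifiers:

```java
import java.util.concurrent.atomic.AtomicLong;

public class RequestCountersSketch {
    static class UnavailableException extends Exception {}

    // JMX-observable counter (a real implementation would expose this via an MBean).
    static final AtomicLong writeUnavailables = new AtomicLong();

    static void write(boolean enoughReplicasUp) throws UnavailableException {
        try {
            if (!enoughReplicasUp)
                throw new UnavailableException();
            // ... perform the write ...
        } catch (UnavailableException e) {
            writeUnavailables.incrementAndGet();
            throw e; // crucial: count the failure, but do NOT swallow it
        }
    }

    public static void main(String[] args) {
        try {
            write(false);
        } catch (UnavailableException e) {
            // caller still sees the failure; the counter recorded it as a side effect
        }
        System.out.println(writeUnavailables.get());
    }
}
```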
[jira] [Updated] (CASSANDRA-3652) correct and improve stream protocol mismatch error
[ https://issues.apache.org/jira/browse/CASSANDRA-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3652: -- Attachment: CASSANDRA-3652-1.0.txt Patch attached. correct and improve stream protocol mismatch error -- Key: CASSANDRA-3652 URL: https://issues.apache.org/jira/browse/CASSANDRA-3652 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3652-1.0.txt The message (and code comment) claims it got a newer version despite the fact that the check only determines that it is non-equal. Fix that, and also print the actual version gotten and expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3571: -- Attachment: CASSANDRA-3571-1.0-rebased-v2.txt I completely agree; attached. make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0-rebased-v2.txt, CASSANDRA-3571-1.0-rebased.txt, CASSANDRA-3571-1.0.txt Attaching patch that does this, against 1.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3644) parsing of chunk_length_kb silently overflows
[ https://issues.apache.org/jira/browse/CASSANDRA-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3644: -- Attachment: CASSANDRA-3644-1.0-v2.txt My apologies. I must have accidentally taken the wrong branch. v2 attached, against 1.0. parsing of chunk_length_kb silently overflows - Key: CASSANDRA-3644 URL: https://issues.apache.org/jira/browse/CASSANDRA-3644 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3644-1.0-v2.txt, CASSANDRA-3644-1.0.txt Not likely to trigger for real values; I noticed because some other bug caused the chunk length setting to be corrupted somehow and take on some huge value having nothing to do with what I asked for in my schema update (not yet identified why; separate issue). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3646) document reference to CASSANDRA-1938 in source code
[ https://issues.apache.org/jira/browse/CASSANDRA-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3646: -- Attachment: CASSANDRA-3646-trunk.txt document reference to CASSANDRA-1938 in source code --- Key: CASSANDRA-3646 URL: https://issues.apache.org/jira/browse/CASSANDRA-3646 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3646-trunk.txt I spent a huge amount of time trying to understand what a delta really was, until I finally found the explanation in CASSANDRA-1938. It would have saved me a lot of time debugging if this would have been pointed to, so hence I suggest the attached patch for future persons in the same position. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3644) parsing of chunk_length_kb silently overflows
[ https://issues.apache.org/jira/browse/CASSANDRA-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3644: -- Attachment: CASSANDRA-3644-1.0.txt parsing of chunk_length_kb silently overflows - Key: CASSANDRA-3644 URL: https://issues.apache.org/jira/browse/CASSANDRA-3644 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3644-1.0.txt Not likely to trigger for real values; I noticed because some other bug caused the chunk length setting to be corrupted somehow and take on some huge value having nothing to do with what I asked for in my schema update (not yet identified why; separate issue). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
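The silent overflow described is the usual KB-to-bytes conversion done in int arithmetic, where a huge configured value wraps around instead of being rejected. A hedged sketch of an overflow-safe parse; the method name and error message are illustrative, not the actual Cassandra config code:

```java
public class ChunkLengthSketch {
    // Parse chunk_length_kb and convert to bytes, rejecting values that would
    // overflow int instead of silently wrapping to a bogus chunk length.
    static int chunkLengthBytes(String kbValue) {
        long kb = Long.parseLong(kbValue);
        // Bounds check *before* multiplying, so the comparison itself cannot overflow.
        if (kb <= 0 || kb > Integer.MAX_VALUE / 1024L)
            throw new IllegalArgumentException("chunk_length_kb out of range: " + kbValue);
        return (int) (kb * 1024L);
    }

    public static void main(String[] args) {
        System.out.println(chunkLengthBytes("64")); // 64 KB -> 65536 bytes
    }
}
```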
[jira] [Updated] (CASSANDRA-3603) CounterColumn and CounterContext use a log4j logger instead of using slf4j like the rest of the code base
[ https://issues.apache.org/jira/browse/CASSANDRA-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3603: -- Attachment: CASSANDRA-3603-trunk.txt Trivial patch applied. CounterColumn and CounterContext use a log4j logger instead of using slf4j like the rest of the code base - Key: CASSANDRA-3603 URL: https://issues.apache.org/jira/browse/CASSANDRA-3603 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3603-trunk.txt (Will submit patch but not now, no time.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3571: -- Attachment: CASSANDRA-3571-1.0-rebased.txt Sure, rebased for 1.0. I expect it will apply easily on trunk as well; just let me know if you need me to do anything there. make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0-rebased.txt, CASSANDRA-3571-1.0.txt Attaching patch that does this, against 1.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3641) inconsistent/corrupt counters w/ broken shards never converge
[ https://issues.apache.org/jira/browse/CASSANDRA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3641: -- Attachment: 3641-0.8-internal-not-for-inclusion.txt 3641-trunk.txt inconsistent/corrupt counters w/ broken shards never converge - Key: CASSANDRA-3641 URL: https://issues.apache.org/jira/browse/CASSANDRA-3641 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Attachments: 3641-0.8-internal-not-for-inclusion.txt, 3641-trunk.txt We ran into a case (which MIGHT be related to CASSANDRA-3070) whereby we had counters that were corrupt (hopefully due to CASSANDRA-3178). The corruption was that there would exist shards with the *same* node_id, *same* clock id, but *different* counts. The counter column diffing and reconciliation code assumes that this never happens, and ignores the count. The problem with this is that if there is an inconsistency, the result of a reconciliation will depend on the order of the shards. In our case for example, we would see the value of the counter randomly fluctuating on a CL.ALL read, but we would get a consistent value (whatever the node had) on CL.ONE (submitted to one of the nodes in the replica set for the key). In addition, read repair would not work despite digest mismatches because the diffing algorithm also did not care about the counts when determining the differences to send. I'm attaching patches that fix this. The first patch is against our 0.8 branch, which is not terribly useful to people, but I include it because it is the well-tested version that we have used on the production cluster which was subject to this corruption. The other patch is against trunk, and contains the same change. What the patch does is: * On diffing, treat as DISJOINT if there is a count discrepancy. 
* On reconciliation, look at the count and *deterministically* pick the higher one, and: ** log the fact that we detected a corrupt counter ** increment a JMX observable counter for monitoring purposes A cluster which is subject to such corruption and has this patch will fix itself with an AES + compact (or just repeated compactions, assuming replicate-on-compact is able to deliver correctly). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3070) counter repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3070: -- Comment: was deleted (was: This may be relevant, quoting myself from IRC: {code} 21:20:01 scode pcmanus: Hey, are you there? 21:20:21 scode pcmanus: I am investigating something which might be https://issues.apache.org/jira/browse/CASSANDRA-3070 21:20:37 scode pcmanus: And I could use the help of someone with his brain all over counters, and Stu isn't here atm. :) 21:21:16 scode pcmanus: https://gist.github.com/8202cb46c8bd00c8391b 21:21:37 scode pcmanus: I am investigating why with CL.ALL and CL.QUORUM, I get seemingly random/varying results when I read a counter. 21:21:53 scode pcmanus: I have the offending sstables on a three-node test setup and am inserting debug printouts in the code to trace the reconiliation. 21:21:57 scode pcmanus: The gist above shows what's happening. 21:22:11 scode pcmanus: The latter is the wrong one, and the former is the correct one. 21:22:28 scode pcmanus: The interesting bit is that I see shards with the same node_id *AND* clock, but *DIFFERENT* counts. 21:22:53 scode pcmanus: My understanding of counters is that there should never (globally across an entire cluster in all sstables) exist two shards for the same node_id+clock but with different counts. 21:22:57 scode pcmanus: Is my understanding correct there? 21:25:10 scode pcmanus: There is one node out of the three that has the offending card (with a count of 2 instead of 1). Like with 3070, we observed this after having expanded a cluster (though I'm not sure how that would cause it, and we don't know if there existed a problem before the expansion). 
{code} ) counter repair -- Key: CASSANDRA-3070 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.4 Reporter: ivan Assignee: Sylvain Lebresne Attachments: counter_local_quroum_maybeschedulerepairs.txt, counter_local_quroum_maybeschedulerepairs_2.txt, counter_local_quroum_maybeschedulerepairs_3.txt Hi! We have some counters out of sync, but repair doesn't sync their values. We tried nodetool repair. We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes while reading a bad row, but the counters weren't repaired by the mutation. Output from two nodes was uploaded. (Some new debug messages were added.)
[jira] [Updated] (CASSANDRA-3571) make stream throttling configurable at runtime with nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3571: -- Attachment: CASSANDRA-3571-1.0.txt make stream throttling configurable at runtime with nodetool Key: CASSANDRA-3571 URL: https://issues.apache.org/jira/browse/CASSANDRA-3571 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3571-1.0.txt Attaching a patch that does this, against 1.0.
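Conceptually the change amounts to exposing the throttle as a writable attribute that nodetool can set over JMX at runtime. A minimal sketch of that shape, with hypothetical names rather than the patch's actual MBean:

```java
// Hypothetical MBean shape; the real patch wires the throttle through
// Cassandra's existing JMX plumbing and a new nodetool command.
interface StreamThrottleMBean {
    int getStreamThroughputMbPerSec();
    void setStreamThroughputMbPerSec(int mbPerSec);
}

public class StreamThrottle implements StreamThrottleMBean {
    // volatile so the streaming threads see updates without locking
    private volatile int throughputMbPerSec;

    public StreamThrottle(int initialMbPerSec) {
        this.throughputMbPerSec = initialMbPerSec;
    }

    // Read by the streaming code on each chunk, so changes apply immediately.
    public int getStreamThroughputMbPerSec() { return throughputMbPerSec; }

    // Called by nodetool over JMX; 0 could mean "unthrottled".
    public void setStreamThroughputMbPerSec(int mbPerSec) {
        this.throughputMbPerSec = mbPerSec;
    }
}
```

The point of making it an attribute rather than a yaml-only setting is that an operator can loosen or tighten the throttle mid-stream without restarting the node.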
[jira] [Updated] (CASSANDRA-3572) make NodeCommand enum sorted to avoid constant merge conflicts
[ https://issues.apache.org/jira/browse/CASSANDRA-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3572: -- Attachment: CASSANDRA-3572-1.0.txt make NodeCommand enum sorted to avoid constant merge conflicts -- Key: CASSANDRA-3572 URL: https://issues.apache.org/jira/browse/CASSANDRA-3572 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3572-1.0.txt Attaching patch. This is just for developer convenience; the NodeCmd enum is constantly causing merge conflicts. There will be some initial pain in the transition, but once everyone is done working on branches where it is unsorted, we should achieve merge nirvana.
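The idea is simply that an alphabetized enum gives every new command exactly one valid insertion point, so concurrent branches rarely touch the same line. A toy illustration with a small subset of the era's nodetool commands (not the full enum from the patch):

```java
public enum NodeCommand {
    // Kept in alphabetical order: a new command (say, DISABLEGOSSIP) has
    // exactly one correct slot, so two branches adding different commands
    // usually edit different lines and merge cleanly.
    CFSTATS, CLEANUP, COMPACT, DECOMMISSION, DRAIN, INFO, MOVE, REPAIR, RING
}
```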
[jira] [Updated] (CASSANDRA-3494) Streaming is mono-threaded (the bulk loader too by extension)
[ https://issues.apache.org/jira/browse/CASSANDRA-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3494: -- Attachment: CASSANDRA-3494-1.0.txt Attaching a version rebased for 1.0 (and tested). Streaming is mono-threaded (the bulk loader too by extension) - Key: CASSANDRA-3494 URL: https://issues.apache.org/jira/browse/CASSANDRA-3494 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3494-0.8-prelim.txt, CASSANDRA-3494-1.0.txt The streamExecutor is defined as: {noformat} streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", Thread.MIN_PRIORITY); {noformat} In the meantime, in DebuggableThreadPoolExecutor.java: {noformat} public DebuggableThreadPoolExecutor(String threadPoolName, int priority) { this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), new NamedThreadFactory(threadPoolName, priority)); } {noformat} In other words, since the core pool size is 1 and the queue unbounded, tasks will always be queued and the executor is essentially mono-threaded. This is clearly not necessary, since we already have stream throttling nowadays, and it could be a limiting factor in the case of the bulk loader. Besides, I would venture that this was perhaps not the intention: setting the maximum pool size to MAX_VALUE suggests the intent was to spawn threads on demand.
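The mono-threading is easy to demonstrate with a plain ThreadPoolExecutor built the same way (core size 1, unbounded queue): ThreadPoolExecutor only spawns threads beyond the core size when the queue rejects an offer, and an unbounded LinkedBlockingQueue never rejects, so the maximum pool size is irrelevant. A self-contained sketch, not Cassandra code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class MonoThreadedDemo {
    // Same shape as the streamExecutor above: core 1, max MAX_VALUE, unbounded queue.
    static int poolSizeAfterSubmits() throws InterruptedException {
        ThreadPoolExecutor e = new ThreadPoolExecutor(
                1, Integer.MAX_VALUE, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        final CountDownLatch gate = new CountDownLatch(1);
        // First task occupies the single core thread indefinitely...
        e.submit(new Runnable() { public void run() {
            try { gate.await(); } catch (InterruptedException ignored) {}
        }});
        // ...and 100 more submissions all just pile up in the queue.
        for (int i = 0; i < 100; i++)
            e.submit(new Runnable() { public void run() {} });
        int size = e.getPoolSize(); // stays at 1 despite 101 pending tasks
        gate.countDown();
        e.shutdown();
        return size;
    }
}
```

This is exactly why, as the description says, the `Integer.MAX_VALUE` maximum never takes effect.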
[jira] [Updated] (CASSANDRA-3570) barrier-of-entry: make ./bin/cassandra -f work out of the box by changing default cassandra.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3570: -- Attachment: CASSANDRA_3570-dbpath.txt barrier-of-entry: make ./bin/cassandra -f work out of the box by changing default cassandra.yaml Key: CASSANDRA-3570 URL: https://issues.apache.org/jira/browse/CASSANDRA-3570 Project: Cassandra Issue Type: Improvement Reporter: Peter Schuller Assignee: Peter Schuller Priority: Trivial Attachments: CASSANDRA_3570-dbpath.txt This is probably going to be controversial. But how about the attached simple patch to just have ./db exist, and then have Cassandra configured to use that by default? This makes it a lot easier for people to just run Cassandra out of the working copy, whether you are a developer or a user who wants to apply a patch while being assisted by a Cassandra developer. A real deployment with packaging should properly override these paths anyway, and the default /var/lib stuff is pretty useless. Even if you are root on the machine, it is much cleaner to just run self-contained. Yes, I am aware that you can override the configuration, but honestly, that's just painful, especially when switching between various versions of Cassandra.
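The change itself is tiny. Assuming the stock cassandra.yaml keys of that era, the proposed defaults would look roughly like this (a hypothetical sketch, not the attached patch verbatim; relative paths resolve against the working copy):

```yaml
# Keep everything under ./db in the working copy instead of /var/lib/cassandra,
# so `./bin/cassandra -f` works out of the box. Packaged deployments are
# expected to override these paths anyway.
data_file_directories:
    - db/data
commitlog_directory: db/commitlog
saved_caches_directory: db/saved_caches
```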
[jira] [Updated] (CASSANDRA-3494) Streaming is mono-threaded (the bulk loader too by extension)
[ https://issues.apache.org/jira/browse/CASSANDRA-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3494: -- Attachment: CASSANDRA-3494-0.8-prelim.txt Attaching a preliminary diff against our internal 0.8 for cursory examination. I want to submit another one for inclusion in 1.0 later (it's blocking on some 1.0 work that I have to do first). The idea is this: we now have one executor per destination host. In order to avoid complex synchronization we never bother removing an executor once created; this is fine because we make sure threads time out, so for destinations that are not being streamed to the cost is just the executor instance, not a thread. We also make the tracking of active streams for throttling purposes explicit, to avoid iterating over the O(n) map. Streaming is mono-threaded (the bulk loader too by extension) - Key: CASSANDRA-3494 URL: https://issues.apache.org/jira/browse/CASSANDRA-3494 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.0 Reporter: Sylvain Lebresne Assignee: Peter Schuller Priority: Minor Attachments: CASSANDRA-3494-0.8-prelim.txt The streamExecutor is defined as: {noformat} streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", Thread.MIN_PRIORITY); {noformat} In the meantime, in DebuggableThreadPoolExecutor.java: {noformat} public DebuggableThreadPoolExecutor(String threadPoolName, int priority) { this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), new NamedThreadFactory(threadPoolName, priority)); } {noformat} In other words, since the core pool size is 1 and the queue unbounded, tasks will always be queued and the executor is essentially mono-threaded. This is clearly not necessary, since we already have stream throttling nowadays, and it could be a limiting factor in the case of the bulk loader.
Besides, I would venture that this was perhaps not the intention: setting the maximum pool size to MAX_VALUE suggests the intent was to spawn threads on demand.
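The per-destination scheme described above can be sketched like this. This is hypothetical stand-alone code, not the actual patch; the key point is allowCoreThreadTimeOut, which makes an idle destination cost only the executor object rather than a live thread:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PerHostStreamExecutors {
    // Executors are created on demand and never removed, which sidesteps the
    // synchronization needed to remove them safely.
    private final ConcurrentMap<String, ThreadPoolExecutor> byHost =
            new ConcurrentHashMap<String, ThreadPoolExecutor>();

    ThreadPoolExecutor executorFor(String host) {
        ThreadPoolExecutor existing = byHost.get(host);
        if (existing != null)
            return existing;
        ThreadPoolExecutor fresh = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        fresh.allowCoreThreadTimeOut(true); // idle thread dies after 60s
        ThreadPoolExecutor raced = byHost.putIfAbsent(host, fresh);
        if (raced != null) {
            fresh.shutdown(); // lost the race: discard ours, use the winner's
            return raced;
        }
        return fresh;
    }
}
```

With one single-threaded executor per host, streams to different destinations proceed in parallel while streams to the same destination stay serialized.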
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-3483: -- Attachment: CASSANDRA-3483-0.8-prelim.txt Here is a patch rebased against 0.8 for cursory review. I do not expect this to go into 0.8, and in fact I have not tested this patch other than building it against vanilla 0.8 (the original patch is tested, but against our internal 0.8). If there are no concerns with the overall implementation, I'll submit a rebased version for 1.0/trunk. There are two components to the change: * Breaking the streaming part of BootStrapper out into a separate RangeStreamer, and changing BootStrapper to use it. * Implementing the rebuild command on top of RangeStreamer. There are two ways to invoke rebuild: {code} nodetool rebuild nodetool rebuild nameofdc {code} The first form streams from the nearest endpoints, while the second streams from the nearest endpoints in the specified data center. Support bringing up a new datacenter to existing cluster without repair --- Key: CASSANDRA-3483 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.2 Reporter: Chris Goffinet Assignee: Peter Schuller Attachments: CASSANDRA-3483-0.8-prelim.txt Was talking to Brandon in IRC, and we ran into a case where we want to bring up a new DC in an existing cluster. He relayed from jbellis that the way to do it currently is to set strategy options of dc2:0, then add the nodes. After the nodes are up, change the RF of dc2 and run repair. I'd like to avoid a repair, as it runs AES and is a bit more intense than how bootstrap currently works by just streaming ranges from the SSTables. Would it be possible to improve this functionality (adding a new DC to an existing cluster) beyond the proposed method? We'd be happy to do a patch if we got some input on the best way to go about it.
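The two rebuild forms differ only in whether a data-center filter is applied to the proximity-sorted candidate endpoints. A toy sketch of that selection logic; the names are assumptions for illustration, not RangeStreamer's real API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RebuildSources {
    /**
     * Pick stream sources from endpoints already sorted by proximity,
     * optionally restricted to a single data center. A null filter
     * corresponds to plain "nodetool rebuild"; a DC name corresponds to
     * "nodetool rebuild nameofdc".
     */
    static List<String> pickSources(List<String> endpointsByProximity,
                                    Map<String, String> endpointToDc,
                                    String dcOrNull) {
        List<String> sources = new ArrayList<String>();
        for (String ep : endpointsByProximity)
            if (dcOrNull == null || dcOrNull.equals(endpointToDc.get(ep)))
                sources.add(ep); // proximity order is preserved
        return sources;
    }
}
```

Either way the new node streams ranges directly from the chosen sources, which is what lets this replace the heavier repair-based procedure.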