Git Push Summary
Updated Tags: refs/tags/1.2.13-tentative [created] 4be9e6720
Git Push Summary
Updated Tags: refs/tags/1.2.13-tentative [deleted] 6ab82a469
[jira] [Commented] (CASSANDRA-6151) CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated
[ https://issues.apache.org/jira/browse/CASSANDRA-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848964#comment-13848964 ] Akshay DM commented on CASSANDRA-6151: -- @Shridhar The patch seems to be working for 1.2.12 too. Thanks a lot...

CqlPagingRecorderReader Used when Partition Key Is Explicitly Stated

Key: CASSANDRA-6151
URL: https://issues.apache.org/jira/browse/CASSANDRA-6151
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Reporter: Russell Alexander Spitzer
Assignee: Alex Liu
Priority: Minor
Attachments: 6151-1.2-branch.txt, 6151-v2-1.2-branch.txt, 6151-v3-1.2-branch.txt, 6151-v4-1.2.10-branch.txt

From http://stackoverflow.com/questions/19189649/composite-key-in-cassandra-with-pig/19211546#19211546 The user was attempting to load a single partition using a where clause in a Pig load statement.

CQL Table
{code}
CREATE table data (
  occurday text,
  seqnumber int,
  occurtimems bigint,
  unique bigint,
  fields map<text, text>,
  primary key ((occurday, seqnumber), occurtimems, unique)
)
{code}

Pig Load statement Query
{code}
data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage();
{code}

This results in an exception when processed by the CqlPagingRecordReader, which attempts to page this query even though it specifies at most one partition. This leads to an invalid CQL statement.

CqlPagingRecordReader Query
{code}
SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND token(occurday,seqnumber) <= ? AND occurday='A Great Day' AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
{code}

Exception
{code}
InvalidRequestException(why:occurday cannot be restricted by more than one relation if it includes an Equal)
{code}

I'm not sure it is worth the special case, but a modification to not use the paging record reader when the entire partition key is specified would solve this issue.

h3. Solution

If there are EQUAL clauses for all the partitioning keys, we use the Query
{code}
SELECT * FROM data WHERE occurday='A Great Day' AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
{code}
instead of
{code}
SELECT * FROM data WHERE token(occurday,seqnumber) > ? AND token(occurday,seqnumber) <= ? AND occurday='A Great Day' AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
{code}

The baseline implementation retrieves all data from all rows around the ring. This new feature retrieves all data from a single wide row, one level below the baseline. It helps the use case where the user is only interested in a specific wide row, so the job doesn't spend its whole run retrieving every row around the ring. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
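The proposed check — fall back to a plain single-partition query only when every partition-key column is restricted by equality — can be sketched roughly as follows. This is an illustrative sketch, not the attached patch: the class and method names and the naive regex-based parsing of the `where_clause` string are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the real logic lives in Cassandra's Hadoop input
// format code; coversPartitionKey and the regex are invented here.
public class PartitionKeyCheck {
    // Extracts the column names restricted by '=' in a simple where_clause.
    static Set<String> equalityColumns(String whereClause) {
        Set<String> cols = new TreeSet<>();
        Matcher m = Pattern.compile("(\\w+)\\s*=").matcher(whereClause);
        while (m.find())
            cols.add(m.group(1).toLowerCase());
        return cols;
    }

    // True when every partition-key column has an equality restriction,
    // i.e. the query targets one partition and token paging is pointless.
    static boolean coversPartitionKey(String whereClause, List<String> partitionKey) {
        return equalityColumns(whereClause).containsAll(partitionKey);
    }

    public static void main(String[] args) {
        List<String> pk = Arrays.asList("occurday", "seqnumber");
        System.out.println(coversPartitionKey("seqnumber=10 AND occurday='2013-10-01'", pk)); // true
        System.out.println(coversPartitionKey("seqnumber=10", pk)); // false
    }
}
```

A real implementation would of course inspect the parsed schema and clause terms rather than regex over the raw string; the sketch only shows the decision being made.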
git commit: Slightly improved message when parsing properties for DDL queries
Updated Branches: refs/heads/cassandra-1.2 4be9e6720 -> 54a1955d2

Slightly improved message when parsing properties for DDL queries

patch by boneill42; reviewed by slebresne for CASSANDRA-6453

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/54a1955d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/54a1955d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/54a1955d

Branch: refs/heads/cassandra-1.2
Commit: 54a1955d254bfc89e48389d5d0d94c79d027d470
Parents: 4be9e67
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon Dec 16 10:53:22 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Dec 16 10:53:22 2013 +0100

 CHANGES.txt                              |  3 +++
 src/java/org/apache/cassandra/cql3/Cql.g | 10 ++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54a1955d/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index b55393b..4816d70 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,3 +1,6 @@
+1.2.14
+ * Improved error message on bad properties in DDL queries (CASSANDRA-6453)
+
 1.2.13
  * Randomize batchlog candidates selection (CASSANDRA-6481)
  * Fix thundering herd on endpoint cache invalidation (CASSANDRA-6345, 6485)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54a1955d/src/java/org/apache/cassandra/cql3/Cql.g

diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g
index 7101c71..ea6844f 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -93,12 +93,18 @@ options {
             if (!(entry.left instanceof Constants.Literal))
             {
-                addRecognitionError("Invalid property name: " + entry.left);
+                String msg = "Invalid property name: " + entry.left;
+                if (entry.left instanceof AbstractMarker.Raw)
+                    msg += " (bind variables are not supported in DDL queries)";
+                addRecognitionError(msg);
                 break;
             }
             if (!(entry.right instanceof Constants.Literal))
             {
-                addRecognitionError("Invalid property value: " + entry.right);
+                String msg = "Invalid property value: " + entry.right + " for property: " + entry.left;
+                if (entry.right instanceof AbstractMarker.Raw)
+                    msg += " (bind variables are not supported in DDL queries)";
+                addRecognitionError(msg);
                 break;
             }
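Outside the ANTLR grammar, the message construction this patch adds can be sketched as plain Java. `Literal` and `BindMarker` below are simplified, hypothetical stand-ins for Cassandra's `Constants.Literal` and `AbstractMarker.Raw`; only the string-building mirrors the diff.

```java
// Minimal sketch of the error-message construction from the Cql.g patch;
// the nested types here are invented stand-ins for the parser's classes.
public class PropertyErrorMessage {
    interface Term {}

    static class Literal implements Term {
        final String text;
        Literal(String text) { this.text = text; }
        public String toString() { return text; }
    }

    // Stands in for AbstractMarker.Raw, i.e. a '?' bind variable.
    static class BindMarker implements Term {
        public String toString() { return "?"; }
    }

    static String invalidValueMessage(Term name, Term value) {
        // Name the offending property so large statements are debuggable.
        String msg = "Invalid property value: " + value + " for property: " + name;
        // Bind markers get an explicit hint, since DDL can't be prepared.
        if (value instanceof BindMarker)
            msg += " (bind variables are not supported in DDL queries)";
        return msg;
    }

    public static void main(String[] args) {
        System.out.println(invalidValueMessage(new Literal("replication_factor"), new BindMarker()));
        // prints: Invalid property value: ? for property: replication_factor (bind variables are not supported in DDL queries)
    }
}
```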
[jira] [Resolved] (CASSANDRA-6453) Improve error message for invalid property values during parsing.
[ https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-6453. - Resolution: Fixed Reviewer: Sylvain Lebresne I understand that Brian's underlying problem was that he wants prepared statements for DDL queries, which we indeed don't support. But pragmatically, as far as this ticket's description and patch go, I don't see the harm in committing the error message improvement. It is nicer to include the name of the property for which the value is invalid, regardless of the bind marker problem. Besides, Brian's confusion suggests that an error message that explicitly indicates that bind markers are not supported would help too. So committed the patch with a slight specialization in the case of bind markers. Improve error message for invalid property values during parsing. - Key: CASSANDRA-6453 URL: https://issues.apache.org/jira/browse/CASSANDRA-6453 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brian ONeill Priority: Trivial Attachments: CASSANDRA-6354-patch.txt Trivial change to the error message returned for invalid property values. Previously, it would just say "Invalid property value : ?". If you were constructing a large prepared statement with multiple question marks, it was difficult to track down which one the server was complaining about. This enhancement tells you which one. =) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Reopened] (CASSANDRA-6453) Improve error message for invalid property values during parsing.
[ https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reopened CASSANDRA-6453: - Improve error message for invalid property values during parsing. - Key: CASSANDRA-6453 URL: https://issues.apache.org/jira/browse/CASSANDRA-6453 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brian ONeill Priority: Trivial Attachments: CASSANDRA-6354-patch.txt Trivial change to the error message returned for invalid property values. Previously, it would just say Invalid property value : ?. If you were constructing a large prepared statement, with multiple question marks, it was difficult to track down which one the server was complaining about. This enhancement tells you which one. =) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (CASSANDRA-6453) Improve error message for invalid property values during parsing.
[ https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-6453: Fix Version/s: 1.2.14 Improve error message for invalid property values during parsing. - Key: CASSANDRA-6453 URL: https://issues.apache.org/jira/browse/CASSANDRA-6453 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brian ONeill Priority: Trivial Fix For: 1.2.14 Attachments: CASSANDRA-6354-patch.txt Trivial change to the error message returned for invalid property values. Previously, it would just say Invalid property value : ?. If you were constructing a large prepared statement, with multiple question marks, it was difficult to track down which one the server was complaining about. This enhancement tells you which one. =) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Resolved] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-6490. - Resolution: Fixed Done ([~urandom], can you check the debian/dists/ directory and delete the 06x and 07x directories? I don't seem to have the right to do so and they don't point at anything existing anymore). Please delete old releases from mirroring system Key: CASSANDRA-6490 URL: https://issues.apache.org/jira/browse/CASSANDRA-6490 Project: Cassandra Issue Type: Bug Environment: http://www.apache.org/dist/cassandra/ Reporter: Sebb Assignee: Sylvain Lebresne To reduce the load on the ASF mirrors, projects are required to delete old releases [1] Please can you remove all non-current releases? Thanks! [Note that older releases are always available from the ASF archive server] Any links to older releases on download pages should first be adjusted to point to the archive server. [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (CASSANDRA-6491) Timeout can send confusing information as to what their cause is
Sylvain Lebresne created CASSANDRA-6491: --- Summary: Timeout can send confusing information as to what their cause is Key: CASSANDRA-6491 URL: https://issues.apache.org/jira/browse/CASSANDRA-6491 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Trivial We can race between the time we detect a timeout and the time we build the actual exception, so it's possible to have a timeout exception that pretends enough replicas have actually acknowledged the operation, which is slightly confusing to the user as to why it got a timeout. That kind of race is rather unlikely in a healthy environment, but https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least possible to trigger in a test environment. Note that it's definitely not worth synchronizing to avoid that, but it could be simple enough to detect the race when building the exception and correct the ack count. Attaching a simple patch to show what I have in mind. I agree it's not perfect, but as said above, proper synchronization is just not worth it, and it seems to me that it's not worth confusing users over this. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
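The idea described here — notice that acks raced in after the timeout was declared, and correct the count before reporting it — might look roughly like the sketch below. This is not the attached 6491.txt patch; `buildTimeoutAcks` and its signature are invented for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the race correction: if enough acks arrived
// between timeout detection and exception construction, reporting
// received >= blockFor inside a timeout would confuse the caller.
public class TimeoutAckCount {
    static int buildTimeoutAcks(AtomicInteger received, int blockFor) {
        int acks = received.get();
        // A late response may land after the timeout was detected but
        // before this read; clamp so the exception stays self-consistent.
        return acks >= blockFor ? blockFor - 1 : acks;
    }

    public static void main(String[] args) {
        System.out.println(buildTimeoutAcks(new AtomicInteger(3), 3)); // raced: clamped to 2
        System.out.println(buildTimeoutAcks(new AtomicInteger(1), 3)); // normal timeout: 1
    }
}
```

The clamp trades a slightly inaccurate ack count for a message that cannot contradict itself, which matches the ticket's stance that full synchronization isn't worth it.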
[jira] [Updated] (CASSANDRA-6491) Timeout can send confusing information as to what their cause is
[ https://issues.apache.org/jira/browse/CASSANDRA-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-6491: Attachment: 6491.txt Timeout can send confusing information as to what their cause is Key: CASSANDRA-6491 URL: https://issues.apache.org/jira/browse/CASSANDRA-6491 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Trivial Fix For: 1.2.14 Attachments: 6491.txt We can race between the time we detect a timeout and the time we build the actual exception, so that it's possible to have a timeout exception that pretends enough replica have actually acknowledged the operation, which is thus slightly confusing to the user as to why it got a timeout. That kind of race is rather unlikely in a healthy environment, but https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least possible to trigger in a test environment. Note that it's definitively not worth synchronizing to avoid that that, but it could maybe be simple enough to detect the race when building the exception and correct the ack count. Attaching simple patch to show what I have in mind. Note that I don't entirely disagree that it's not perfect, but as said above, proper synchronization is just not worth it and it seems to me that it's not worth confusing users over that. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849187#comment-13849187 ] Sebb commented on CASSANDRA-6490: - There is a problem with the directory protections:

drwxrwxr-x 3 eevans    eevans    6 Aug 20 2012 06x
drwxrwxr-x 3 eevans    eevans    6 Aug 20 2012 07x
drwxr-xr-x 3 slebresne cassandra 6 May 27 2013 11x
drwxr-xr-x 3 slebresne cassandra 6 Nov 25 08:11 12x
drwxr-xr-x 3 slebresne cassandra 6 Nov 25 08:40 20x
drwxrwxr-x 3 apbackup  cassandra 6 Sep 10 2012 sid
drwxrwxr-x 3 eevans    eevans    6 Aug 20 2012 unstable

Only 'sid' above is correct. The file group should be cassandra, and files should be group-writable; otherwise only the owner can change things, which is awkward when that individual is temporarily unavailable. However please note that Infra are moving towards all projects using svnpubsub [1] for releases, which avoids all such issues. I suggest you file an Infra request now so you are ready for the next release. [1] http://www.apache.org/dev/release-publishing.html#distribution_dist Please delete old releases from mirroring system Key: CASSANDRA-6490 URL: https://issues.apache.org/jira/browse/CASSANDRA-6490 Project: Cassandra Issue Type: Bug Environment: http://www.apache.org/dist/cassandra/ Reporter: Sebb Assignee: Sylvain Lebresne To reduce the load on the ASF mirrors, projects are required to delete old releases [1] Please can you remove all non-current releases? Thanks! [Note that older releases are always available from the ASF archive server] Any links to older releases on download pages should first be adjusted to point to the archive server. [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6491) Timeout can send confusing information as to what their cause is
[ https://issues.apache.org/jira/browse/CASSANDRA-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13849193#comment-13849193 ] Jonathan Ellis commented on CASSANDRA-6491: --- +1 Timeout can send confusing information as to what their cause is Key: CASSANDRA-6491 URL: https://issues.apache.org/jira/browse/CASSANDRA-6491 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Trivial Fix For: 1.2.14 Attachments: 6491.txt We can race between the time we detect a timeout and the time we build the actual exception, so that it's possible to have a timeout exception that pretends enough replica have actually acknowledged the operation, which is thus slightly confusing to the user as to why it got a timeout. That kind of race is rather unlikely in a healthy environment, but https://datastax-oss.atlassian.net/browse/JAVA-227 shows that it's at least possible to trigger in a test environment. Note that it's definitively not worth synchronizing to avoid that that, but it could maybe be simple enough to detect the race when building the exception and correct the ack count. Attaching simple patch to show what I have in mind. Note that I don't entirely disagree that it's not perfect, but as said above, proper synchronization is just not worth it and it seems to me that it's not worth confusing users over that. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (CASSANDRA-6487) Log WARN on large batch sizes
[ https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyuben Todorov updated CASSANDRA-6487: -- Attachment: 6487_trunk.patch Log WARN on large batch sizes - Key: CASSANDRA-6487 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487 Project: Cassandra Issue Type: Improvement Reporter: Patrick McFadin Assignee: Lyuben Todorov Priority: Minor Attachments: 6487_trunk.patch Large batches on a coordinator can cause a lot of node stress. I propose adding a WARN log entry if batch sizes go beyond a configurable size. This will give more visibility to operators on something that can happen on the developer side. New yaml setting with 5k default. {{# Log WARN on any batch size exceeding this value. 5k by default.}} {{# Caution should be taken on increasing the size of this threshold as it can lead to node instability.}} {{batch_size_warn_threshold: 5k}} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
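A minimal sketch of the proposed check, assuming the 5k default from the suggested yaml snippet; the class, field, and method names are hypothetical, not the committed Cassandra code.

```java
import java.util.logging.Logger;

// Hypothetical sketch of the batch-size warning proposed in the ticket.
// BATCH_SIZE_WARN_THRESHOLD stands in for the suggested yaml setting
// batch_size_warn_threshold: 5k.
public class BatchSizeWarning {
    static final Logger logger = Logger.getLogger("BatchSizeWarning");
    static final long BATCH_SIZE_WARN_THRESHOLD = 5 * 1024; // 5k default

    // Returns whether a warning was emitted, to make the check testable.
    static boolean warnIfLarge(long batchSizeInBytes) {
        if (batchSizeInBytes > BATCH_SIZE_WARN_THRESHOLD) {
            logger.warning("Batch of size " + batchSizeInBytes
                    + " bytes exceeds batch_size_warn_threshold ("
                    + BATCH_SIZE_WARN_THRESHOLD + " bytes)");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(warnIfLarge(10_000)); // true, and logs a WARN
        System.out.println(warnIfLarge(1_000));  // false, silent
    }
}
```

The point of the WARN-only design is visibility for operators: the batch still executes, so no developer-facing behavior changes.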
[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13849236#comment-13849236 ] Eric Evans commented on CASSANDRA-6490: --- bq. Done (Eric Evans, can you check the debian/dists/ directory and delete the 06x and 07x directories? I don't seem to have the right to do so and they don't point at anything existing anymore). Done. Please delete old releases from mirroring system Key: CASSANDRA-6490 URL: https://issues.apache.org/jira/browse/CASSANDRA-6490 Project: Cassandra Issue Type: Bug Environment: http://www.apache.org/dist/cassandra/ Reporter: Sebb Assignee: Sylvain Lebresne To reduce the load on the ASF mirrors, projects are required to delete old releases [1] Please can you remove all non-current releases? Thanks! [Note that older releases are always available from the ASF archive server] Any links to older releases on download pages should first be adjusted to point to the archive server. [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6453) Improve error message for invalid property values during parsing.
[ https://issues.apache.org/jira/browse/CASSANDRA-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13849243#comment-13849243 ] Brian ONeill commented on CASSANDRA-6453: - [~slebresne] Agreed, +1. Five minute change to the code might save people hours of time. Thanks. Improve error message for invalid property values during parsing. - Key: CASSANDRA-6453 URL: https://issues.apache.org/jira/browse/CASSANDRA-6453 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brian ONeill Priority: Trivial Fix For: 1.2.14 Attachments: CASSANDRA-6354-patch.txt Trivial change to the error message returned for invalid property values. Previously, it would just say Invalid property value : ?. If you were constructing a large prepared statement, with multiple question marks, it was difficult to track down which one the server was complaining about. This enhancement tells you which one. =) -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849252#comment-13849252 ] Sebb commented on CASSANDRA-6490: - There's still a problem with some of the protections:

drwxr-xr-x 3 slebresne cassandra 6 May 27 2013 11x
drwxr-xr-x 3 slebresne cassandra 6 Nov 25 08:11 12x
drwxr-xr-x 3 slebresne cassandra 6 Nov 25 08:40 20x

These should be changed - by slebresne - to allow group-write Please delete old releases from mirroring system Key: CASSANDRA-6490 URL: https://issues.apache.org/jira/browse/CASSANDRA-6490 Project: Cassandra Issue Type: Bug Environment: http://www.apache.org/dist/cassandra/ Reporter: Sebb Assignee: Sylvain Lebresne To reduce the load on the ASF mirrors, projects are required to delete old releases [1] Please can you remove all non-current releases? Thanks! [Note that older releases are always available from the ASF archive server] Any links to older releases on download pages should first be adjusted to point to the archive server. [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (CASSANDRA-6378) sstableloader does not support client encryption on Cassandra 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-6378: --- Attachment: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch sstableloader does not support client encryption on Cassandra 2.0 - Key: CASSANDRA-6378 URL: https://issues.apache.org/jira/browse/CASSANDRA-6378 Project: Cassandra Issue Type: Bug Reporter: David Laube Labels: client, encryption, ssl, sstableloader Fix For: 2.0.4 Attachments: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch We have been testing backup/restore from one ring to another and we recently stumbled upon an issue with sstableloader. When client_enc_enable: true, the exception below is generated. However, when client_enc_enable is set to false, sstableloader is able to get to the point where it discovers endpoints, connects to stream data, etc.

==BEGIN EXCEPTION==
sstableloader --debug -d x.x.x.248,x.x.x.108,x.x.x.113 /tmp/import/keyspace_name/columnfamily_name
Exception in thread "main" java.lang.RuntimeException: Could not retrieve endpoint ranges:
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:226)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:68)
Caused by: org.apache.thrift.transport.TTransportException: Frame size (352518400) larger than max length (16384000)!
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_partitioner(Cassandra.java:1292)
at org.apache.cassandra.thrift.Cassandra$Client.describe_partitioner(Cassandra.java:1280)
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:199)
... 2 more
==END EXCEPTION==

-- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Assigned] (CASSANDRA-6378) sstableloader does not support client encryption on Cassandra 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe reassigned CASSANDRA-6378: -- Assignee: Sam Tunnicliffe sstableloader does not support client encryption on Cassandra 2.0 - Key: CASSANDRA-6378 URL: https://issues.apache.org/jira/browse/CASSANDRA-6378 Project: Cassandra Issue Type: Bug Reporter: David Laube Assignee: Sam Tunnicliffe Labels: client, encryption, ssl, sstableloader Fix For: 2.0.4 Attachments: 0001-CASSANDRA-6387-Add-SSL-support-to-BulkLoader.patch We have been testing backup/restore from one ring to another and we recently stumbled upon an issue with sstableloader. When client_enc_enable: true, the exception below is generated. However, when client_enc_enable is set to false, sstableloader is able to get to the point where it discovers endpoints, connects to stream data, etc.

==BEGIN EXCEPTION==
sstableloader --debug -d x.x.x.248,x.x.x.108,x.x.x.113 /tmp/import/keyspace_name/columnfamily_name
Exception in thread "main" java.lang.RuntimeException: Could not retrieve endpoint ranges:
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:226)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:149)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:68)
Caused by: org.apache.thrift.transport.TTransportException: Frame size (352518400) larger than max length (16384000)!
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_partitioner(Cassandra.java:1292)
at org.apache.cassandra.thrift.Cassandra$Client.describe_partitioner(Cassandra.java:1280)
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:199)
... 2 more
==END EXCEPTION==

-- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849261#comment-13849261 ] Michael Shuler commented on CASSANDRA-6488: --- This introduced a failure in BootStrapperTest:

{code}
test:
 [echo] running unit tests
 [mkdir] Created dir: /home/mshuler/git/cassandra/build/test/cassandra
 [mkdir] Created dir: /home/mshuler/git/cassandra/build/test/output
 [junit] WARNING: multiple versions of ant detected in path for junit
 [junit] jar:file:/usr/share/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
 [junit] and jar:file:/home/mshuler/git/cassandra/build/lib/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
 [junit] Testsuite: org.apache.cassandra.dht.BootStrapperTest
 [junit] Tests run: 4, Failures: 1, Errors: 0, Time elapsed: 6.177 sec
 [junit]
 [junit] - Standard Error -
 [junit] WARN 09:47:46,135 No host ID found, created 9019bb70-4d6e-4cf6-b730-140ff5ae4be5 (Note: This should happen exactly once per node).
 [junit] WARN 09:47:46,262 Generated random token [d9180feb2e806704effa4024e8f4c631]. Random tokens will result in an unbalanced ring; see http://wiki.apache.org/cassandra/Operations
 [junit] Testcase: testSourceTargetComputation(org.apache.cassandra.dht.BootStrapperTest): FAILED
 [junit] expected:<1> but was:<0>
 [junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
 [junit] at org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:212)
 [junit] at org.apache.cassandra.dht.BootStrapperTest.testSourceTargetComputation(BootStrapperTest.java:173)
 [junit]
 [junit] Test org.apache.cassandra.dht.BootStrapperTest FAILED

BUILD FAILED
/home/mshuler/git/cassandra/build.xml:1113: The following error occurred while executing this line:
/home/mshuler/git/cassandra/build.xml:1078: Some unit test(s) failed.

Total time: 9 seconds

((4be9e67...)|BISECTING)mshuler@hana:~/git/cassandra$ git bisect bad
4be9e6720d9f94a83aa42153c3e71ae1e557d2d9 is the first bad commit
commit 4be9e6720d9f94a83aa42153c3e71ae1e557d2d9
Author: Aleksey Yeschenko alek...@apache.org
Date: Sun Dec 15 13:29:56 2013 +0300

    Improve batchlog write performance with vnodes

    patch by Jonathan Ellis and Rick Branson; reviewed by Aleksey Yeschenko for CASSANDRA-6488

:100644 100644 e5865925f160faabc2506c3a5aac9985c17c1658 b55393b2ed138011bab52f95f2e9b52107709938 M CHANGES.txt
:040000 040000 dea10aa8044e10eb60002e75f2586a9c8e94b647 7030c09f9713bd3e342e4e012c59b09c86b79a42 M src
{code}

Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters - Key: CASSANDRA-6488 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488 Project: Cassandra Issue Type: Bug Reporter: Rick Branson Assignee: Rick Branson Fix For: 1.2.13, 2.0.4 Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph (21).png The cloneTokenOnlyMap call in StorageProxy.getBatchlogEndpoints causes enormous amounts of CPU to be consumed on clusters with many vnodes. I created a patch to cache this data as a workaround and deployed it to a production cluster with 15,000 tokens. CPU consumption dropped to 1/5th. This highlights the overall issues with cloneOnlyTokenMap() calls on vnodes clusters. I'm including the maybe-not-the-best-quality workaround patch to use as a reference, but cloneOnlyTokenMap is a systemic issue and every place it's called should probably be investigated. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13849273#comment-13849273 ] Michael Shuler commented on CASSANDRA-6488: --- I'm working on the cassandra-2.0 branch, since I didn't mention it above. Around the same time, LeaveAndBootstrapTest, MoveTest, and RelocateTest were new failures - I'm looking at those - http://cassci.datastax.com/job/cassandra-2.0_test/49/console Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters - Key: CASSANDRA-6488 URL: https://issues.apache.org/jira/browse/CASSANDRA-6488 Project: Cassandra Issue Type: Bug Reporter: Rick Branson Assignee: Rick Branson Fix For: 1.2.13, 2.0.4 Attachments: 6488-rbranson-patch.txt, 6488-v2.txt, 6488-v3.txt, graph (21).png The cloneTokenOnlyMap call in StorageProxy.getBatchlogEndpoints causes enormous amounts of CPU to be consumed on clusters with many vnodes. I created a patch to cache this data as a workaround and deployed it to a production cluster with 15,000 tokens. CPU consumption drop to 1/5th. This highlights the overall issues with cloneOnlyTokenMap() calls on vnodes clusters. I'm including the maybe-not-the-best-quality workaround patch to use as a reference, but cloneOnlyTokenMap is a systemic issue and every place it's called should probably be investigated. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
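The caching workaround described in the ticket — clone the token-only map once per ring version instead of on every batchlog write — can be sketched like this. `TokenMetadata` here is a simplified stand-in for Cassandra's class of the same name, and all member names are illustrative, not the 6488 patch itself.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: reuse one expensive clone of the token map until
// the ring changes, instead of cloning on every batchlog write.
public class CachedTokenMap {
    static class TokenMetadata {
        long ringVersion = 0;                       // bumped on ring changes
        Map<String, String> tokens = new HashMap<>();

        TokenMetadata cloneOnlyTokenMap() {         // expensive with many vnodes
            TokenMetadata copy = new TokenMetadata();
            copy.tokens = new HashMap<>(tokens);
            copy.ringVersion = ringVersion;
            return copy;
        }
    }

    private final TokenMetadata live;
    private TokenMetadata cached;
    private long cachedVersion = -1;

    CachedTokenMap(TokenMetadata live) { this.live = live; }

    // Returns the cached clone, refreshing it only when the ring moved.
    synchronized TokenMetadata get() {
        if (cached == null || cachedVersion != live.ringVersion) {
            cached = live.cloneOnlyTokenMap();
            cachedVersion = live.ringVersion;
        }
        return cached;
    }

    public static void main(String[] args) {
        TokenMetadata tm = new TokenMetadata();
        CachedTokenMap cache = new CachedTokenMap(tm);
        System.out.println(cache.get() == cache.get()); // true: clone reused
        tm.ringVersion++;                               // ring changed
        System.out.println(cache.get().ringVersion);    // 1: fresh clone taken
    }
}
```

The version check is what makes the cache safe: a stale clone is only ever served while the ring is provably unchanged, which is why this kind of workaround could cut CPU so dramatically on a 15,000-token cluster.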
[jira] [Commented] (CASSANDRA-6485) NPE in calculateNaturalEndpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849275#comment-13849275 ] Russell Alexander Spitzer commented on CASSANDRA-6485: -- Patch worked on my test. NPE in calculateNaturalEndpoints Key: CASSANDRA-6485 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485 Project: Cassandra Issue Type: Bug Components: Core Reporter: Russell Alexander Spitzer Assignee: Jonathan Ellis Fix For: 1.2.13, 2.0.4 Attachments: 6485.txt I was running a test where I added a new data center to an existing cluster. Test outline:

Start 25 Node DC1
Keyspace Setup Replication 3
Begin insert against DC1 Using Stress
While the inserts are occurring
Start up 25 Node DC2
Alter Keyspace to include Replication in 2nd DC
Run rebuild on DC2
Wait for stress to finish
Run repair on Cluster
... Some other operations

Although there are no issues with smaller clusters or clusters without vnodes, larger setups with vnodes seem to consistently see the following exception in the logs, as well as a write operation failing for each exception. Usually this happens between 1-8 times during an experiment. The exceptions/failures occur when DC2 is brought online but *before* any alteration of the Keyspace. All of the exceptions are happening on DC1 nodes. One of the exceptions occurred on a seed node, though this doesn't seem to be the case most of the time. While the test was running, nodetool was run every second to get cluster status. At no time did any nodes report themselves as down.

{code}
system_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 CustomTThreadPoolServer.java (line 217) Error occurred during processing of message.
system_logs-107.21.186.208/system.log:java.lang.NullPointerException
system_logs-107.21.186.208/system.log-at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
system_logs-107.21.186.208/system.log-at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
system_logs-107.21.186.208/system.log-at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
system_logs-107.21.186.208/system.log-at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
system_logs-107.21.186.208/system.log-at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
system_logs-107.21.186.208/system.log-at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
system_logs-107.21.186.208/system.log-at java.lang.Thread.run(Thread.java:724)
{code}

-- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Reopened] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Yeschenko reopened CASSANDRA-6488:
--
So, the caching part. [~jbellis] can you have a look? If not, I will, later, but it's potentially 1.2.13 vote-affecting.
[jira] [Created] (CASSANDRA-6492) Have server pick query page size by default
Jonathan Ellis created CASSANDRA-6492:
-
Summary: Have server pick query page size by default
Key: CASSANDRA-6492
URL: https://issues.apache.org/jira/browse/CASSANDRA-6492
Project: Cassandra
Issue Type: New Feature
Components: API
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor

We're almost always going to do a better job picking a page size based on sstable stats than users will by guesstimating.
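The ticket doesn't specify the heuristic, so the following is only a guess at the shape such logic could take: derive a row count from a target page size in bytes and the mean row size that sstable stats already track. All names and parameters here are hypothetical:

```java
/** Hypothetical heuristic for a server-chosen page size: aim for a
 *  target number of bytes per page, using the mean row size reported
 *  by sstable stats, clamped to sane row-count bounds. */
final class PageSizeHeuristic {
    static int pickPageSize(long targetPageBytes, long meanRowSizeBytes,
                            int minRows, int maxRows) {
        if (meanRowSizeBytes <= 0)
            return minRows; // no stats yet: be conservative
        long rows = targetPageBytes / meanRowSizeBytes;
        return (int) Math.max(minRows, Math.min(maxRows, rows));
    }
}
```

The point of doing this server-side is exactly what the ticket says: the server can see the stats, while a client picking a fixed page size is guessing.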
[jira] [Commented] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849291#comment-13849291 ]
Michael Shuler commented on CASSANDRA-6488:
---
Commit bb09d3c fully passed all the unit tests in cassandra-2.0 branch. - http://cassci.datastax.com/job/cassandra-2.0_test/47/console
[jira] [Commented] (CASSANDRA-6490) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/CASSANDRA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849300#comment-13849300 ]
Sylvain Lebresne commented on CASSANDRA-6490:
-
Right, right, fixed.

Please delete old releases from mirroring system
-
Key: CASSANDRA-6490
URL: https://issues.apache.org/jira/browse/CASSANDRA-6490
Project: Cassandra
Issue Type: Bug
Environment: http://www.apache.org/dist/cassandra/
Reporter: Sebb
Assignee: Sylvain Lebresne

To reduce the load on the ASF mirrors, projects are required to delete old releases [1]. Please can you remove all non-current releases? Thanks! [Note that older releases are always available from the ASF archive server.] Any links to older releases on download pages should first be adjusted to point to the archive server.

[1] http://www.apache.org/dev/release.html#when-to-archive
[jira] [Comment Edited] (CASSANDRA-6488) Batchlog writes consume unnecessarily large amounts of CPU on vnodes clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849312#comment-13849312 ]
Michael Shuler edited comment on CASSANDRA-6488 at 12/16/13 5:02 PM:
-
Those same tests look like new failures with this commit in cassandra-1.2 branch also - http://cassci.datastax.com/job/cassandra-1.2_test/32/console vs. - http://cassci.datastax.com/job/cassandra-1.2_test/33/console

(edit for clarity) New unit test failures in c-2.0 and c-1.2 branches with this commit:
- BootStrapperTest
- LeaveAndBootstrapTest
- MoveTest
- RelocateTest

was (Author: mshuler): Those same tests look like new failures with this commit in cassandra-1.2 branch also - http://cassci.datastax.com/job/cassandra-1.2_test/32/console vs. - http://cassci.datastax.com/job/cassandra-1.2_test/33/console
[jira] [Created] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
Russell Alexander Spitzer created CASSANDRA-6493:
-
Summary: Exceptions when a second Datacenter is Added
Key: CASSANDRA-6493
URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Ubuntu, EC2 M1.large
Reporter: Russell Alexander Spitzer

On adding a second datacenter several exceptions were raised. Test outline:
- Start 25 node DC1
- Keyspace setup: replication 3
- Begin inserts against DC1 using stress
- While the inserts are occurring:
  - Start up 25 node DC2
  - Alter keyspace to include replication in the 2nd DC
  - Run rebuild on DC2
- Wait for stress to finish
- Run repair on cluster
- ... some other operations

At the point when the second datacenter is added, several warnings go off because nodetool status is not functioning, and a few moments later the start operation reports a failure because a node has not successfully turned on. The first start attempt yielded the following exception on a node in the second DC.

{code}
CassandraDaemon.java (line 464) Exception encountered during startup
java.lang.AssertionError: -7560216458456714666 not found in -9222060278673125462, -9220751250790085193, ... ALL THE TOKENS ..., 9218575851928340117, 9219681798686280387
	at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
	at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
{code}

The test automatically tries to restart nodes if they fail during startup. The second attempt for this node succeeded, but 'nodetool status' still failed, and a different node in the second DC logged the following and failed to start up.

{code}
ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) Exception encountered during startup
java.util.ConcurrentModificationException
	at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
	at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
	at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
	at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
	at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
	at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
ERROR [StorageServiceShutdownHook] 2013-12-16 18:02:04,876 CassandraDaemon.java (line 191) Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
	at org.apache.cassandra.service.StorageService.stopNativeTransport(StorageService.java:358)
	at
{code}
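The ConcurrentModificationException above is the classic fail-fast iterator failure: StringUtils.join iterates the token TreeMap's key set while the map is mutated underneath it. A minimal single-threaded reproduction (illustrative only, not Cassandra code) shows both the failure and the usual fix of iterating over a snapshot copy:

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.TreeMap;

/** Minimal reproduction of the failure mode in the trace above: a
 *  TreeMap's fail-fast iterator throws ConcurrentModificationException
 *  when the map is structurally modified mid-iteration. Iterating a
 *  snapshot copy of the keys avoids it. */
final class CmeDemo {
    /** Returns true if iterating while inserting triggers CME (it does). */
    static boolean iterateWhileModifying(TreeMap<Long, String> map) {
        try {
            for (Iterator<Long> it = map.keySet().iterator(); it.hasNext(); ) {
                it.next();
                map.put(System.nanoTime(), "new-token"); // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // fail-fast iterator detected the modification
        }
    }

    /** Safe variant: snapshot the keys first, then mutate freely. */
    static int iterateOverCopy(TreeMap<Long, String> map) {
        int seen = 0;
        for (Long key : new java.util.ArrayList<>(map.keySet())) {
            map.put(key + 1_000_000, "new-token"); // safe: we iterate the copy
            seen++;
        }
        return seen;
    }
}
```

In the multi-datacenter scenario the modification comes from another thread updating token metadata while bootstrap code formats it, which is why the failure is timing-dependent; the snapshot-before-iterate (or lock-protected) pattern is the standard remedy.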
[jira] [Updated] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Russell Alexander Spitzer updated CASSANDRA-6493:
-
Description: (updated)
[jira] [Updated] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Russell Alexander Spitzer updated CASSANDRA-6493:
-
Reproduced In: 1.2.13
[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849448#comment-13849448 ]
Russell Alexander Spitzer commented on CASSANDRA-6493:
--
https://cassci.datastax.com/job/cassandra-addremovedc/25/console
The 'Node down Detected' messages come from a thread that runs nodetool status every ~2 seconds and counts how many nodes report themselves as up; the lack of command-line output shows that the command failed.
[jira] [Updated] (CASSANDRA-4687) Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)
[ https://issues.apache.org/jira/browse/CASSANDRA-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Yaskevich updated CASSANDRA-4687:
---
Attachment: apache-cassandra-1.2.13-SNAPSHOT.jar
            guava-backed-cache.patch

This is the initial patch, which uses a Guava Cache as the cache storage. The only thing that is not functional right now is setCapacity, but that isn't crucial for figuring out whether this helps fix the situation.

Exception: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk)
---
Key: CASSANDRA-4687
URL: https://issues.apache.org/jira/browse/CASSANDRA-4687
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.3 64-bit, Oracle JRE 1.6.0.33 64-bit, single node cluster
Reporter: Leonid Shalupov
Priority: Minor
Attachments: 4687-debugging.txt, apache-cassandra-1.2.13-SNAPSHOT.jar, guava-backed-cache.patch

Under heavy write load, cassandra sometimes fails with an assertion error. git bisect leads to commit 295aedb278e7a495213241b66bc46d763fd4ce66. Works fine if the global key/row caches are disabled in code.

{quote}
java.lang.AssertionError: DecoratedKey(xxx, yyy) != DecoratedKey(zzz, kkk) in /var/lib/cassandra/data/...-he-1-Data.db
	at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:60)
	at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:67)
	at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:256)
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
	at org.apache.cassandra.db.Table.getRow(Table.java:378)
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:69)
	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:819)
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1253)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
{quote}
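The guava-backed-cache.patch itself is not shown here. As a JDK-only stand-in for the general idea of a bounded cache store (the patch swaps in a Guava Cache, which adds weighers and eviction policies on top of this), a LinkedHashMap in access order gives a minimal LRU cache:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** JDK-only sketch of a capacity-bounded cache store: a LinkedHashMap
 *  in access order evicts the least-recently-accessed entry once the
 *  bound is exceeded. The real patch uses a Guava Cache instead. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // access-order: gets refresh recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once over capacity
    }
}
```

This sketches the eviction behavior only; a real cache store for this use case would also need concurrency control and the size/weight accounting that the setCapacity remark refers to.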
[jira] [Commented] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing
[ https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849672#comment-13849672 ]
Nikolai Grigoriev commented on CASSANDRA-6008:
--
I am wondering if it is possible that because of this problem I ended up with this issue: http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic. I am constantly having this "This should never happen" problem when I restart my 2.0.3 cluster. Out of 6 nodes, if I restart now, for sure at least 2 will fail to start because of this condition. To allow them to start, I wipe the contents of the system.compactions_in_progress table and delete all compactions_in_progress directories under my data directories on the affected node.

Getting 'This should never happen' error at startup due to sstables missing
---
Key: CASSANDRA-6008
URL: https://issues.apache.org/jira/browse/CASSANDRA-6008
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: John Carrino
Assignee: Tyler Hobbs
Fix For: 2.0.4
Attachments: 6008-2.0-part2.patch, 6008-2.0-v1.patch, 6008-trunk-v1.patch

"Exception encountered during startup: Unfinished compactions reference missing sstables. This should never happen since compactions are marked finished before we start removing the old sstables."

This happens when sstables that have been compacted away are removed but still have entries in the system.compactions_in_progress table. Normally this should not happen, because the entries in system.compactions_in_progress are deleted before the old sstables are deleted. However, at startup recovery time, old sstables are deleted (NOT BEFORE they are removed from the compactions_in_progress table), and only after that is done does it truncate the table using SystemKeyspace.discardCompactionsInProgress.

We ran into a case where the disk filled up, the node died and was bounced, then failed to truncate this table on startup, and then got stuck hitting this exception in ColumnFamilyStore.removeUnfinishedCompactionLeftovers. Maybe on startup we can delete from this table incrementally as we clean stuff up, in the same way that compactions delete from this table before they delete old sstables.
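The ordering argument in the report can be made concrete with a toy model (all names illustrative, not Cassandra's real API): if a compaction's log entry is removed before its obsolete sstables are deleted, no crash point can leave the log referencing files that no longer exist.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the cleanup ordering the ticket argues for: forget the
 *  compaction first, delete its obsolete sstables second. A crash
 *  between the two steps leaves extra files, never a dangling log entry. */
final class CompactionCleanupModel {
    final Map<String, List<String>> compactionLog = new HashMap<>(); // entry -> obsolete sstables
    final Set<String> sstables = new HashSet<>();

    void finishCompaction(String entry) {
        List<String> obsolete = compactionLog.remove(entry); // step 1: forget the compaction
        if (obsolete != null)
            sstables.removeAll(obsolete);                    // step 2: only then delete files
    }

    /** Startup invariant: every logged compaction still sees all its sstables. */
    boolean logConsistent() {
        for (List<String> files : compactionLog.values())
            if (!sstables.containsAll(files))
                return false;
        return true;
    }
}
```

Doing the deletion in the reverse order (files first, log truncation later, as the description says startup recovery did) is exactly what allows a crash in between to trip the "unfinished compactions reference missing sstables" check on the next boot.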
[jira] [Updated] (CASSANDRA-5742) Add command list snapshots to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sankalp kohli updated CASSANDRA-5742:
-
Attachment: JIRA-5742.diff

Add command list snapshots to nodetool
-
Key: CASSANDRA-5742
URL: https://issues.apache.org/jira/browse/CASSANDRA-5742
Project: Cassandra
Issue Type: New Feature
Components: Tools
Affects Versions: 1.2.1
Reporter: Geert Schuring
Assignee: sankalp kohli
Priority: Minor
Labels: lhf
Attachments: JIRA-5742.diff

It would be nice if nodetool could tell me which snapshots are present on the system, instead of me having to browse the filesystem to fetch the names of the snapshots.
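For context on what such a command has to enumerate: Cassandra snapshots live on disk under <data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>, which is the browsing the reporter wants to avoid. The sketch below just walks that layout; the attached patch presumably exposes the equivalent listing through nodetool/JMX rather than the raw filesystem:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Illustrative walk of the snapshot directory layout
 *  (<data_dir>/<keyspace>/<table>/snapshots/<name>); not the patch's code. */
final class SnapshotLister {
    static List<String> listSnapshots(File dataDir) {
        List<String> names = new ArrayList<>();
        for (File ks : nonNull(dataDir.listFiles(File::isDirectory)))
            for (File table : nonNull(ks.listFiles(File::isDirectory))) {
                File snapshots = new File(table, "snapshots");
                for (File snap : nonNull(snapshots.listFiles(File::isDirectory)))
                    names.add(ks.getName() + "/" + table.getName() + "/" + snap.getName());
            }
        return names;
    }

    private static File[] nonNull(File[] files) {
        return files == null ? new File[0] : files; // listFiles returns null for non-dirs
    }
}
```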
[jira] [Commented] (CASSANDRA-6008) Getting 'This should never happen' error at startup due to sstables missing
[ https://issues.apache.org/jira/browse/CASSANDRA-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849697#comment-13849697 ]

John Carrino commented on CASSANDRA-6008:
-----------------------------------------

I'm fine with leaving all the sstables live. We use our own MVCC and rely on Cassandra only for durable writes, using QUORUM to ensure we read what we wrote. Is the only point of this table to ensure counters are handled correctly?

Another possible issue is restoring from backup. If you shut down while there are rows in the compaction log and then replace the current sstables with new ones, you will get this error as well.
[jira] [Updated] (CASSANDRA-5742) Add command list snapshots to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-5742:
-------------------------------------
    Attachment: new_file.diff
[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-6216:
-------------------------------------
    Attachment: JIRA-6216.diff

Level Compaction should persist last compacted key per level
------------------------------------------------------------

                 Key: CASSANDRA-6216
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6216
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: sankalp kohli
            Assignee: sankalp kohli
            Priority: Minor
         Attachments: JIRA-6216.diff

Level compaction does not persist the last compacted key per level. This matters most for the higher levels: after a restart the last compacted key is reset, so sstables with higher tokens in higher levels won't get a chance to compact.
[jira] [Updated] (CASSANDRA-5742) Add command list snapshots to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5742:
--------------------------------------
    Reviewer: Lyuben Todorov
[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-6216:
-------------------------------------
    Attachment: (was: JIRA-6216.diff)
[jira] [Updated] (CASSANDRA-6216) Level Compaction should persist last compacted key per level
[ https://issues.apache.org/jira/browse/CASSANDRA-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-6216:
-------------------------------------
    Attachment: JIRA-6216.diff
[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints
[ https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849721#comment-13849721 ]

sankalp kohli commented on CASSANDRA-6158:
------------------------------------------

[~brandon.williams] Should I change deleteHintsForEndpoint to be blocking as well?

Nodetool command to purge hints
-------------------------------

                 Key: CASSANDRA-6158
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6158
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: sankalp kohli
            Assignee: sankalp kohli
            Priority: Minor
         Attachments: trunk-6158.txt

The only way to truncate all hints in Cassandra today is to truncate the hints CF in the system keyspace. It would be cleaner to have a nodetool command for it. The ability to selectively remove hints by host or DC, rather than removing all of them, would also be nice.
[jira] [Commented] (CASSANDRA-5742) Add command list snapshots to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849734#comment-13849734 ]

sankalp kohli commented on CASSANDRA-5742:
------------------------------------------

[~jbellis] In this command I display each snapshot, how much true space it takes, and how much total space it takes. If I add your suggestion ("Suggest adding a total space used as well (that doesn't double-count multiple snapshots of the same file), the way we did for cfstats"), it would be a global value across snapshots, and the user can already get that from cfstats. Does that sound reasonable?
[jira] [Updated] (CASSANDRA-6440) Repair should allow repairing particular endpoints to reduce WAN usage.
[ https://issues.apache.org/jira/browse/CASSANDRA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-6440:
-------------------------------------
    Attachment: JIRA-6440.diff

Repair should allow repairing particular endpoints to reduce WAN usage.
-----------------------------------------------------------------------

                 Key: CASSANDRA-6440
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
             Project: Cassandra
          Issue Type: New Feature
            Reporter: sankalp kohli
            Priority: Minor
         Attachments: JIRA-6440.diff

The way we send mismatched data over the WAN during repair can be improved. Example: say four nodes (A, B, C, D) are replicas of the range we are repairing; A and B are in DC1, C and D in DC2. If A does not have data the other replicas have, we get the following streams:
1) A to B and back
2) A to C and back (goes over WAN)
3) A to D and back (goes over WAN)

One way to reduce WAN traffic is this:
1) Repair A and B only with each other, and C and D with each other, starting at the same time t.
2) Once these repairs have finished, A,B and C,D are each in sync with respect to time t.
3) Now run a repair between A and C; the streams exchanged as a result of the diff are also forwarded to B and D via A and C (A and C act as proxies for the streams).

For a replication of DC1:2,DC2:2, WAN traffic is reduced by 50%, and by even more for higher replication factors.

Another easy way to do this is to have the repair command take the nodes to repair with. Then we can do something like this:
1) Run repair between (A and B) and (C and D)
2) Run repair between (A and C)
3) Run repair between (A and B) and (C and D)
But this increases traffic inside the DC, since we aren't proxying.
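The claimed savings can be checked with a back-of-envelope model. This is an assumption-laden sketch, not Cassandra's actual stream accounting: treat each repair pairing that spans the two DCs as one WAN stream exchange, with one out-of-sync node in DC1 and `m` replicas in DC2.

```python
# Toy model of WAN stream exchanges for one out-of-sync node, comparing the
# naive scheme (repair directly with every remote replica) against the proxy
# scheme (one cross-DC repair; the receiving node forwards streams to the
# other replicas in its DC). Hypothetical cost model, not Cassandra code.

def wan_exchanges_naive(replicas_dc2):
    # Direct repair with each of the m remote replicas: m WAN exchanges.
    return replicas_dc2

def wan_exchanges_proxied(replicas_dc2):
    # Intra-DC repairs first, then a single A<->C repair over the WAN.
    return 1

def wan_reduction(replicas_dc2):
    """Fraction of WAN exchanges saved by the proxy scheme."""
    return 1 - wan_exchanges_proxied(replicas_dc2) / wan_exchanges_naive(replicas_dc2)
```

Under this model the reduction is 1 - 1/m: 50% for DC2:2, two-thirds for DC2:3, matching the ticket's "50% and even more for higher replication factors".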
[jira] [Updated] (CASSANDRA-5906) Avoid allocating over-large bloom filters
[ https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-5906:
--------------------------------------
    Attachment: 5906.txt

(also: https://github.com/yukim/cassandra/tree/5906-v3)

Attaching patch for review.
* implemented on top of CASSANDRA-6356
* updated stream-lib to v2.5.1 (latest)
* HLL++ parameters are p=13, sp=25, from my observation above

Avoid allocating over-large bloom filters
-----------------------------------------

                 Key: CASSANDRA-5906
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Yuki Morishita
             Fix For: 2.1
         Attachments: 5906.txt

We conservatively estimate the number of partitions post-compaction to be the total number of partitions pre-compaction; that is, we assume the worst-case scenario of no partition overlap at all. This can waste substantial memory in sstables resulting from highly overlapping compactions.
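The gap between the two estimates is easy to see with a toy example. In this sketch, plain Python sets stand in for the HyperLogLog++ sketches the patch actually uses (stream-lib, p=13, sp=25); the point of HLL is that the same union estimate can be computed in small constant space per sstable.

```python
# Worst-case vs. overlap-aware partition-count estimates for a compaction.
# Sets stand in for HLL++ sketches: an HLL merge approximates the set union
# cheaply, which is exactly what sizing the post-compaction bloom filter needs.

def worst_case_partitions(sstables):
    """Current conservative estimate: assume no overlap at all."""
    return sum(len(s) for s in sstables)

def merged_estimate(sstables):
    """Overlap-aware estimate: cardinality of the union of partition keys."""
    union = set()
    for s in sstables:
        union |= s
    return len(union)
```

For three heavily overlapping sstables the worst-case figure can be close to the sum of their sizes while the true merged count is far smaller; sizing the bloom filter from the merged estimate avoids allocating that excess memory.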
[jira] [Commented] (CASSANDRA-5742) Add command list snapshots to nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849749#comment-13849749 ]

Jonathan Ellis commented on CASSANDRA-5742:
-------------------------------------------

What I'd like to accomplish is making it more obvious to the user that just because he has two snapshots that each take 1GB of space doesn't mean they take up 2GB combined. Open to suggestions on implementation details.
[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849854#comment-13849854 ]

Russell Alexander Spitzer commented on CASSANDRA-6493:
------------------------------------------------------

I was able to get the same results repeating the test. https://cassci.datastax.com/job/cassandra-addremovedc/26/console

Exceptions when a second Datacenter is Added
--------------------------------------------

                 Key: CASSANDRA-6493
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6493
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Ubuntu, EC2 M1.large
            Reporter: Russell Alexander Spitzer

On adding a second datacenter, several exceptions were raised. Test outline:
1) Start a 25-node DC1
2) Set up the keyspace with replication 3
3) Begin inserts against DC1 using stress
4) While the inserts are occurring, start up a 25-node DC2
5) Alter the keyspace to include replication in the 2nd DC
6) Run rebuild on DC2
7) Wait for stress to finish
8) Run repair on the cluster
9) ... some other operations

At the point when the second datacenter is added, several warnings go off because nodetool status is not functioning, and a few moments later the start operation reports a failure because a node has not successfully come up. The first start attempt yielded the following exception on a node in the second DC:

{code}
CassandraDaemon.java (line 464) Exception encountered during startup
java.lang.AssertionError: -7560216458456714666 not found in -9222060278673125462, -9220751250790085193, ... ALL THE TOKENS ..., 9218575851928340117, 9219681798686280387
	at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
	at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
{code}

The test automatically tries to restart nodes if they fail during startup. The second attempt for this node succeeded, but 'nodetool status' still failed, and a different node in the second DC logged the following and failed to start up:

{code}
ERROR [main] 2013-12-16 18:02:04,869 CassandraDaemon.java (line 464) Exception encountered during startup
java.util.ConcurrentModificationException
	at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
	at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
	at org.apache.commons.lang.StringUtils.join(StringUtils.java:3382)
	at org.apache.commons.lang.StringUtils.join(StringUtils.java:3444)
	at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:752)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:696)
	at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangeFor(TokenMetadata.java:703)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:187)
	at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithSourcesFor(RangeStreamer.java:147)
	at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:121)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:979)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:745)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:586)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:483)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:447)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:490)
{code}
[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints
[ https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849856#comment-13849856 ]

Brandon Williams commented on CASSANDRA-6158:
---------------------------------------------

Maybe I'm missing something, but I don't see such a call in this patch?
[jira] [Commented] (CASSANDRA-6158) Nodetool command to purge hints
[ https://issues.apache.org/jira/browse/CASSANDRA-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849858#comment-13849858 ]

Brandon Williams commented on CASSANDRA-6158:
---------------------------------------------

Either way though, yes, calls should block until completed.
[jira] [Updated] (CASSANDRA-4268) Expose full stop() operation through JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-4268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lyuben Todorov updated CASSANDRA-4268:
--------------------------------------
    Attachment: 4268_cassandra-2.0.patch

Expose full stop() operation through JMX
----------------------------------------

                 Key: CASSANDRA-4268
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4268
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Tyler Hobbs
            Assignee: Lyuben Todorov
            Priority: Minor
              Labels: jmx
             Fix For: 2.0.4
         Attachments: 4268_cassandra-2.0.patch

We already expose ways to stop just the RPC server or gossip. This would fully shut down the process.
[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849912#comment-13849912 ]

Jonathan Ellis commented on CASSANDRA-6493:
-------------------------------------------

From chat, this does not reproduce when CASSANDRA-6488 is reverted.
[jira] [Commented] (CASSANDRA-6493) Exceptions when a second Datacenter is Added
[ https://issues.apache.org/jira/browse/CASSANDRA-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849925#comment-13849925 ]

Russell Alexander Spitzer commented on CASSANDRA-6493:
------------------------------------------------------

Correct. I didn't see this over several runs of weekend testing on the pre-6488 build. Head of the git log from that build:

{code}
commit c133ff88982948fdb12669bf766e9848102a3496
Author: Russell Spitzer <russell.spit...@gmail.com>
Date:   Fri Dec 13 12:00:53 2013 -0800

    Patch to fix NPE (this is patch a3d91dc9d67572e16d9ad92f22b89eb969373899)

commit 11455738fa61c6eb02895a5a8d3fbbe4d8cb24b4
Author: Brandon Williams <brandonwilli...@apache.org>
Date:   Fri Dec 13 12:10:47 2013 -0600

    Pig: don't assume all DataBags are DefaultDataBags

    Patch by Mike Spertus, reviewed by brandonwilliams for CASSANDRA-6420
{code}
[Cassandra Wiki] Update of ContributorsGroup by BrandonWilliams
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The ContributorsGroup page has been changed by BrandonWilliams:
https://wiki.apache.org/cassandra/ContributorsGroup?action=diff&rev1=23&rev2=24

   * mkjellman
   * ono_matope
   * ChrisBurroughs
+  * bhamail
[jira] [Commented] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning
[ https://issues.apache.org/jira/browse/CASSANDRA-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849949#comment-13849949 ]

Robert Coli commented on CASSANDRA-6465:
----------------------------------------

Are we sure that this mechanism of producing cache pinning is worth the complexity here, especially given speculative retry?

DES scores fluctuate too much for cache pinning
-----------------------------------------------

                 Key: CASSANDRA-6465
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6465
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: 1.2.11, 2 DC cluster
            Reporter: Chris Burroughs
            Assignee: Tyler Hobbs
            Priority: Minor
              Labels: gossip
             Fix For: 2.0.4
         Attachments: des-score-graph.png, des.sample.15min.csv, get-scores.py

To quote the conf:
{noformat}
# if set greater than zero and read_repair_chance is < 1.0, this will allow
# 'pinning' of replicas to hosts in order to increase cache capacity.
# The badness threshold will control how much worse the pinned host has to be
# before the dynamic snitch will prefer other replicas over it. This is
# expressed as a double which represents a percentage. Thus, a value of
# 0.2 means Cassandra would continue to prefer the static snitch values
# until the pinned host was 20% worse than the fastest.
dynamic_snitch_badness_threshold: 0.1
{noformat}

An assumption of this feature is that scores will vary by less than dynamic_snitch_badness_threshold during normal operations. Attached is the result of polling a node for the scores of 6 different endpoints at 1 Hz for 15 minutes. The endpoints to sample were chosen with `nodetool getendpoints` for a row that is known to get reads. The node was acting as a coordinator for a few hundred requests per second, so it should have had sufficient data to work with. Other traces on a second cluster have produced similar results.
* The scores vary by far more than I would expect, as shown by the difficulty of seeing anything useful in that graph.
* The difference between the best and next-best score is usually above 10% (the default dynamic_snitch_badness_threshold).

Neither ClientRequest nor ColumnFamily metrics showed wild changes during the data-gathering period.

Attachments:
* jython script cobbled together to gather the data (based on work from Maki Watanabe on the mailing list a while back)
* csv of DES scores for 6 endpoints, polled about once a second
* attempt at making a graph
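The pinning rule quoted from the conf can be modeled in a few lines. This is a simplified sketch with hypothetical names, not the actual DynamicEndpointSnitch code: keep preferring the pinned (static-snitch) replica until its latency score is more than the threshold's fraction worse than the fastest replica's score (lower score = faster). The report's point is that when scores routinely jitter by more than the threshold, this comparison keeps flipping and pinning breaks down.

```python
# Simplified badness-threshold check (illustrative, not Cassandra's code).
# Lower score means faster. With threshold 0.1, the pinned replica is kept
# until it is more than 10% worse than the fastest replica.

def keep_pinned(pinned_score, scores, threshold=0.1):
    """Return True while the pinned replica should still be preferred."""
    fastest = min(scores)
    return pinned_score <= fastest * (1 + threshold)
```

With threshold 0.1, a pinned replica scoring 1.0 is kept while the fastest alternative scores 0.95 (only ~5% better), but dropped once an alternative reaches 0.8, since 1.0 is then more than 10% worse.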
[jira] [Comment Edited] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning
[ https://issues.apache.org/jira/browse/CASSANDRA-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849949#comment-13849949 ]

Robert Coli edited comment on CASSANDRA-6465 at 12/17/13 12:52 AM:
-------------------------------------------------------------------

Are we sure that this mechanism of producing cache pinning is worth the complexity here, especially given speculative execution?

was (Author: rcoli):
Are we sure that this mechanism of producing cache pinning is worth the complexity here, especially given speculative retry?
[jira] [Created] (CASSANDRA-6494) Cassandra refuses to restart due to a corrupted commit log.
Shao-Chuan Wang created CASSANDRA-6494:
------------------------------------------

             Summary: Cassandra refuses to restart due to a corrupted commit log.
                 Key: CASSANDRA-6494
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6494
             Project: Cassandra
          Issue Type: Bug
            Reporter: Shao-Chuan Wang

This is running on our production server. Please advise how to address this issue. Thank you!

{noformat}
 INFO 02:46:58,879 Finished reading /mnt/cassandra/commitlog/CommitLog-3-1386069222785.log
ERROR 02:46:58,879 Exception encountered during startup
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411)
	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407)
	... 8 more
Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
	at org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
	at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
	at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
	at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192)
	at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
	at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
	at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
	at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323)
	at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195)
	at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196)
	at org.apache.cassandra.db.Memtable.put(Memtable.java:160)
	at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection
	at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411)
	at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273)
	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
	at
{noformat}
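The opaque hex string in the exception is the raw bytes of a column name, so decoding it identifies which schema element the commit log replay tripped on. A quick check (Python 3):

```python
# Decode the hex token from the exception message to see which
# column name the commit log replay choked on.
name = bytes.fromhex("706167655f74616773").decode("ascii")
print(name)  # -> page_tags
```

So the replayed mutation references a column `page_tags` that the current schema no longer defines as a collection, which points at a schema/commit-log mismatch rather than on-disk corruption of the log file itself.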
[jira] [Created] (CASSANDRA-6496) Endless L0 LCS compactions
Nikolai Grigoriev created CASSANDRA-6496:
--------------------------------------------

             Summary: Endless L0 LCS compactions
                 Key: CASSANDRA-6496
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6496
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Cassandra 2.0.3, Linux, 6 nodes, 5 disks per node
            Reporter: Nikolai Grigoriev

I have first described the problem here: http://stackoverflow.com/questions/20589324/cassandra-2-0-3-endless-compactions-with-no-traffic

I think I have really abused my system with the traffic (mix of reads, heavy updates and some deletes). Now, after stopping the traffic, I see compactions that have been going on endlessly for over 4 days.

For a specific CF I have about 4700 sstable data files right now. The compaction estimates are logged as [3312, 4, 0, 0, 0, 0, 0, 0, 0]. sstable_size_in_mb=256. 3214 files are about 256MB (+/- a few megs); the other files are smaller or much smaller than that. No sstables are larger than 256MB.

What I observe is that LCS picks 32 sstables from L0 and compacts them into 32 sstables of approximately the same size. So, what my system has been doing for the last 4 days (no traffic at all) is compacting groups of 32 sstables into groups of 32 sstables without any changes. Seems like a bug to me, regardless of what I did to get the system into this state...
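The no-progress behavior reported above follows from simple arithmetic: with no overlapping data to merge away, a compaction whose output is re-chunked at the same sstable_size_in_mb cap emits as many files as it consumed. A rough back-of-envelope sketch (not Cassandra code, and the zero-overlap assumption is the reporter's no-traffic scenario):

```python
# Rough arithmetic for the situation described above (not Cassandra code):
# 32 full-size L0 sstables are compacted together and the output is
# re-split at the same sstable_size_in_mb cap, so if nothing is merged
# away, 32 files in yields 32 files out -- no progress.
import math

sstable_size_mb = 256
inputs = 32
total_input_mb = inputs * sstable_size_mb            # ~8 GB of input
overlap_savings = 0.0                                # no traffic: nothing to merge away
output_mb = total_input_mb * (1 - overlap_savings)
outputs = math.ceil(output_mb / sstable_size_mb)
print(outputs)  # -> 32: same file count as before the compaction
```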
[jira] [Created] (CASSANDRA-6495) LOCAL_SERIAL uses QUORUM consistency level to validate expected columns
sankalp kohli created CASSANDRA-6495:
----------------------------------------

             Summary: LOCAL_SERIAL uses QUORUM consistency level to validate expected columns
                 Key: CASSANDRA-6495
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6495
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: sankalp kohli
            Priority: Minor

If CAS is done at LOCAL_SERIAL consistency level, only the nodes from the local data center should be involved. Here we are using QUORUM to validate the expected columns, which requires nodes from more than one DC. We should use LOCAL_QUORUM here when CAS is done at LOCAL_SERIAL.

Also, if we have 2 DCs with DC1:3,DC2:3, a single DC down will cause CAS to not work even for LOCAL_SERIAL.
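The DC1:3,DC2:3 failure case follows directly from quorum arithmetic. A minimal sketch (the replicas//2+1 formula is the standard quorum definition, not the ticket's code):

```python
# Quorum arithmetic for the DC1:3,DC2:3 example above (a sketch, not Cassandra code).
def quorum(replicas: int) -> int:
    """Standard majority quorum: floor(replicas/2) + 1."""
    return replicas // 2 + 1

rf = {"DC1": 3, "DC2": 3}
global_quorum = quorum(sum(rf.values()))  # 4 of 6: must span both DCs
local_quorum = quorum(rf["DC1"])          # 2 of 3: satisfiable within one DC

# With DC2 down, only DC1's 3 replicas are reachable:
reachable = rf["DC1"]
print(reachable >= global_quorum)  # -> False: QUORUM-based validation fails
print(reachable >= local_quorum)   # -> True: LOCAL_QUORUM would still succeed
```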
[jira] [Commented] (CASSANDRA-6496) Endless L0 LCS compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850048#comment-13850048 ]

Jonathan Ellis commented on CASSANDRA-6496:
-------------------------------------------

Can you enable debug logging in o.a.c.db.compaction and post a log sample?
[jira] [Updated] (CASSANDRA-6496) Endless L0 LCS compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolai Grigoriev updated CASSANDRA-6496:
-----------------------------------------
    Attachment: system.log.gz
                system.log.1.gz

Attaching the logs. I have enabled the compaction logging this morning to get a slight idea of what was going on.
[jira] [Commented] (CASSANDRA-5201) Cassandra/Hadoop does not support current Hadoop releases
[ https://issues.apache.org/jira/browse/CASSANDRA-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850067#comment-13850067 ]

Jonathan Ellis commented on CASSANDRA-5201:
-------------------------------------------

[~michaelsembwever]? [~jeromatron]?

Cassandra/Hadoop does not support current Hadoop releases
---------------------------------------------------------

                 Key: CASSANDRA-5201
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5201
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 1.2.0
            Reporter: Brian Jeltema
            Assignee: Dave Brosius
         Attachments: 5201_a.txt, hadoopCompat.patch

Using Hadoop 0.22.0 with Cassandra results in the stack trace below. It appears that version 0.21+ changed org.apache.hadoop.mapreduce.JobContext from a class to an interface.

{noformat}
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
	at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:103)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:357)
	at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1045)
	at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1042)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1042)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1062)
	at MyHadoopApp.run(MyHadoopApp.java:163)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
	at MyHadoopApp.main(MyHadoopApp.java:82)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
{noformat}
[jira] [Commented] (CASSANDRA-2915) Lucene based Secondary Indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850175#comment-13850175 ]

Matt Stump commented on CASSANDRA-2915:
---------------------------------------

Given that the read-before-write issues still stand for non-numeric fields (as of 4.6), are Lucene based secondary indexes still something we want committed in the near term? Do we want to wait until incremental update/stacked segments are available for all field types? Additionally, Lucene, even for near-realtime search, still imposes a delay between when a row is added and when it is query-able, which would differ from existing behavior; is this something that we can live with?

Lucene based Secondary Indexes
------------------------------

                 Key: CASSANDRA-2915
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2915
             Project: Cassandra
          Issue Type: New Feature
          Components: Core
            Reporter: T Jake Luciani
              Labels: secondary_index

Secondary indexes (of type KEYS) suffer from a number of limitations in their current form:

- Multiple IndexClauses only work when there is a subset of rows under the highest clause
- One new column family is created per index; this means 10 new CFs for 10 secondary indexes

This ticket will use the Lucene library to implement secondary indexes as one index per CF, and utilize the Lucene query engine to handle multiple index clauses. Also, by using Lucene we get a highly optimized file format.

There are a few parallels we can draw between Cassandra and Lucene. Lucene indexes segments in memory then flushes them to disk, so we can sync our memtable flushes to Lucene flushes. Lucene also has optimize(), which correlates to our compaction process, so these can be sync'd as well. We will also need to correlate column validators to Lucene tokenizers so the data can be stored properly. The big win is that, once this is done, we can perform complex queries within a column, like wildcard searches.

The downside of this approach is that we will need to read before write, since documents in Lucene are written as complete documents. For random workloads with lots of indexed columns, this means we need to read the document from the index, update it and write it back.
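The read-before-write pattern described above can be illustrated with a toy sketch. This is not Lucene's API, just a hypothetical in-memory "index" showing why updating one field of a fully-stored document forces a fetch, merge, and rewrite of the whole document:

```python
# Toy illustration of the read-before-write cost described above
# (hypothetical in-memory "index"; Lucene stores complete documents per row).
index = {}  # doc_id -> stored document (all indexed fields)

def update_field(doc_id, field, value):
    doc = dict(index.get(doc_id, {}))  # READ the complete document first...
    doc[field] = value                 # ...modify just one field...
    index[doc_id] = doc                # ...then WRITE the whole document back.

update_field("row1", "city", "Oslo")
update_field("row1", "name", "Ada")  # second update re-reads and rewrites everything
print(index["row1"])  # -> {'city': 'Oslo', 'name': 'Ada'}
```

Each single-field update pays for a full document round trip, which is the cost being weighed in the comment above for workloads with many indexed columns.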
[jira] [Created] (CASSANDRA-6497) Iterable CqlPagingRecordReader
Luca Rosellini created CASSANDRA-6497:
-----------------------------------------

             Summary: Iterable CqlPagingRecordReader
                 Key: CASSANDRA-6497
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6497
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
            Reporter: Luca Rosellini
            Priority: Minor
             Fix For: 2.1
         Attachments: iterable-CqlPagingRecordReader.diff

The current CqlPagingRecordReader implementation provides a non-standard way of iterating over the underlying {{rowIterator}}. It would be nice to have an Iterable CqlPagingRecordReader like the one proposed in the attached diff.
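The general idea of the improvement, wrapping a Hadoop-style next()/get-current record reader behind a standard iteration protocol, can be sketched generically. The reader interface below is hypothetical (the real change is the attached diff against CqlPagingRecordReader):

```python
# Generic sketch of making a next()/current-style record reader iterable
# (hypothetical reader protocol; the actual change is the attached diff).
class IterableReader:
    def __init__(self, reader):
        self._reader = reader

    def __iter__(self):
        # Drive the reader's Hadoop-style API from an ordinary for-loop.
        while self._reader.next_key_value():
            yield self._reader.current_key(), self._reader.current_value()

class FakeReader:
    """Stand-in reader over a list of (key, value) pairs."""
    def __init__(self, rows):
        self._rows = iter(rows)
        self._cur = None

    def next_key_value(self):
        self._cur = next(self._rows, None)
        return self._cur is not None

    def current_key(self):
        return self._cur[0]

    def current_value(self):
        return self._cur[1]

rows = [("k1", 1), ("k2", 2)]
print(list(IterableReader(FakeReader(rows))))  # -> [('k1', 1), ('k2', 2)]
```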
[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader
[ https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Rosellini updated CASSANDRA-6497:
--------------------------------------
    Attachment: iterable-CqlPagingRecordReader.diff
[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader
[ https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Rosellini updated CASSANDRA-6497:
--------------------------------------
    Attachment: (was: iterable-CqlPagingRecordReader.diff)
[jira] [Updated] (CASSANDRA-6497) Iterable CqlPagingRecordReader
[ https://issues.apache.org/jira/browse/CASSANDRA-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Rosellini updated CASSANDRA-6497:
--------------------------------------
    Attachment: iterable-CqlPagingRecordReader.diff