[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12311:
------------------------------------
    Attachment: 12311-trunk-v2.txt

I've attached an updated patch that removes the new exception and instead adds a new {{reason}} field within {{ReadFailureException}} that can be used to indicate why the read query failed.

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
>                 Key: CASSANDRA-12311
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Geoffrey Yu
>            Assignee: Geoffrey Yu
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 12311-trunk-v2.txt, 12311-trunk.txt
>
> Right now if a data node fails to perform a read because it ran into a
> {{TombstoneOverwhelmingException}}, it only responds back to the coordinator
> node with a generic failure. Under this scheme, the coordinator won't be able
> to know exactly why the request failed, and subsequently the client only gets
> a generic {{ReadFailureException}}. It would be useful to inform the client
> that their read failed because we read too many tombstones. We should have
> the data nodes reply with a failure type so the coordinator can pass this
> information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
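For illustration, the per-replica failure-reason idea can be sketched as follows; this is a hypothetical Python stand-in (the enum values, function name, and message format are made up for the sketch), not the actual patch, which modifies the Java coordinator and the native protocol:

```python
from enum import IntEnum

class RequestFailureReason(IntEnum):
    """Illustrative reason codes a replica could attach to its failure response."""
    UNKNOWN = 0
    READ_TOO_MANY_TOMBSTONES = 1

def summarize_read_failure(failures_by_endpoint):
    """Build a client-facing message from per-replica failure reasons.

    failures_by_endpoint: dict mapping replica address -> RequestFailureReason.
    """
    tombstone_replicas = [ep for ep, reason in failures_by_endpoint.items()
                          if reason is RequestFailureReason.READ_TOO_MANY_TOMBSTONES]
    if tombstone_replicas:
        return ("Read failed: scanned too many tombstones on "
                + ", ".join(sorted(tombstone_replicas)))
    return "Read failed for an unknown reason"
```

With a per-endpoint reason instead of a bare failure count, the coordinator can tell the client *why* replicas failed rather than only *how many* did.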
[jira] [Created] (CASSANDRA-12349) Adding some new features to cqlsh
vin01 created CASSANDRA-12349:
------------------------------

             Summary: Adding some new features to cqlsh
                 Key: CASSANDRA-12349
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12349
             Project: Cassandra
          Issue Type: New Feature
         Environment: All
            Reporter: vin01
            Priority: Minor

I would like to have the following features in cqlsh, and I have made a patch to enable them as well:

1. Aliases.
2. Safe mode (prompt on delete, update, truncate, and drop if safe_mode is true).
3. Press q to exit.

It's also shared here -> https://github.com/vineet01/cassandra/blob/trunk/new_features.txt

Example for aliases :-

cassandra@cqlsh> show ;
ALIASES HOST SESSION VERSION
cassandra@cqlsh> show ALIASES ;
Aliases :> {'dk': 'desc keyspaces;', 'sl': 'select * from'}

Now if you type dk and press Tab, it will auto-complete it to "desc keyspaces;".

Adding an alias from the shell :-

cassandra@cqlsh> alias slu=select * from login.user ;
Alias added successfully - slu : select * from login.user ;
cassandra@cqlsh> show ALIASES ;
Aliases :> {'slu': 'select * from login.user ;', 'dk': 'desc keyspaces;', 'sl': 'select * from'}
cassandra@cqlsh> slu
Expanded alias to> select * from login.user ;
 username | blacklisted | lastlogin | password

Adding an alias directly in the file :- aliases will be kept in the same cqlshrc file.

[aliases]
dk = desc keyspaces;
sl = select * from
sle = select * from login.user ;

Now if we type just "sle" it will auto-complete the rest of it and show the next options.

Example of safe mode :-

cassandra@cqlsh> truncate login.user ;
Are you sure you want to do this? (y/n) > n
Not performing any action.
cassandra@cqlsh> update login.user set password=null;
Are you sure you want to do this? (y/n) >
Not performing any action.

Initial commit :- https://github.com/vineet01/cassandra/commit/0bfce2ccfc610021a74a1f82ed24aa63e1b72fec
Current version :- https://github.com/vineet01/cassandra/blob/trunk/bin/cqlsh.py

Please review and suggest any improvements.
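The alias behaviour described above can be sketched in a few lines of Python; this is illustrative only (the function names are made up and this is independent of the actual cqlsh.py patch), using {{configparser}} to read an {{[aliases]}} section in cqlshrc format:

```python
import configparser

def load_aliases(cqlshrc_text):
    """Parse an [aliases] section from cqlshrc-style config text into a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cqlshrc_text)
    return dict(parser["aliases"]) if parser.has_section("aliases") else {}

def expand(line, aliases):
    """If the first word of the input is an alias, expand it to its full statement."""
    head, _, rest = line.strip().partition(" ")
    if head in aliases:
        return (aliases[head] + " " + rest).strip()
    return line.strip()

aliases = load_aliases("[aliases]\ndk = desc keyspaces;\nsl = select * from\n")
```

Typing {{sl login.user ;}} would then expand to {{select * from login.user ;}}, matching the "Expanded alias to>" behaviour shown in the examples.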
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9259:
--------------------------------
    Attachment: no_vnodes.jpg
                256_vnodes.jpg

> Bulk Reading from Cassandra
> ---------------------------
>
>                 Key: CASSANDRA-9259
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Compaction, CQL, Local Write-Read Paths, Streaming and Messaging, Testing
>            Reporter: Brian Hess
>            Assignee: Stefania
>            Priority: Critical
>             Fix For: 3.x
>
>         Attachments: 256_vnodes.jpg, before_after.jpg, bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
> This ticket is following on from the 2015 NGCC. It is designed to be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra: that is, a path optimized to grab a large portion of the data for a table (potentially all of it). This is a core element in the Spark integration with Cassandra, and the speed at which Cassandra can deliver bulk data to Spark is limiting the performance of Spark-plus-Cassandra operations. This is especially important as Cassandra will (likely) leverage Spark for internal operations (for example CASSANDRA-8234).
> The core CQL to consider is the following:
>     SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps considering the primary range of a node without vnodes, for example). This query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk operations via Spark (or other processing frameworks - ETL, etc.). There are a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered. First, we consider a new "Streaming Compaction" approach. The key observation here is that a bulk read from Cassandra is a lot like a major compaction, though instead of outputting a new SSTable we would output CQL rows to a stream/socket/etc. This would be similar to a CompactionTask, but would strip out some unnecessary things (e.g., some of the indexing). Predicates and projections could also be encapsulated in this new "StreamingCompactionTask", for example.
> Another approach would be an alternate storage format. For example, we might employ Parquet (just as an example) to store the same data as in the primary Cassandra storage (aka SSTables). This is akin to Global Indexes (an alternate storage of the same data optimized for a particular query). Then, Cassandra can choose to leverage this alternate storage for particular CQL queries (e.g., range scans).
> These are just 2 suggestions to get the conversation going.
> One thing to note is that it will be useful to have this storage segregated by token range, so that when you extract via these mechanisms you do not get replication-factor numbers of copies of the data. That will certainly be an issue for some Spark operations (e.g., counting). Thus, we will want per-token-range storage (even for single disks), so this will likely leverage CASSANDRA-6696 (though we'll want to also consider the single-disk case).
> It is also worth discussing what the success criteria are here. It is unlikely to be as fast as EDW or HDFS performance (though that is still a good goal), but being within some percentage of that performance should be counted as success - for example, taking 2x as long as bulk operations on HDFS with a similar node count/size/etc.
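The {{Token(partitionKey) > X AND Token(partitionKey) <= Y}} scan described above is what Spark-style clients parallelize today. A minimal sketch of splitting the full Murmur3 ring into contiguous {{(X, Y]}} subranges (illustrative Python; real connectors split along the cluster's actual token ownership rather than evenly):

```python
MIN_TOKEN = -2**63      # Murmur3Partitioner token bounds
MAX_TOKEN = 2**63 - 1

def split_token_range(start, end, n):
    """Split (start, end] into n contiguous subranges, each usable as
    SELECT ... WHERE Token(partitionKey) > X AND Token(partitionKey) <= Y."""
    width = (end - start) // n
    bounds = [start + i * width for i in range(n)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))
```

Issuing one such query per subrange, in parallel, is what pushes the 50K-100K rows/sec per-query ceiling that this ticket wants to raise.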
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9259:
--------------------------------
    Attachment: (was: 256_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9259:
--------------------------------
    Attachment: (was: no_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9259:
--------------------------------
    Attachment: before_after.jpg
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-9259:
--------------------------------
    Attachment: (was: before_after.jpg)
[jira] [Updated] (CASSANDRA-11521) Implement streaming for bulk read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-11521:
---------------------------------
    Status: Patch Available  (was: In Progress)

> Implement streaming for bulk read requests
> ------------------------------------------
>
>                 Key: CASSANDRA-11521
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Stefania
>            Assignee: Stefania
>              Labels: client-impacting, protocolv5
>             Fix For: 3.x
>
>         Attachments: final-patch-jfr-profiles-1.zip
>
> Allow clients to stream data from a C* host, bypassing the coordination layer and eliminating the need to query individual pages one by one.
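To show what "querying individual pages one by one" means on the client side, here is an illustrative Python sketch of the synchronous paging loop this sub-task aims to replace (the fetch callback and paging-state shape are hypothetical, not a real driver API):

```python
def paged_read(fetch_page, page_size):
    """Synchronous paging: one round trip per page.

    fetch_page(page_size, state) returns (rows, next_state);
    next_state is None when the result set is exhausted."""
    state = None
    while True:
        rows, state = fetch_page(page_size, state)
        yield from rows
        if state is None:
            return
```

With streaming, the server instead pushes pages continuously over a dedicated connection, removing the per-page round trip from the critical path.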
[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests
[ https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400395#comment-15400395 ]

Stefania commented on CASSANDRA-11521:
--------------------------------------

The patch is ready for review:

||trunk|[patch|https://github.com/stef1927/cassandra/commits/11521]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11521-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11521-dtest/]|

There are also the [driver patch|https://github.com/stef1927/java-driver/commits/11521] and the [spark connector patch|https://github.com/stef1927/spark-cassandra-connector/commits/11521]. For these I plan to create tickets in the respective projects once the native protocol changes have been finalized. A [design document|https://docs.google.com/document/d/1YqKGSU1P8EJIfMrO--29VaSoCy5mUu-ePfAiIOLsY7o/edit] is also available.

The Spark benchmark results are available in [this comment|https://issues.apache.org/jira/browse/CASSANDRA-9259?focusedCommentId=15400394&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400394] on the parent ticket. The final patch is slightly better than the proof of concept, and the asynchronous paging mechanism significantly outperforms the existing mechanism for large data sets.
I've also repeated some cstar_perf tests to rule out performance regressions for ordinary queries, which are not on the optimized path:

* Single partition queries (the default cassandra-stress read command) at CL.LOCAL_ONE (the cassandra-stress default): [first run|http://cstar.datastax.com/graph?command=one_job=8b1f1d54-53e4-11e6-85af-0256e416528f=99th_latency=2_read=1_aggregates=true=0=276.98=0=22.33], [second run with the revisions' order swapped|http://cstar.datastax.com/graph?command=one_job=1abd3fe4-545e-11e6-8920-0256e416528f=op_rate=2_read=1_aggregates=true=0=277.86=0=243951.4], and [an older run|http://cstar.datastax.com/graph?command=one_job=16cef080-53dc-11e6-b967-0256e416528f=op_rate=2_read=1_aggregates=true=0=282.92=0=249571.3] done before enabling token-aware routing in cassandra-stress.
* Single partition queries at CL.ALL: [single run|http://cstar.datastax.com/graph?command=one_job=e2155410-5462-11e6-9cd7-0256e416528f=op_rate=2_read=1_aggregates=true=0=277.75=0=246123.9]

There is a gap of 3.6K ops/second without token-aware routing and of 1K at CL=ALL; with token-aware routing the patch is instead 1K ops/second faster. These differences must arise from the refactoring in the select statement. They are very small differences - the test error seems to be around 0.5K - but I can look into it further if there are concerns.
[jira] [Commented] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400394#comment-15400394 ]

Stefania commented on CASSANDRA-9259:
-------------------------------------

Now that CASSANDRA-11521 is ready for review, I've repeated the Spark benchmark defined by CASSANDRA-11542 using schema 1:

{code}
CREATE TABLE ks.schema1 (id TEXT, timestamp BIGINT, val1 INT, val2 INT, val3 INT, val4 INT, val5 INT, val6 INT, val7 INT, val8 INT, val9 INT, val10 INT, PRIMARY KEY (id, timestamp))
{code}

and schema 3:

{code}
CREATE TABLE ks.schema3 (id TEXT, timestamp BIGINT, data TEXT, PRIMARY KEY (id, timestamp))
{code}

The benchmark measures how many seconds it takes to count rows and to find the maximum of two columns for each row, where rows are retrieved either via Spark RDDs or Data Frames (DFs). The most significant difference between the RDD and DF tests is that in the DF tests only the two columns of interest to the Spark query are retrieved, whilst in the RDD tests the entire data set is retrieved. The data is stored either in Cassandra or in HDFS as CSV or Parquet files. More details on the benchmark are available [here|https://issues.apache.org/jira/browse/CASSANDRA-11542?focusedCommentId=15249213&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15249213] and the code is available [here|https://github.com/stef1927/spark-load-perf].

Here is the comparison with the results of the benchmark that was run on 6th May with 15M rows, as described in [this comment|https://issues.apache.org/jira/browse/CASSANDRA-11542?focusedCommentId=15273884&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15273884]. We can see that the final results are consistent with the proof of concept, which was presented at the Cassandra NGCC conference.

!before_after.jpg!
* C* old: TRUNK with no optimizations (at c662d876b95d67a911dfe549d8a0d38ee6fbb904), and the Spark Connector without SPARKC-383
* C* POC: the [proof-of-concept patch|https://github.com/stef1927/cassandra/commits/9259], and the Spark Connector with an [earlier version|https://github.com/stef1927/spark-cassandra-connector/commits/9259] of SPARKC-383
* C* async: the CASSANDRA-11521 patch, with results delivered to the client via the new asynchronous paging mechanism
* C* sync: the CASSANDRA-11521 patch, with results delivered to the client via the existing synchronous paging mechanism

Here are the results run over several incremental data sets of 15M, 30M, 60M and 120M rows with 256 vnodes:

!256_vnodes.jpg!

Below are the results run over the same incremental data sets of 15M, 30M, 60M and 120M rows without vnodes:

!no_vnodes.jpg!

The raw data is attached: [^spark_benchmark_raw_data.zip].

h5. Conclusions

* Overall the duration of the 15M-row test was improved by 65% (from about 40 to 14 seconds) for schema 1 and by 56% (from 23 to 10 seconds) for schema 3.
* The new asynchronous paging mechanism significantly outperforms the existing mechanism with large data sets. For example, for schema 1 and 120M rows, it is approximately 30% faster. To achieve this, however, the driver must reserve a connection per asynchronous paging request; sharing connections degrades performance significantly and makes it no better than the existing mechanism.
* CSV still outperforms C* for the schema 1 RDD tests. However, for the DF tests and the schema 3 RDD tests, C* is now on par with or faster than CSV. This indicates that the number of cells in CQL rows continues to impact performance.
* Parquet is in a league of its own due to its efficient columnar format. It should however be noted that it may be storing the row count in metadata; a more meaningful benchmark could have been obtained had we excluded the row count from the time measurements.
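The headline percentages above follow directly from the quoted durations; a quick arithmetic check (plain Python, with percentages truncated to whole numbers as in the comment):

```python
def improvement_pct(before_s, after_s):
    """Percentage reduction in benchmark duration, truncated to a whole percent."""
    return int((before_s - after_s) / before_s * 100)

# schema 1: ~40 s -> 14 s; schema 3: 23 s -> 10 s
schema1 = improvement_pct(40, 14)   # 65
schema3 = improvement_pct(23, 10)   # 56
```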
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: no_vnodes.jpg, 256_vnodes.jpg

> Bulk Reading from Cassandra
> ---------------------------
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
> Issue Type: New Feature
> Components: Compaction, CQL, Local Write-Read Paths, Streaming and Messaging, Testing
> Reporter: Brian Hess
> Assignee: Stefania
> Priority: Critical
> Fix For: 3.x
> Attachments: 256_vnodes.jpg, before_after.jpg, bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
> This ticket follows on from the 2015 NGCC and is intended as a place for discussing and designing an approach to bulk reading.
> The goal is a bulk reading path for Cassandra: a path optimized to grab a large portion of the data for a table (potentially all of it). This is a core element of the Spark integration with Cassandra, and the speed at which Cassandra can deliver bulk data to Spark limits the performance of Spark-plus-Cassandra operations. This is especially important as Cassandra will (likely) leverage Spark for internal operations (for example CASSANDRA-8234).
> The core CQL to consider is the following:
> SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps the primary range of a node without vnodes, for example). This query pushes 50K-100K rows/sec, which is not very fast for bulk operations via Spark (or other processing frameworks - ETL, etc.). There are a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered. First, we consider a new "Streaming Compaction" approach. The key observation here is that a bulk read from Cassandra is a lot like a major compaction, though instead of outputting a new SSTable we would output CQL rows to a stream/socket/etc. This would be similar to a CompactionTask, but would strip out some unnecessary work (e.g., some of the indexing). Predicates and projections could also be encapsulated in this new "StreamingCompactionTask", for example.
> Another approach would be an alternate storage format. For example, we might employ Parquet (just as an example) to store the same data as in the primary Cassandra storage (aka SSTables). This is akin to Global Indexes (an alternate storage of the same data optimized for a particular query). Cassandra could then choose to leverage this alternate storage for particular CQL queries (e.g., range scans).
> These are just two suggestions to get the conversation going.
> One thing to note: it will be useful to have this storage segregated by token range, so that extraction via these mechanisms does not return replication-factor copies of the data. That would certainly be an issue for some Spark operations (e.g., counting). Thus, we will want per-token-range storage (even for single disks), so this will likely leverage CASSANDRA-6696 (though we'll want to also consider the single-disk case).
> It is also worth discussing what the success criteria are. It is unlikely to be as fast as EDW or HDFS performance (though that is still a good goal), but being within some percentage of that performance should count as success - for example, 2x as long as doing bulk operations on HDFS with a similar node count/size/etc.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
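The per-token-range scan the ticket describes can be sketched concretely. Below is a minimal, illustrative Python sketch (not from the ticket): it splits the full Murmur3 token space into contiguous `(X, Y]` subranges and renders the ticket's example query for each; `myKs.myTable` and the columns are just the ticket's placeholder names.

```python
# Sketch: split the Murmur3 token space into contiguous (start, end] subranges
# so a bulk reader (e.g., one Spark partition per range) can scan the table in
# parallel without overlap or replication-factor duplicates.

MIN_TOKEN = -2**63       # Murmur3Partitioner minimum token
MAX_TOKEN = 2**63 - 1    # Murmur3Partitioner maximum token

def split_token_ranges(n_splits):
    """Yield n_splits contiguous (start, end] token ranges covering the ring."""
    span = (MAX_TOKEN - MIN_TOKEN) // n_splits
    start = MIN_TOKEN
    for i in range(n_splits):
        # Force the last range to end exactly at MAX_TOKEN to absorb rounding.
        end = MAX_TOKEN if i == n_splits - 1 else start + span
        yield (start, end)
        start = end

def range_query(start, end):
    """Render the ticket's example query for one token subrange."""
    return ("SELECT a, b, c FROM myKs.myTable "
            f"WHERE Token(partitionKey) > {start} AND Token(partitionKey) <= {end}")

queries = [range_query(s, e) for s, e in split_token_ranges(4)]
```

In practice a driver would take the split boundaries from the cluster's ring metadata rather than an even arithmetic split, but the non-overlapping `> start AND <= end` shape is the same.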
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: (was: 256_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: (was: no_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: no_vnodes.jpg, before_after.jpg, 256_vnodes.jpg
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: (was: no_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: (was: before_after.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: (was: 256_vnodes.jpg)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-9259: Attachment: no_vnodes.jpg, before_after.jpg, 256_vnodes.jpg, spark_benchmark_raw_data.zip
[jira] [Commented] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400361#comment-15400361 ] Ariel Weisberg commented on CASSANDRA-12228: I think there is also an issue, which I am still working on nailing down, where memory accounting either releases memory pinned by memtables too early or is simply off by too much, causing the heap to fill up with memtables that are waiting for the post-flush executor. In a heap dump I can see the heap going to double the limit, and things are falling apart server side. > Write performance regression in 3.x vs 3.0 > -- > > Key: CASSANDRA-12228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12228 > Project: Cassandra > Issue Type: Bug >Reporter: T Jake Luciani >Assignee: Marcus Eriksson >Priority: Minor > Fix For: 3.9 > > > I've been tracking down a performance issue in trunk vs the cassandra-3.0 branch, and I think I've found it. CASSANDRA-6696 changed the default memtable flush writers to 1 vs the min of 2 in cassandra-3.0. > I don't see any technical reason for this and we should add back the min of 2 sstable flushers per disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
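For reference, the setting under discussion lives in cassandra.yaml; restoring the pre-CASSANDRA-6696 minimum would look like the fragment below (the value 2 reflects the ticket's proposal, not a general tuning recommendation):

```yaml
# cassandra.yaml - number of memtable flush writer threads (per data directory).
# The ticket argues for restoring a minimum of 2 instead of the default of 1.
memtable_flush_writers: 2
```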
[jira] [Created] (CASSANDRA-12348) Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest
Joel Knighton created CASSANDRA-12348: - Summary: Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest Key: CASSANDRA-12348 URL: https://issues.apache.org/jira/browse/CASSANDRA-12348 Project: Cassandra Issue Type: Bug Components: Testing Reporter: Joel Knighton Fix For: 3.x Example failures: http://cassci.datastax.com/job/cassandra-3.9_testall/45/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/basicTest2/ http://cassci.datastax.com/job/cassandra-3.9_testall/37/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/ http://cassci.datastax.com/job/trunk_testall/1054/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/ All failures look like the test is finding more files than expected after a rewrite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11687) dtest failure in rebuild_test.TestRebuild.simple_rebuild_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400348#comment-15400348 ] Joel Knighton commented on CASSANDRA-11687: --- I commented on the dtest PR to remove the known_failure annotation - after this PR was merged, I saw a new failure on this test as part of the [daily 3.9 dtest run|http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/rebuild_test/TestRebuild/simple_rebuild_test/]. This also reproduces fairly easily on my local machine; not sure why the multiplexer didn't hit it. I think the PRed fix is still prone to races. > dtest failure in rebuild_test.TestRebuild.simple_rebuild_test > - > > Key: CASSANDRA-11687 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11687 > Project: Cassandra > Issue Type: Bug >Reporter: Russ Hatch >Assignee: Yuki Morishita > Labels: dtest > > single failure on most recent run (3.0 no-vnode) > {noformat} > concurrent rebuild should not be allowed, but one rebuild command should have > succeeded. > {noformat} > http://cassci.datastax.com/job/cassandra-3.0_novnode_dtest/217/testReport/rebuild_test/TestRebuild/simple_rebuild_test > Failed on CassCI build cassandra-3.0_novnode_dtest #217 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12008) Make decommission operations resumable
[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400338#comment-15400338 ] Paulo Motta commented on CASSANDRA-12008: - Thanks for the update! See follow-up below: bq. I've added a new strategy, please let me know what do you think about it. Instead of building a {{streamedRangesPerEndpoints Map}} manually, maybe it's better to modify {{getStreamedRanges}} to take a description and keyspace as arguments and return a {{Map }} by querying {{"SELECT * FROM system.streamed_ranges WHERE operation = ? AND keyspace_name = ?"}}. This way you can query {{getStreamedRanges}} directly to filter already-transferred ranges when iterating {{rangesToStreamByKeyspace}}. bq. So instead we will obtain StreamSession from StreamTransferTask.getSession() when each StreamTransferTask is complete i.e when StreamStateStore.handleStreamEvent is invoked. All these means that we are going to only pass its responsible keyspace. I think we can simplify that: instead of adding {{transferTasks}} to {{SessionCompleteEvent}} we can simply add the session description and {{transferredRangesPerKeyspace}}, and that's all we will need to populate the streamed ranges on {{StreamStateStore}}. A minor nit is that the transferred ranges are always being overridden on {{addTransferRanges}}, while you should append to the existing set if it's already present on {{transferredRangesPerKeyspace}}. bq. Don't know if there's some problem with current implementation or there's something weird in the set-up, but it skips twice the same range: this is for different keyspaces, so you should add the keyspace name in the log message so it's not confusing. bq. I think it's the set-up itself since StorageService.getChangedRangesForLeaving is also returning the same range twice that's probably for the same reason as above, so maybe it would be a good idea to add the keyspace name in that log as well. 
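The filtering step suggested above can be sketched as follows. This is an illustrative Python sketch only: the function names, the tuple-based range model, and the dict shapes are assumptions standing in for the Java Map-based API under review, not the patch itself.

```python
# Sketch of the review suggestion: look up the ranges already recorded as
# transferred for this (operation, keyspace) pair - conceptually the query
# "SELECT * FROM system.streamed_ranges WHERE operation = ? AND keyspace_name = ?" -
# and subtract them from the ranges still to stream, so that a resumed
# decommission skips work that already completed.

def get_streamed_ranges(streamed_table, operation, keyspace):
    """Stand-in for querying system.streamed_ranges by operation and keyspace.

    streamed_table is modeled as a set of (operation, keyspace, range) rows,
    where a range is a (start_token, end_token) tuple.
    """
    return {r for (op, ks, r) in streamed_table
            if op == operation and ks == keyspace}

def remaining_ranges(ranges_to_stream_by_keyspace, streamed_table, operation):
    """Filter already-transferred ranges out of each keyspace's to-stream list."""
    remaining = {}
    for keyspace, ranges in ranges_to_stream_by_keyspace.items():
        done = get_streamed_ranges(streamed_table, operation, keyspace)
        remaining[keyspace] = [r for r in ranges if r not in done]
    return remaining
```

The point of keying the lookup by keyspace as well as operation is exactly the confusion noted in the comment: the "same" range can legitimately appear once per keyspace, so the completed-work check must be scoped per keyspace.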
> Make decommission operations resumable > -- > > Key: CASSANDRA-12008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12008 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Tom van der Woerdt >Assignee: Kaide Mu >Priority: Minor > > We're dealing with large data sets (multiple terabytes per node) and > sometimes we need to add or remove nodes. These operations are very dependent > on the entire cluster being up, so while we're joining a new node (which > sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases > something does. > It would be great if the ability to retry streams was implemented. > Example to illustrate the problem : > {code} > 03:18 PM ~ $ nodetool decommission > error: Stream failed > -- StackTrace -- > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486) > at > 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274) > at java.lang.Thread.run(Thread.java:745) > 08:04 PM ~ $ nodetool decommission > nodetool: Unsupported operation: Node in LEAVING state; wait for status to > become normal or restart > See 'nodetool help' or 'nodetool help '. > {code} > Streaming failed, probably due to load : > {code} > ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - > [Stream #] Streaming error occurred > java.net.SocketTimeoutException: null > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) > ~[na:1.8.0_77] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.8.0_77] >
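Paulo's nit above (transferred ranges being overwritten in {{addTransferRanges}}) can be illustrated with a small standalone sketch of the append-instead-of-replace pattern. This is a hedged approximation, not the real {{StreamStateStore}} code: the class name is hypothetical and plain Strings stand in for Cassandra's Range and keyspace types.

```java
import java.util.*;

// Hypothetical sketch: accumulate transferred ranges per keyspace instead of
// clobbering the previous set on every call.
public class TransferredRanges {
    private final Map<String, Set<String>> transferredRangesPerKeyspace = new HashMap<>();

    public void addTransferRanges(String keyspace, Collection<String> ranges) {
        // computeIfAbsent keeps any existing set, so repeated calls append
        transferredRangesPerKeyspace
            .computeIfAbsent(keyspace, k -> new HashSet<>())
            .addAll(ranges);
    }

    public Set<String> get(String keyspace) {
        return transferredRangesPerKeyspace.getOrDefault(keyspace, Collections.emptySet());
    }

    public static void main(String[] args) {
        TransferredRanges tr = new TransferredRanges();
        tr.addTransferRanges("ks1", Arrays.asList("(0,100]"));
        tr.addTransferRanges("ks1", Arrays.asList("(100,200]"));
        // Both ranges survive; a plain map.put here would have kept only the second
        System.out.println(tr.get("ks1").size()); // 2
    }
}
```

The same merge idiom applies whether the value type is a set of ranges or a more structured collection; the key point is that a second stream session for the same keyspace must extend, not replace, the recorded ranges.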
[jira] [Comment Edited] (CASSANDRA-12251) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400277#comment-15400277 ] Joel Knighton edited comment on CASSANDRA-12251 at 7/30/16 12:10 AM: - This looks to me like in drain/StorageServiceShutdownHook, the schema stage is not shutdown, so if you have a task submitted to that executor that doesn't execute until after the postflush executor has been terminated in drain, you'll hit this exception. This is a C* fix for sure. was (Author: jkni): This looks to me like in drain/StorageShutdownHook, the schema stage is not shutdown, so if you have a task submitted to that executor that doesn't execute until after the postflush executor has been terminated in drain, you'll hit this exception. This is a C* fix for sure. > dtest failure in > upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test > -- > > Key: CASSANDRA-12251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12251 > Project: Cassandra > Issue Type: Bug >Reporter: Philip Thompson >Assignee: Alex Petrov > Labels: dtest > Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, > node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.8_dtest_upgrade/1/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x/whole_list_conditional_test > Failed on CassCI build cassandra-3.8_dtest_upgrade #1 > Relevant error in logs is > {code} > Unexpected error in node1 log, error: > ERROR [InternalResponseStage:2] 2016-07-20 04:58:45,876 > CassandraDaemon.java:217 - Exception in thread > Thread[InternalResponseStage:2,5,main] > java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut > down > at > org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:61) > ~[apache-cassandra-3.7.jar:3.7] > at > 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) > ~[na:1.8.0_51] > at > org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:165) > ~[apache-cassandra-3.7.jar:3.7] > at > java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) > ~[na:1.8.0_51] > at > org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:842) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:822) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:891) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$1(SchemaKeyspace.java:279) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace$$Lambda$200/1129213153.accept(Unknown > Source) ~[na:na] > at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_51] > at > org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:279) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1271) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1253) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) > ~[apache-cassandra-3.7.jar:3.7] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_51] > at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_51] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51] > {code} > This is on a mixed 3.0.8, 3.8-tentative cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12251) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400277#comment-15400277 ] Joel Knighton commented on CASSANDRA-12251: --- This looks to me like in drain/StorageShutdownHook, the schema stage is not shutdown, so if you have a task submitted to that executor that doesn't execute until after the postflush executor has been terminated in drain, you'll hit this exception. This is a C* fix for sure. > dtest failure in > upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test > -- > > Key: CASSANDRA-12251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12251 > Project: Cassandra > Issue Type: Bug >Reporter: Philip Thompson >Assignee: Alex Petrov > Labels: dtest > Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, > node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.8_dtest_upgrade/1/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x/whole_list_conditional_test > Failed on CassCI build cassandra-3.8_dtest_upgrade #1 > Relevant error in logs is > {code} > Unexpected error in node1 log, error: > ERROR [InternalResponseStage:2] 2016-07-20 04:58:45,876 > CassandraDaemon.java:217 - Exception in thread > Thread[InternalResponseStage:2,5,main] > java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut > down > at > org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:61) > ~[apache-cassandra-3.7.jar:3.7] > at > java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) > ~[na:1.8.0_51] > at > org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:165) > 
~[apache-cassandra-3.7.jar:3.7] > at > java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) > ~[na:1.8.0_51] > at > org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:842) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:822) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:891) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$1(SchemaKeyspace.java:279) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace$$Lambda$200/1129213153.accept(Unknown > Source) ~[na:na] > at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_51] > at > org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:279) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1271) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1253) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53) > ~[apache-cassandra-3.7.jar:3.7] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) > ~[apache-cassandra-3.7.jar:3.7] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_51] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_51] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51] > 
{code} > This is on a mixed 3.0.8, 3.8-tentative cluster -- This message was sent by Atlassian JIRA (v6.3.4#6332)
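The failure mode Joel describes comes down to a basic {{java.util.concurrent}} rule: a task submitted to an executor that has already been shut down is rejected with {{RejectedExecutionException}}, which is exactly what the schema-stage task hits once drain has terminated the postflush executor. A minimal standalone reproduction, with illustrative executor names rather than the real Cassandra stage classes:

```java
import java.util.concurrent.*;

// Sketch of the drain ordering bug: a "postflush" executor is terminated
// first, then a straggling task (here standing in for the schema-stage flush)
// tries to submit to it and is rejected.
public class DrainOrdering {
    public static boolean submitAfterShutdown() {
        ExecutorService postFlush = Executors.newSingleThreadExecutor();
        postFlush.shutdown();                    // drain terminates postflush first
        try {
            postFlush.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            postFlush.submit(() -> {});          // late task arrives after termination
            return false;
        } catch (RejectedExecutionException e) {
            return true;                         // "ThreadPoolExecutor has shut down"
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdown()); // true
    }
}
```

The fix direction suggested in the comment follows directly: shut down the stage that *produces* the flush tasks (and wait for it to quiesce) before terminating the executor that consumes them.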
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400265#comment-15400265 ] Geoffrey Yu commented on CASSANDRA-12311: - Sure, that sounds reasonable to me. I'll make the changes and update the patch. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 4.x > > Attachments: 12311-trunk.txt > > > Right now if a data node fails to perform a read because it ran into a > {{TombstoneOverwhelmingException}}, it only responds back to the coordinator > node with a generic failure. Under this scheme, the coordinator won't be able > to know exactly why the request failed and subsequently the client only gets > a generic {{ReadFailureException}}. It would be useful to inform the client > that their read failed because we read too many tombstones. We should have > the data nodes reply with a failure type so the coordinator can pass this > information to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400145#comment-15400145 ] Russ Hatch commented on CASSANDRA-12092: [~Stefania] any idea what could be causing this test to fail (intermittently) but on the same key when it does? > dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters > > > Key: CASSANDRA-12092 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12092 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: Russ Hatch > Labels: dtest > Attachments: node1.log, node2.log, node3.log > > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters > Failed on CassCI build cassandra-2.1_dtest #484 > {code} > Standard Error > Traceback (most recent call last): > File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run > valid_fcn(v) > File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in > validate_counters > check_all_sessions(s, n, c) > File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in > check_all_sessions > "value of %s at key %d, instead got these values: %s" % (write_nodes, > val, n, results) > AssertionError: Failed to read value from sufficient number of nodes, > required 2 nodes to have a counter value of 1 at key 200, instead got these > values: [0, 0, 1] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400133#comment-15400133 ] Ariel Weisberg edited comment on CASSANDRA-12228 at 7/29/16 10:28 PM: -- There are some remaining issues with thread pool sizes. See [CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086]. You still can't get multiple threads if you have a single disk due to TPE not spinning up additional threads if you are using an unbounded queue. Seems like this would be a good place to address the related issue. I also don't think this is minor; it's pretty crippling for performance and you can't work around it by changing configuration values. was (Author: aweisberg): There are some remaining issues with thread pool sizes. See [CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086]. You still can't get multiple threads if you have a single disk. Seems like this would be a good place to address the related issue. I also don't think this is minor; it's pretty crippling for performance and you can't work around it by changing configuration values. > Write performance regression in 3.x vs 3.0 > -- > > Key: CASSANDRA-12228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12228 > Project: Cassandra > Issue Type: Bug >Reporter: T Jake Luciani >Assignee: Marcus Eriksson >Priority: Minor > Fix For: 3.9 > > > I've been tracking down a performance issue in trunk vs cassandra-3.0 branch. > I think I've found it. CASSANDRA-6696 changed the default memtable flush > default to 1 vs the min of 2 in cassandra-3.0. > I don't see any technical reason for this and we should add back the min of 2 > sstable flushers per disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400133#comment-15400133 ] Ariel Weisberg commented on CASSANDRA-12228: There are some remaining issues with thread pool sizes. See [CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086]. You still can't get multiple threads if you have a single disk. Seems like this would be a good place to address the related issue. I also don't think this is minor; it's pretty crippling for performance and you can't work around it by changing configuration values. > Write performance regression in 3.x vs 3.0 > -- > > Key: CASSANDRA-12228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12228 > Project: Cassandra > Issue Type: Bug >Reporter: T Jake Luciani >Assignee: Marcus Eriksson >Priority: Minor > Fix For: 3.9 > > > I've been tracking down a performance issue in trunk vs cassandra-3.0 branch. > I think I've found it. CASSANDRA-6696 changed the default memtable flush > default to 1 vs the min of 2 in cassandra-3.0. > I don't see any technical reason for this and we should add back the min of 2 > sstable flushers per disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
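The thread-pool behavior Ariel refers to is standard {{ThreadPoolExecutor}} semantics, not anything Cassandra-specific: with an unbounded work queue, the pool never grows past {{corePoolSize}}, because extra threads (up to {{maximumPoolSize}}) are only created when the queue rejects an offered task, and an unbounded queue never does. A standalone demonstration (illustrative sizes, not Cassandra's flush executor):

```java
import java.util.concurrent.*;

// With an unbounded LinkedBlockingQueue, TPE stays at corePoolSize (1) even
// though maximumPoolSize is 4 and eight tasks are pending.
public class TpeQueueDemo {
    public static int poolSizeUnderLoad() {
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
            1, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>()); // unbounded queue
        CountDownLatch block = new CountDownLatch(1);
        for (int i = 0; i < 8; i++)
            tpe.execute(() -> {
                try { block.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
        int size = tpe.getPoolSize(); // queued tasks never trigger extra threads
        block.countDown();
        tpe.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println(poolSizeUnderLoad()); // 1
    }
}
```

This is why bumping the configured maximum alone cannot work around the regression: as long as the flush queue is unbounded, the extra threads are never spun up.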
[jira] [Resolved] (CASSANDRA-12071) Regression in flushing throughput under load after CASSANDRA-6696
[ https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg resolved CASSANDRA-12071. Resolution: Fixed Going to bring this up in CASSANDRA-12228 since this was resolved and released already. > Regression in flushing throughput under load after CASSANDRA-6696 > - > > Key: CASSANDRA-12071 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12071 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Ariel Weisberg >Assignee: Marcus Eriksson > Fix For: 3.8 > > > The way flushing used to work is that a ColumnFamilyStore could have multiple > Memtables flushing at once and multiple ColumnFamilyStores could flush at the > same time. The way it works now there can be only a single flush of any > ColumnFamilyStore & Memtable running in the C* process, and the number of > threads applied to that flush is bounded by the number of disks in JBOD. > This works ok most of the time but occasionally flushing will be a little > slower and ingest will outstrip it and then block on available memory. At > this point you see several second stalls that cause timeouts. > This is a problem for reasonable configurations that don't use JBOD but have > access to a fast disk that can handle some IO queuing (RAID, SSD). > You can reproduce on beefy hardware (12 cores 24 threads, 64 gigs of RAM, > SSD) if you unthrottle compaction or set it to something like 64 > megabytes/second and run with 8 compaction threads and stress with the > default write workload and a reasonable number of threads. I tested with 96. > It started happening after about 60 gigabytes of data was loaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12346) Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views
[ https://issues.apache.org/jira/browse/CASSANDRA-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400097#comment-15400097 ] Jason Brown commented on CASSANDRA-12346: - Here is the branch for an implementation of hyparview: ||hyparview|| |[branch|https://github.com/jasobrown/cassandra/tree/broadcast_hyparview]| |[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-broadcast_hyparview-dtest/]| |[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-broadcast_hyparview-testall/]| I've documented it rather extensively, so I hope that aids the reviewer. There are still a couple of (very) minor things left to clean up, including the simulator (implemented as a long test), but that should not hinder any review, I think. > Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views > > > Key: CASSANDRA-12346 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12346 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Jason Brown >Assignee: Jason Brown > Labels: gossip > > A [Peer Sampling > Service|infoscience.epfl.ch/record/83409/files/neg--1184036295all.pdf] is a > module that provides a partial view of a cluster to dependent modules. A > node's partial view, combined with all other nodes' partial views, combines to > create a fully-connected mesh over the cluster. This way, a given node does > not need to have direct connections to every other node in the cluster, and > can be much more efficient in terms of resource management as well as > information dissemination. Peer Sampling Services by their nature must be > self-healing and self-balancing to maintain the fully-connected mesh. > I propose we use an algorithm based on > [HyParView|http://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf], which is a concrete > algorithm for a Peer Sampling Service. HyParView has a clearly defined > protocol, and is reasonably simple to implement. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12347) Gossip 2.0 - broadcast tree for data dissemination
[ https://issues.apache.org/jira/browse/CASSANDRA-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-12347: -- Description: Description: A broadcast tree (spanning tree) allows an originating node to efficiently send out updates to all of the peers in the cluster by constructing a balanced, self-healing tree based upon the view it gets from the peer sampling service (CASSANDRA-12346). I propose we use an algorithm based on the [Thicket paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which describes a dynamic, self-healing broadcast tree. When a given node needs to send out a message, it dynamically builds a tree for each node in the cluster; thus giving us a unique tree for every node in the cluster (a tree rooted at every cluster node). The trees, of course, would be reusable until the cluster configuration changes or failures are detected (by the mechanism described in the paper). Additionally, Thicket includes a mechanism for load-balancing the trees such that nodes spread out the work amongst themselves. was: Description: A broadcast tree (spanning tree) allows an originating node to efficiently send out updates to all of the peers in the cluster by constructing a balanced, self-healing tree based upon the view it gets from the peer sampling service (CASSANDRA-12346). I propose we use an algorithm based on the [Thicket paper|www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which describes a dynamic, self-healing broadcast tree. When a given node needs to send out a message, it dynamically builds a tree for each node in the cluster; thus giving us a unique tree for every node in the cluster (a tree rooted at every cluster node). The trees, of course, would be reusable until the cluster configuration changes or failures are detected (by the mechanism described in the paper). 
Additionally, Thicket includes a mechanism for load-balancing the trees such that nodes spread out the work amongst themselves. > Gossip 2.0 - broadcast tree for data dissemination > -- > > Key: CASSANDRA-12347 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12347 > Project: Cassandra > Issue Type: Improvement >Reporter: Jason Brown > > Description: A broadcast tree (spanning tree) allows an originating node to > efficiently send out updates to all of the peers in the cluster by > constructing a balanced, self-healing tree based upon the view it gets from > the peer sampling service (CASSANDRA-12346). > I propose we use an algorithm based on the [Thicket > paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which > describes a dynamic, self-healing broadcast tree. When a given node needs to > send out a message, it dynamically builds a tree for each node in the > cluster; thus giving us a unique tree for every node in the cluster (a tree > rooted at every cluster node). The trees, of course, would be reusable until > the cluster configuration changes or failures are detected (by the mechanism > described in the paper). Additionally, Thicket includes a mechanism for > load-balancing the trees such that nodes spread out the work amongst > themselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12345) Gossip 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-12345: Description: This is a parent ticket covering changes to the dissemination aspects of the current gossip subsystem. (Changes to the actual data being exchanged by the current gossip (the cluster metadata) will be handled elsewhere, but the current primary ticket covering that work is CASSANDRA-9667.) This work requires several components, which largely need to be completed in this order: - a peer sampling service to create partial cluster views (CASSANDRA-12346). This forms the basis of the next two components - a broadcast tree, which creates dynamic spanning trees given the partial views provided by the peer sampling service (CASSANDRA-12347) - an anti-entropy component, which is similar to the pair-wise exchange and reconciliation of the existing gossip implementation (CASSANDRA-???) These base components (primarily the broadcast and anti-entropy) can allow for generic consumers to simply and effectively share a body of data across an entire cluster. The most obvious consumer will be a cluster metadata component, which can replace the existing gossip system, but also other components like CASSANDRA-12106. was: This is a parent ticket covering changes to the dissemination aspects of the current gossip subsystem. (Changes to the actual data being exchanged by the current gossip (the cluster metadata) will be handled elsewhere, but the current primary ticket covering that work is CASSANDRA-9667.) This work requires several components, which largely need to be completed in this order: - a peer sampling service to create partial cluster views (CASSANDRA-12346). This forms the basis of the next two components - a broadcast tree, which creates dynamic spanning trees given the partial views provided by the peer sampling service (CASSANDRA-???) 
- an anti-entropy component, which is similar to the pair-wise exchange and reconciliation of the existing gossip implementation (CASSANDRA-???) These base components (primarily the broadcast and anti-entropy) can allow for generic consumers to simply and effectively share a body of data across an entire cluster. The most obvious consumer will be a cluster metadata component, which can replace the existing gossip system, but also other components like CASSANDRA-12106. > Gossip 2.0 > -- > > Key: CASSANDRA-12345 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12345 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Jason Brown >Assignee: Jason Brown > Labels: gossip > > This is a parent ticket covering changes to the dissemination aspects of the > current gossip subsystem. (Changes to the actual data being exchanged by the > current gossip (the cluster metadata) will be handled elsewhere, but the > current primary ticket covering that work is CASSANDRA-9667.) > This work requires several components, which largely need to be completed in > this order: > - a peer sampling service to create partial cluster views (CASSANDRA-12346). > This forms the basis of the next two components > - a broadcast tree, which creates dynamic spanning trees given the partial > views provided by the peer sampling service (CASSANDRA-12347) > - an anti-entropy component, which is similar to the pair-wise exchange and > reconciliation of the existing gossip implementation (CASSANDRA-???) > These base components (primarily the broadcast and anti-entropy) can allow > for generic consumers to simply and effectively share a body of data across > an entire cluster. The most obvious consumer will be a cluster metadata > component, which can replace the existing gossip system, but also other > components like CASSANDRA-12106. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12347) Gossip 2.0 - broadcast tree for data dissemination
Jason Brown created CASSANDRA-12347: --- Summary: Gossip 2.0 - broadcast tree for data dissemination Key: CASSANDRA-12347 URL: https://issues.apache.org/jira/browse/CASSANDRA-12347 Project: Cassandra Issue Type: Improvement Reporter: Jason Brown Description: A broadcast tree (spanning tree) allows an originating node to efficiently send out updates to all of the peers in the cluster by constructing a balanced, self-healing tree based upon the view it gets from the peer sampling service (CASSANDRA-12346). I propose we use an algorithm based on the [Thicket paper|www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which describes a dynamic, self-healing broadcast tree. When a given node needs to send out a message, it dynamically builds a tree for each node in the cluster; thus giving us a unique tree for every node in the cluster (a tree rooted at every cluster node). The trees, of course, would be reusable until the cluster configuration changes or failures are detected (by the mechanism described in the paper). Additionally, Thicket includes a mechanism for load-balancing the trees such that nodes spread out the work amongst themselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12345) Gossip 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-12345: Description: This is a parent ticket covering changes to the dissemination aspects of the current gossip subsystem. (Changes to the actual data being exchanged by the current gossip (the cluster metadata) will be handled elsewhere, but the current primary ticket covering that work is CASSANDRA-9667.) This work requires several components, which largely need to be completed in this order: - a peer sampling service to create partial cluster views (CASSANDRA-12346). This forms the basis of the next two components - a broadcast tree, which creates dynamic spanning trees given the partial views provided by the peer sampling service (CASSANDRA-???) - an anti-entropy component, which is similar to the pair-wise exchange and reconciliation of the existing gossip implementation (CASSANDRA-???) These base components (primarily the broadcast and anti-entropy) can allow for generic consumers to simply and effectively share a body of data across an entire cluster. The most obvious consumer will be a cluster metadata component, which can replace the existing gossip system, but also other components like CASSANDRA-12106. was: This is a parent ticket covering changes to the dissemination aspects of the current gossip subsystem. (Changes to the actual data being exchanged by the current gossip (the cluster metadata) will be handled elsewhere, but the current primary ticket covering that work is CASSANDRA-9667.) This work requires several components, which largely need to be completed in this order: - a peer sampling service to create partial cluster views (CASSANDRA-). This forms the basis of the next two components - a broadcast tree, which creates dynamic spanning trees given the partial views provided by the peer sampling service (CASSANDRA-???) 
- an anti-entropy component, which is similar to the pair-wise exchange and reconciliation of the existing gossip implementation (CASSANDRA-???) These base components (primarily the broadcast and anti-entropy) can allow for generic consumers to simply and effectively share a body of data across an entire cluster. The most obvious consumer will be a cluster metadata component, which can replace the existing gossip system, but also other components like CASSANDRA-12106. > Gossip 2.0 > -- > > Key: CASSANDRA-12345 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12345 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Jason Brown >Assignee: Jason Brown > Labels: gossip > > This is a parent ticket covering changes to the dissemination aspects of the > current gossip subsystem. (Changes to the actual data being exchanged by the > current gossip (the cluster metadata) will be handled elsewhere, but the > current primary ticket covering that work is CASSANDRA-9667.) > This work requires several components, which largely need to be completed in > this order: > - a peer sampling service to create partial cluster views (CASSANDRA-12346). > This forms the basis of the next two components > - a broadcast tree, which creates dynamic spanning trees given the partial > views provided by the peer sampling service (CASSANDRA-???) > - an anti-entropy component, which is similar to the pair-wise exchange and > reconciliation of the existing gossip implementation (CASSANDRA-???) > These base components (primarily the broadcast and anti-entropy) can allow > for generic consumers to simply and effectively share a body of data across > an entire cluster. The most obvious consumer will be a cluster metadata > component, which can replace the existing gossip system, but also other > components like CASSANDRA-12106. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (CASSANDRA-12071) Regression in flushing throughput under load after CASSANDRA-6696
[ https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg reopened CASSANDRA-12071: Turns out this is still a problem because the usage of an unbounded LinkedBlockingQueue means that TPE will never actually spin up additional threads. You can see that this was necessary for CASSANDRA-2178 as well. > Regression in flushing throughput under load after CASSANDRA-6696 > - > > Key: CASSANDRA-12071 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12071 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Ariel Weisberg >Assignee: Marcus Eriksson > Fix For: 3.8 > > > The way flushing used to work is that a ColumnFamilyStore could have multiple > Memtables flushing at once and multiple ColumnFamilyStores could flush at the > same time. The way it works now there can be only a single flush of any > ColumnFamilyStore & Memtable running in the C* process, and the number of > threads applied to that flush is bounded by the number of disks in JBOD. > This works ok most of the time but occasionally flushing will be a little > slower and ingest will outstrip it and then block on available memory. At > this point you see several second stalls that cause timeouts. > This is a problem for reasonable configurations that don't use JBOD but have > access to a fast disk that can handle some IO queuing (RAID, SSD). > You can reproduce on beefy hardware (12 cores 24 threads, 64 gigs of RAM, > SSD) if you unthrottle compaction or set it to something like 64 > megabytes/second and run with 8 compaction threads and stress with the > default write workload and a reasonable number of threads. I tested with 96. > It started happening after about 60 gigabytes of data was loaded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
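For context, the queue behavior referenced above can be shown in isolation (a standalone sketch, not Cassandra code): {{java.util.concurrent.ThreadPoolExecutor}} only creates threads beyond its core size when the work queue rejects an offered task, and an unbounded {{LinkedBlockingQueue}} accepts every offer, so the pool never grows past its core size no matter how deep the backlog gets.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        // core=1, max=4: one might expect up to 4 threads under load.
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
                1, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch gate = new CountDownLatch(1);
        for (int i = 0; i < 8; i++) {
            // 8 blocked tasks: 1 runs, 7 just sit in the unbounded queue.
            tpe.execute(() -> {
                try { gate.await(); } catch (InterruptedException ignored) { }
            });
        }
        Thread.sleep(200); // give the pool a chance to (not) expand
        System.out.println("pool size = " + tpe.getPoolSize()); // prints 1, not 4
        gate.countDown();
        tpe.shutdown();
    }
}
```

With an unbounded queue, maximumPoolSize is effectively dead configuration; growing past the core size requires either a bounded queue or a core size equal to the maximum.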
[jira] [Created] (CASSANDRA-12346) Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views
Jason Brown created CASSANDRA-12346: --- Summary: Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views Key: CASSANDRA-12346 URL: https://issues.apache.org/jira/browse/CASSANDRA-12346 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jason Brown Assignee: Jason Brown A [Peer Sampling Service|infoscience.epfl.ch/record/83409/files/neg--1184036295all.pdf] is a module that provides a partial view of a cluster to dependent modules. A node's partial view, combined with all other nodes' partial views, creates a fully-connected mesh over the cluster. This way, a given node does not need to have direct connections to every other node in the cluster, and can be much more efficient in terms of resource management as well as information dissemination. Peer Sampling Services by their nature must be self-healing and self-balancing to maintain the fully-connected mesh. I propose we use an algorithm based on [HyParView|http://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf], which is a concrete algorithm for a Peer Sampling Service. HyParView has a clearly defined protocol, and is reasonably simple to implement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
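As a rough sketch of the view-maintenance idea (illustrative constants and names, not HyParView's actual protocol, which also exchanges join and shuffle messages between peers): each node keeps a small bounded active view of connected peers plus a larger passive view of backup candidates, and demotes a random active member whenever a new peer must be admitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class PartialViewSketch {
    static final int ACTIVE_SIZE = 4; // illustrative fan-out, not HyParView's sizing formula

    final Set<String> active = new LinkedHashSet<>();  // peers with open connections
    final Set<String> passive = new LinkedHashSet<>(); // backup peers for self-healing
    final Random random = new Random();

    // Admit a peer to the active view, demoting a random member if the view is full.
    void addToActiveView(String peer) {
        if (active.contains(peer)) return;
        if (active.size() >= ACTIVE_SIZE) {
            List<String> members = new ArrayList<>(active);
            String demoted = members.get(random.nextInt(members.size()));
            active.remove(demoted);
            passive.add(demoted); // kept around as a candidate for repairing the mesh
        }
        active.add(peer);
    }

    public static void main(String[] args) {
        PartialViewSketch view = new PartialViewSketch();
        for (int i = 0; i < 6; i++) {
            view.addToActiveView("peer" + i);
        }
        // Only ACTIVE_SIZE peers stay active; the overflow lands in the passive view.
        System.out.println("active=" + view.active.size() + " passive=" + view.passive.size());
    }
}
```

When an active peer fails, a replacement is promoted from the passive view; that is how the fully-connected mesh heals without any single node ever tracking the whole cluster.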
[jira] [Commented] (CASSANDRA-12340) dtest failure in upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400057#comment-15400057 ] Joel Knighton commented on CASSANDRA-12340: --- This is almost certainly because when we're stopping 2.0.17 for the upgrade, the StorageServiceShutdownHook doesn't stop compactions and a compaction is attempting to schedule a task on the miscellaneous tasks executor after that executor has been stopped. This isn't going to get fixed in 2.0 (or likely 2.1 for that matter), so the best option is to just ignore this error in the test if possible. > dtest failure in > upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test > --- > > Key: CASSANDRA-12340 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12340 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: DS Test Eng > Labels: dtest > Attachments: node1.log, node2.log, node3.log > > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/249/testReport/upgrade_supercolumns_test/TestSCUpgrade/upgrade_with_counters_test > {code} > Standard Output > Unexpected error in node3 log, error: > ERROR [CompactionExecutor:1] 2016-07-28 15:34:19,533 CassandraDaemon.java > (line 191) Exception in thread Thread[CompactionExecutor:1,1,main] > java.util.concurrent.RejectedExecutionException: Task > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@5fb8b2bf > rejected from > org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1ae851ad[Terminated, > pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8] > at > java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047) > at > java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) > at > java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) > at > 
java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) > at > java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632) > at > org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:65) > at > org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:976) > at > org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:383) > at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:348) > at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:342) > at > org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:245) > at > org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:995) > at > org.apache.cassandra.db.compaction.CompactionTask.replaceCompactedSSTables(CompactionTask.java:270) > at > org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230) > at > org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12345) Gossip 2.0
Jason Brown created CASSANDRA-12345: --- Summary: Gossip 2.0 Key: CASSANDRA-12345 URL: https://issues.apache.org/jira/browse/CASSANDRA-12345 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jason Brown Assignee: Jason Brown This is a parent ticket covering changes to the dissemination aspects of the current gossip subsystem. (Changes to the actual data being exchanged by the current gossip (the cluster metadata) will be handled elsewhere, but the current primary ticket covering that work is CASSANDRA-9667.) This work requires several components, which largely need to be completed in this order: - a peer sampling service to create partial cluster views (CASSANDRA-). This forms the basis of the next two components - a broadcast tree, which creates dynamic spanning trees given the partial views provided by the peer sampling service (CASSANDRA-???) - an anti-entropy component, which is similar to the pair-wise exchange and reconciliation of the existing gossip implementation (CASSANDRA-???) These base components (primarily the broadcast and anti-entropy) can allow for generic consumers to simply and effectively share a body of data across an entire cluster. The most obvious consumer will be a cluster metadata component, which can replace the existing gossip system, but also other components like CASSANDRA-12106. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1547#comment-1547 ] Russ Hatch commented on CASSANDRA-12092: Interestingly both failures shown on this ticket happened at 'key 200', which looks to be writing at quorum, reading back at one, with serial unset. For a random-looking failure, the same key of 200 is a suspicious value. > dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters > > > Key: CASSANDRA-12092 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12092 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: Russ Hatch > Labels: dtest > Attachments: node1.log, node2.log, node3.log > > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters > Failed on CassCI build cassandra-2.1_dtest #484 > {code} > Standard Error > Traceback (most recent call last): > File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run > valid_fcn(v) > File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in > validate_counters > check_all_sessions(s, n, c) > File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in > check_all_sessions > "value of %s at key %d, instead got these values: %s" % (write_nodes, > val, n, results) > AssertionError: Failed to read value from sufficient number of nodes, > required 2 nodes to have a counter value of 1 at key 200, instead got these > values: [0, 0, 1] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1533#comment-1533 ] Russ Hatch commented on CASSANDRA-12092: failure from recent multiplex: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/195/testReport/node_1_iter_059.consistency_test/TestAccuracy/test_simple_strategy_counters/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1532#comment-1532 ] Russ Hatch commented on CASSANDRA-12092: 1 failure in 200 iterations. Either the test is bad or there's a bug here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data
[ https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399949#comment-15399949 ] Richard Low commented on CASSANDRA-8523: I'll review the 3.9 version. I'm very much in favour of putting this in 2.2 and 3.0 as this hurts us badly and no doubt others suffer too. > Writes should be sent to a replacement node while it is streaming in data > - > > Key: CASSANDRA-8523 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8523 > Project: Cassandra > Issue Type: Improvement >Reporter: Richard Wagner >Assignee: Paulo Motta > Fix For: 2.1.x > > > In our operations, we make heavy use of replace_address (or > replace_address_first_boot) in order to replace broken nodes. We now realize > that writes are not sent to the replacement nodes while they are in hibernate > state and streaming in data. This runs counter to what our expectations were, > especially since we know that writes ARE sent to nodes when they are > bootstrapped into the ring. > It seems like cassandra should arrange to send writes to a node that is in > the process of replacing another node, just like it does for a nodes that are > bootstraping. I hesitate to phrase this as "we should send writes to a node > in hibernate" because the concept of hibernate may be useful in other > contexts, as per CASSANDRA-8336. Maybe a new state is needed here? > Among other things, the fact that we don't get writes during this period > makes subsequent repairs more expensive, proportional to the number of writes > that we miss (and depending on the amount of data that needs to be streamed > during replacement and the time it may take to rebuild secondary indexes, we > could miss many many hours worth of writes). It also leaves us more exposed > to consistency violations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data
[ https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399941#comment-15399941 ] Paulo Motta commented on CASSANDRA-8523: Thanks for taking a look! Created CASSANDRA-12344 to follow-up with support for this when the replacement node has the same address as the original node. Rebased patch and dtests as well as merged up to 3.0+. All patches and CI results available below: ||2.2||3.0||3.9||trunk||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.9...pauloricardomg:3.9-8523]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]| |[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-testall/lastCompletedBuild/testReport/]| |[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-dtest/lastCompletedBuild/testReport/]| There were some minor merge conflicts 
on 3.0, and a slightly larger conflict on 3.9 due to CASSANDRA-10134, so I did some refactoring in the 3.9+ version to move most of the logic to {{prepareForReplacement}}. Can you take another look [~rlow]? Dtest PR created [here|https://github.com/riptano/cassandra-dtest/pull/1155]. While this is marked an improvement and would theoretically only go to trunk, this limitation is pretty counter-intuitive and probably hurts many users in the wild, and since the changeset is relatively small and self-contained, I think it could be interpreted as a bugfix and perhaps go on 2.2+ or maybe 3.0+. WDYT [~brandon.williams] [~jkni] ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12160) dtest failure in counter_tests.TestCounters.upgrade_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399932#comment-15399932 ] Russ Hatch commented on CASSANDRA-12160: single flap in recent history, going to try a multiplex (x100) here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/196/ > dtest failure in counter_tests.TestCounters.upgrade_test > > > Key: CASSANDRA-12160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12160 > Project: Cassandra > Issue Type: Test >Reporter: Sean McCarthy >Assignee: Russ Hatch > Labels: dtest > Attachments: node1.log, node2.log > > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_dtest/493/testReport/counter_tests/TestCounters/upgrade_test > Failed on CassCI build cassandra-2.1_dtest #493 > {code} > Error Message > 07 Jul 2016 21:07:28 [node1] Missing: ['127.0.0.2.* now UP']: > . > See system.log for remainder > {code} > {code} > Stacktrace > File "/usr/lib/python2.7/unittest/case.py", line 329, in run > testMethod() > File "/home/automaton/cassandra-dtest/counter_tests.py", line 101, in > upgrade_test > rolling_restart() > File "/home/automaton/cassandra-dtest/counter_tests.py", line 96, in > rolling_restart > nodes[i].start(wait_other_notice=True, wait_for_binary_proto=True) > File "/home/automaton/ccm/ccmlib/node.py", line 634, in start > node.watch_log_for_alive(self, from_mark=mark) > File "/home/automaton/ccm/ccmlib/node.py", line 481, in watch_log_for_alive > self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, > filename=filename) > File "/home/automaton/ccm/ccmlib/node.py", line 449, in watch_log_for > raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " > [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + > reads[:50] + ".\nSee {} for remainder".format(filename)) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12160) dtest failure in counter_tests.TestCounters.upgrade_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch reassigned CASSANDRA-12160: -- Assignee: Russ Hatch (was: DS Test Eng) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12344) Forward writes to replacement node with same address during replace
Paulo Motta created CASSANDRA-12344: --- Summary: Forward writes to replacement node with same address during replace Key: CASSANDRA-12344 URL: https://issues.apache.org/jira/browse/CASSANDRA-12344 Project: Cassandra Issue Type: Improvement Components: Coordination, Distributed Metadata Reporter: Paulo Motta CASSANDRA-8523 added support for forwarding writes to a replacement node via a new gossip state {{BOOTSTRAPPING_REPLACE}}. Currently this is limited to replacement nodes with a different address from the original node, because if a replacement node with the same address as a normal endpoint joins gossip with a non-dead state, it will become alive in the Failure Detector and reads will be forwarded to it before the node is ready to serve them. This ticket is to add support for forwarding writes to replacement nodes with the same IP address as the original node. The initial idea is to allow marking a node as unavailable for reads on {{TokenMetadata}}, which will allow a replacement node with the same IP to join gossip without having reads forwarded to it. This will be enabled by CASSANDRA-11559. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9054) Break DatabaseDescriptor up into multiple classes.
[ https://issues.apache.org/jira/browse/CASSANDRA-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399828#comment-15399828 ] Robert Stupp commented on CASSANDRA-9054: - Alright - pushed a couple of commits to the branch to address the review comments and also fix some things. utests + dtests look good now. Latest dtest run has 0 errors, and utests have a couple of timeouts (triggered another run). Worked in all the review comments. Just removed the weird comments in Config + DD. The intention of these comments was to hint contributors to be careful to not introduce new "magic" class dependencies that start up "everything" by accessing DD. > Break DatabaseDescriptor up into multiple classes. > -- > > Key: CASSANDRA-9054 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9054 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeremiah Jordan >Assignee: Robert Stupp > Fix For: 3.x > > > Right now to get at Config stuff you go through DatabaseDescriptor. But when > you instantiate DatabaseDescriptor it actually opens system tables and such, > which triggers commit log replays, and other things if the right flags aren't > set ahead of time. This makes getting at config stuff from tools annoying, > as you have to be very careful about instantiation orders. > It would be nice if we could break DatabaseDescriptor up into multiple > classes, so that getting at config stuff from tools wasn't such a pain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch reassigned CASSANDRA-12092: -- Assignee: Russ Hatch (was: DS Test Eng) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399825#comment-15399825 ] Russ Hatch commented on CASSANDRA-12092: Since this is one isolated flap in recent history, testing with multiplex (200 iterations) here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/195/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399808#comment-15399808 ] Evan Prothro edited comment on CASSANDRA-12336 at 7/29/16 6:23 PM: --- [~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump was (Author: eprothro): [~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump It might help if you explain what you are doing and what exception you are seeing where and when. > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399808#comment-15399808 ] Evan Prothro commented on CASSANDRA-12336: -- [~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump It might help if you explain what you are doing and what exception you are seeing where and when. > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. > Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > 
org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest
[ https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399792#comment-15399792 ] sankalp kohli commented on CASSANDRA-7190: -- Can we commit this to 3.0 or is it too late? > Add schema to snapshot manifest > --- > > Key: CASSANDRA-7190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7190 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jonathan Ellis >Assignee: Alex Petrov >Priority: Minor > Labels: client-impacting, doc-impacting, lhf > Fix For: 3.10 > > > followup from CASSANDRA-6326 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
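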
[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399763#comment-15399763 ] Daniel Antonetti commented on CASSANDRA-12336: -- Looking a little into this, these seem to be bad rows in our database. Is there a way to find them (maybe by adding a logging statement on the primary key) so that we can identify the records, investigate further, and see how many records like this we have? > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. > Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > 
org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10643) Implement compaction for a specific token range
[ https://issues.apache.org/jira/browse/CASSANDRA-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399751#comment-15399751 ] sankalp kohli commented on CASSANDRA-10643: --- [~krummas] Can you please review this? We already run this internally on 2.0 and 2.1. > Implement compaction for a specific token range > --- > > Key: CASSANDRA-10643 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10643 > Project: Cassandra > Issue Type: Improvement > Components: Compaction >Reporter: Vishy Kasar >Assignee: Vishy Kasar > Labels: lcs > Attachments: 10643-trunk-REV01.txt, 10643-trunk-REV02.txt > > > We see repeated cases in production (using LCS) where a small number of users > generate a large number of repeated updates or tombstones. Reading the data of such > users brings large amounts of data into the Java process. Apart from the read > itself being slow for the user, the excessive GC affects other users as well. > Our solution so far is to move from LCS to STCS and back. This takes a long time and > is overkill if the number of outliers is small. For such cases, we can > implement point compaction of a token range. We make nodetool compact > take a starting and ending token range and compact all the SSTables that fall > within that range. We can refuse to compact if the number of sstables is > beyond a max_limit. > Example: > nodetool -st 3948291562518219268 -et 3948291562518219269 compact keyspace > table -- This message was sent by Atlassian JIRA (v6.3.4#6332)
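The selection logic proposed in the ticket, pick only the SSTables whose token span intersects the requested [st, et] range and refuse when too many qualify, can be sketched as follows. All names and types here are illustrative (a Long stands in for a Murmur3 token); this is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: select sstables whose [firstToken, lastToken] span
// overlaps the requested closed range, refusing to compact when the
// selection exceeds a hypothetical max_limit.
public class RangeCompactionSketch {
    static final int MAX_SSTABLES = 100; // hypothetical max_limit

    static class SSTable {
        final long firstToken, lastToken;
        SSTable(long first, long last) { firstToken = first; lastToken = last; }
    }

    // Two closed intervals overlap iff neither lies entirely before the other.
    static List<SSTable> select(List<SSTable> all, long start, long end) {
        List<SSTable> hits = new ArrayList<>();
        for (SSTable s : all)
            if (s.lastToken >= start && s.firstToken <= end)
                hits.add(s);
        if (hits.size() > MAX_SSTABLES)
            throw new IllegalArgumentException("too many sstables in range; refusing to compact");
        return hits;
    }
}
```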
[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399760#comment-15399760 ] Daniel Antonetti commented on CASSANDRA-12336: -- This patch does seem to fix the issue that we saw before. > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. > Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at 
org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12337) dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton reassigned CASSANDRA-12337: - Assignee: Joel Knighton > dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub > -- > > Key: CASSANDRA-12337 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12337 > Project: Cassandra > Issue Type: Bug >Reporter: Joel Knighton >Assignee: Joel Knighton > Labels: dtest > > We have an existing open ticket for this test in [CASSANDRA-11236]. That > ticket is for a Windows failure with a different failure mode. Since the > resolution will likely be different, I've opened this ticket to track the > most recent failure. > example failure: > [http://cassci.datastax.com/job/cassandra-3.9_dtest/20/testReport/junit/scrub_test/TestScrubIndexes/test_standalone_scrub/] > {code} > sstablescrub failed > >> begin captured logging << > dtest: DEBUG: cluster ccm directory: /tmp/dtest-Ilk7GU > dtest: DEBUG: Done setting configuration options: > { 'initial_token': None, > 'num_tokens': '32', > 'phi_convict_threshold': 5, > 'range_request_timeout_in_ms': 1, > 'read_request_timeout_in_ms': 1, > 'request_timeout_in_ms': 1, > 'truncate_request_timeout_in_ms': 1, > 'write_request_timeout_in_ms': 1} > dtest: DEBUG: Checking sstables in > ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] > dtest: DEBUG: Found sstable file mb-1-big-Statistics.db > dtest: DEBUG: Found sstable file mb-1-big-CRC.db > dtest: DEBUG: Found sstable file mb-1-big-Filter.db > dtest: DEBUG: Found sstable file mb-1-big-Summary.db > dtest: DEBUG: Found sstable file mb-1-big-Data.db > dtest: DEBUG: Found sstable file mb-1-big-Index.db > dtest: DEBUG: Found sstable file mb-1-big-TOC.txt > dtest: DEBUG: Checking sstables in > ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] > dtest: DEBUG: Found sstable file mb-1-big-Statistics.db > dtest: DEBUG: Found sstable file 
mb-1-big-CRC.db > dtest: DEBUG: Found sstable file mb-1-big-Filter.db > dtest: DEBUG: Found sstable file mb-1-big-Summary.db > dtest: DEBUG: Found sstable file mb-1-big-Data.db > dtest: DEBUG: Found sstable file mb-1-big-Index.db > dtest: DEBUG: Found sstable file mb-1-big-TOC.txt > dtest: DEBUG: Checking sstables in > ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] > dtest: DEBUG: Found sstable file mb-1-big-Statistics.db > dtest: DEBUG: Found sstable file mb-1-big-CRC.db > dtest: DEBUG: Found sstable file mb-1-big-Filter.db > dtest: DEBUG: Found sstable file mb-1-big-Summary.db > dtest: DEBUG: Found sstable file mb-1-big-Data.db > dtest: DEBUG: Found sstable file mb-1-big-Index.db > dtest: DEBUG: Found sstable file mb-1-big-TOC.txt > dtest: DEBUG: Checking sstables in > ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] > dtest: DEBUG: Found sstable file mb-1-big-Statistics.db > dtest: DEBUG: Found sstable file mb-1-big-CRC.db > dtest: DEBUG: Found sstable file mb-1-big-Filter.db > dtest: DEBUG: Found sstable file mb-1-big-Summary.db > dtest: DEBUG: Found sstable file mb-1-big-Data.db > dtest: DEBUG: Found sstable file mb-1-big-Index.db > dtest: DEBUG: Found sstable file mb-1-big-TOC.txt > dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub > dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot > pre-scrub-1469677957710 > Scrubbing > BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db') > (0.317KiB) > Scrub of > BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db') > complete; looks like all 0 rows were tombstoned > dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub > dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot > pre-scrub-1469677962057 > Scrubbing > 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db') > (0.176KiB) > Scrub of > BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db') > complete; looks like all 0 rows were tombstoned > dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub > dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot > pre-scrub-1469677966308 > Scrubbing > BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db') > (0.178KiB) > Scrub of > BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db') >
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399643#comment-15399643 ] Russ Hatch commented on CASSANDRA-10848: I've had similar difficulty trying to get a test error to repro with 500 iterations. But the CI results still stand, so I'm not sure how we repro and fix the test issue. > Upgrade paging dtests involving deletion flap on CassCI > --- > > Key: CASSANDRA-10848 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10848 > Project: Cassandra > Issue Type: Bug >Reporter: Jim Witschey >Assignee: DS Test Eng > Labels: dtest > Fix For: 3.0.x, 3.x > > > A number of dtests in the {{upgrade_tests.paging_tests}} that involve > deletion flap with the following error: > {code} > Requested pages were not delivered before timeout. > {code} > This may just be an effect of CASSANDRA-10730, but it's worth having a look > at separately. Here are some examples of tests flapping in this way: -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12337) dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-12337: -- Description: We have an existing open ticket for this test in [CASSANDRA-11236]. That ticket is for a Windows failure with a different failure mode. Since the resolution will likely be different, I've opened this ticket to track the most recent failure. example failure: [http://cassci.datastax.com/job/cassandra-3.9_dtest/20/testReport/junit/scrub_test/TestScrubIndexes/test_standalone_scrub/] {code} sstablescrub failed >> begin captured logging << dtest: DEBUG: cluster ccm directory: /tmp/dtest-Ilk7GU dtest: DEBUG: Done setting configuration options: { 'initial_token': None, 'num_tokens': '32', 'phi_convict_threshold': 5, 'range_request_timeout_in_ms': 1, 'read_request_timeout_in_ms': 1, 'request_timeout_in_ms': 1, 'truncate_request_timeout_in_ms': 1, 'write_request_timeout_in_ms': 1} dtest: DEBUG: Checking sstables in ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] dtest: DEBUG: Found sstable file mb-1-big-Statistics.db dtest: DEBUG: Found sstable file mb-1-big-CRC.db dtest: DEBUG: Found sstable file mb-1-big-Filter.db dtest: DEBUG: Found sstable file mb-1-big-Summary.db dtest: DEBUG: Found sstable file mb-1-big-Data.db dtest: DEBUG: Found sstable file mb-1-big-Index.db dtest: DEBUG: Found sstable file mb-1-big-TOC.txt dtest: DEBUG: Checking sstables in ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] dtest: DEBUG: Found sstable file mb-1-big-Statistics.db dtest: DEBUG: Found sstable file mb-1-big-CRC.db dtest: DEBUG: Found sstable file mb-1-big-Filter.db dtest: DEBUG: Found sstable file mb-1-big-Summary.db dtest: DEBUG: Found sstable file mb-1-big-Data.db dtest: DEBUG: Found sstable file mb-1-big-Index.db dtest: DEBUG: Found sstable file mb-1-big-TOC.txt dtest: DEBUG: Checking sstables in 
['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] dtest: DEBUG: Found sstable file mb-1-big-Statistics.db dtest: DEBUG: Found sstable file mb-1-big-CRC.db dtest: DEBUG: Found sstable file mb-1-big-Filter.db dtest: DEBUG: Found sstable file mb-1-big-Summary.db dtest: DEBUG: Found sstable file mb-1-big-Data.db dtest: DEBUG: Found sstable file mb-1-big-Index.db dtest: DEBUG: Found sstable file mb-1-big-TOC.txt dtest: DEBUG: Checking sstables in ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83'] dtest: DEBUG: Found sstable file mb-1-big-Statistics.db dtest: DEBUG: Found sstable file mb-1-big-CRC.db dtest: DEBUG: Found sstable file mb-1-big-Filter.db dtest: DEBUG: Found sstable file mb-1-big-Summary.db dtest: DEBUG: Found sstable file mb-1-big-Data.db dtest: DEBUG: Found sstable file mb-1-big-Index.db dtest: DEBUG: Found sstable file mb-1-big-TOC.txt dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot pre-scrub-1469677957710 Scrubbing BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db') (0.317KiB) Scrub of BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db') complete; looks like all 0 rows were tombstoned dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot pre-scrub-1469677962057 Scrubbing BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db') (0.176KiB) Scrub of BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db') complete; looks like all 0 rows were tombstoned dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot pre-scrub-1469677966308 Scrubbing 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db') (0.178KiB) Scrub of BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db') complete; looks like all 0 rows were tombstoned dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot pre-scrub-1469677970549 Scrubbing BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.birth_year_idx/mb-1-big-Data.db') (0.189KiB) Scrub of BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.birth_year_idx/mb-1-big-Data.db') complete; looks like all 0 rows were tombstoned dtest: DEBUG: ERROR 03:52:50 Error in ThreadPoolExecutor
[jira] [Commented] (CASSANDRA-12151) Audit logging for database activity
[ https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399584#comment-15399584 ] Jonathan Ellis commented on CASSANDRA-12151: Remember that people almost always use Cassandra to drive applications at scale, not to do interactive analytics. I can't see that logging 100,000 ops per second of the same ten queries is going to add much value. I don't want to load that gun for people to blow their feet off with... Generally auditing is most useful to see "who *changed* what" not "who *asked for* what." (Again, the "who" for most of the latter is going to be "the application server.") And again, it's not super useful to know that the app server inserted 10,000 new user accounts today, but it IS useful to know when Jonathan added a new column to the users table. (I would also include user logins as an interesting event. This will be dominated by app servers still but much much less noise than logging every query or update.) Besides changes over CQL, this could also include JMX changes, although there are so many entry points to JMX mbeans that this would be ugly to do by hand. Perhaps we could inject this with byteman? > Audit logging for database activity > --- > > Key: CASSANDRA-12151 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12151 > Project: Cassandra > Issue Type: New Feature >Reporter: stefan setyadi > Fix For: 3.x > > Attachments: 12151.txt > > > we would like a way to enable cassandra to log database activity being done > on our server. > It should show username, remote address, timestamp, action type, keyspace, > column family, and the query statement. > it should also be able to log connection attempt and changes to the > user/roles. > I was thinking of making a new keyspace and insert an entry for every > activity that occurs. > Then It would be possible to query for specific activity or a query targeting > a specific keyspace and column family. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
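The filtering policy argued for in the comment above, audit schema changes, role/auth changes, and logins, but not the firehose of routine reads and writes, could be expressed as a simple category whitelist. A minimal sketch with hypothetical category names (not an actual Cassandra API):

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative sketch of an audit policy that records "who changed what"
// (DDL, auth/role changes, logins) while skipping high-volume read/write
// traffic. Category names are hypothetical.
public class AuditFilter {
    enum Category { QUERY, UPDATE, DDL, AUTH, LOGIN }

    static final Set<Category> AUDITED =
            EnumSet.of(Category.DDL, Category.AUTH, Category.LOGIN);

    static boolean shouldAudit(Category c) {
        return AUDITED.contains(c);
    }
}
```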
[jira] [Created] (CASSANDRA-12343) Make 'static final boolean' easier to optimize for Hotspot
Robert Stupp created CASSANDRA-12343: Summary: Make 'static final boolean' easier to optimize for Hotspot Key: CASSANDRA-12343 URL: https://issues.apache.org/jira/browse/CASSANDRA-12343 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Trivial Fix For: 3.x Hotspot is able to optimize condition checks on `static final` fields. But the compiler can only optimize if the referenced "constant" is the first condition to check. (If I understood the optimization in Hotspot correctly.) I.e. the first {{if}} block can be "eliminated" whereas the second cannot: {code} class Foo { static final boolean CONST = /* some fragment evaluating to false */; public void doSomeStuff(boolean param) { if (!CONST) { // this code block can be eliminated } if (!CONST && param) { // this code block can be eliminated } if (param && !CONST) { // this code block cannot be eliminated due to some compiler logic } } } {code} Linked patch changes the order in some {{if}} statements and migrates a few methods to static final fields. ||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:boolean-hotspot]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-boolean-hotspot-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-boolean-hotspot-dtest/lastSuccessfulBuild/] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
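The ordering constraint described in the ticket can be illustrated with a minimal sketch. The class and constant below are hypothetical, not from the patch; both methods behave identically, but only the one that tests the constant first gives the JIT a branch it can prove dead without first evaluating the non-constant operand:

```java
// Sketch of the condition-ordering pattern described above.
// CHECKS_DISABLED is a hypothetical 'static final boolean' whose value is
// fixed at class initialization (false unless the system property is set).
public class ConstFirst {
    static final boolean CHECKS_DISABLED = Boolean.getBoolean("checks.disabled");

    // Constant first: with CHECKS_DISABLED == false, the whole branch is
    // provably dead and Hotspot can eliminate it.
    static int constantFirst(int x) {
        if (CHECKS_DISABLED && x > 0)
            return -1;
        return x;
    }

    // Constant last: 'x > 0' is evaluated before the constant is reached,
    // which can defeat the branch elimination described in the ticket.
    static int constantLast(int x) {
        if (x > 0 && CHECKS_DISABLED)
            return -1;
        return x;
    }
}
```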
[jira] [Created] (CASSANDRA-12342) CLibrary improvements
Robert Stupp created CASSANDRA-12342: Summary: CLibrary improvements Key: CASSANDRA-12342 URL: https://issues.apache.org/jira/browse/CASSANDRA-12342 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Minor Fix For: 3.x {{CLibrary}} uses {{FBUtilities.getProtectedField}} for each invocation of {{getfd}} - i.e. {{Class.getDeclaredField}} + {{Field.setAccessible}}. Linked patch migrates these {{Field}} references to static class fields + adds constants for the OS. Also adds a tiny optimization for non-linux OSs in {{trySync}}. ||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:CLibrary-opts]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-CLibrary-opts-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-CLibrary-opts-dtest/lastSuccessfulBuild/] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
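The core of the change — hoisting the reflective lookup out of the hot path — can be illustrated with a minimal, self-contained sketch. A dummy {{Holder}} class stands in for the real {{FileDescriptor}}/{{CLibrary}} code; all names here are illustrative:

```java
import java.lang.reflect.Field;

public class ProtectedFieldCache {
    static class Holder { private int fd = 42; }

    // Resolved once at class-init time, instead of calling
    // Class.getDeclaredField + Field.setAccessible on every invocation
    private static final Field FD_FIELD;
    static {
        try {
            Field f = Holder.class.getDeclaredField("fd");
            f.setAccessible(true);
            FD_FIELD = f;
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Each call is now just a (cheap) reflective read, no lookup
    static int getfd(Holder h) {
        try {
            return FD_FIELD.getInt(h);
        } catch (IllegalAccessException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(getfd(new Holder())); // prints 42
    }
}
```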
[jira] [Commented] (CASSANDRA-11635) test-clientutil-jar unit test fails
[ https://issues.apache.org/jira/browse/CASSANDRA-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399531#comment-15399531 ] Sylvain Lebresne commented on CASSANDRA-11635: -- Ok, I got confused because I was running the test as part of the normal {{ant test}} target, and was looking at a job failure on CI, but it's actually the specific {{ant test-clientutil-jar}} target, which is run by CI but whose failure doesn't seem to be reported (as far as I can tell, and unless you manually check the console output). Anyway, the original error message with Sigar actually makes sense: we did make sigar a dependency of UUIDGen and forgot to add it to the {{clientutil.jar}}. That's still the error with which the test fails on 2.2 and 3.0, and I'm attaching a simple update below to fix the test. I didn't bother running CI because it only changes the build file, and only the target related to {{clientutil.jar}}, which as said above is not even reported by CI. It's easy enough to check locally that it fixes the test though. | [11635-2.2|https://github.com/pcmanus/cassandra/commits/11635-2.2] | | [11635-3.0|https://github.com/pcmanus/cassandra/commits/11635-3.0] | It is worth noting that the inclusion of sigar will make the use of clientutil.jar a tad more annoying, as you need to provide sigar's native binary for your architecture. Or more precisely, you don't _need_ to, but you'll get an ugly error message if you don't. I'm not entirely sure clientutil.jar is still in use though, and as I said, it's not actually mandatory, so I'm not convinced it's a problem, but still mentioning it for completeness. Now, that leaves the issue on trunk (or 3.9 for that matter). And I'm not entirely sure why the test complains about {{IPartitioner}}. What I do know is that the reason it complains on 3.x and not on 3.0.x is CASSANDRA-12002. For some reason, the code that patch added to {{FBUtilities}} triggers the problem (the test passes if I revert those changes). 
Which is weird in the sense that those changes only included 2 new methods returning {{IPartitioner}}, but there was already one before. I'm not an expert in class loaders though. Anyway, I'm not sure what the right fix for 3.x is. I could try to randomly bend the code in {{FBUtilities}} to make the test happy, or move that code somewhere else (even more random), or spend a few hours understanding the subtleties of the class loader, but as I said above, I'm not really sure clientutil.jar is used anywhere anymore, as the functionality it offers is provided by other clients, and those other clients don't use it. So I'm seriously wondering if the most pragmatic solution isn't to just stop providing that jar in 3.x (and if you really depend on that jar, you're probably better off sticking to an old version anyway). Any opinions? > test-clientutil-jar unit test fails > --- > > Key: CASSANDRA-11635 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11635 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Michael Shuler >Assignee: Sylvain Lebresne > Labels: unittest > Fix For: 2.2.x, 3.0.x, 3.x > > > {noformat} > test-clientutil-jar: > [junit] Testsuite: org.apache.cassandra.serializers.ClientUtilsTest > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 0.314 sec > [junit] > [junit] Testcase: test(org.apache.cassandra.serializers.ClientUtilsTest): > Caused an ERROR > [junit] org/apache/cassandra/utils/SigarLibrary > [junit] java.lang.NoClassDefFoundError: > org/apache/cassandra/utils/SigarLibrary > [junit] at org.apache.cassandra.utils.UUIDGen.hash(UUIDGen.java:328) > [junit] at > org.apache.cassandra.utils.UUIDGen.makeNode(UUIDGen.java:307) > [junit] at > org.apache.cassandra.utils.UUIDGen.makeClockSeqAndNode(UUIDGen.java:256) > [junit] at > org.apache.cassandra.utils.UUIDGen.(UUIDGen.java:39) > [junit] at > org.apache.cassandra.serializers.ClientUtilsTest.test(ClientUtilsTest.java:56) > [junit] Caused by: 
java.lang.ClassNotFoundException: > org.apache.cassandra.utils.SigarLibrary > [junit] at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > [junit] at > sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > [junit] > [junit] > [junit] Test org.apache.cassandra.serializers.ClientUtilsTest FAILED > BUILD FAILED > {noformat} > I'll see if I can find a spot where this passes, but it appears to have been > failing for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
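Since the fix described above changes only the build file, its likely shape can be sketched. This is a hypothetical Ant fragment, not the actual {{build.xml}} — target, property, and path names are illustrative; the idea is simply that {{UUIDGen}} now depends on {{SigarLibrary}}, so the sigar jar must be on the classpath wherever {{clientutil.jar}} is exercised:

```xml
<!-- Hypothetical sketch only: names/paths do not match Cassandra's build.xml. -->
<target name="test-clientutil-jar" depends="clientutil-jar">
  <junit fork="on" failureproperty="testfailed">
    <classpath>
      <!-- the jar under test -->
      <pathelement location="${build.dir}/${final.name}-clientutil.jar"/>
      <!-- sigar is now a (runtime) dependency of UUIDGen -->
      <fileset dir="${build.lib}" includes="sigar-*.jar"/>
      <pathelement location="${test.classes}"/>
    </classpath>
    <test name="org.apache.cassandra.serializers.ClientUtilsTest"/>
  </junit>
</target>
```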
[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest
[ https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399533#comment-15399533 ] Aleksey Yeschenko commented on CASSANDRA-7190: -- Alright. Committed as [a123e984c3236b2a188411cad5c29f16e662c369|https://github.com/apache/cassandra/commit/a123e984c3236b2a188411cad5c29f16e662c369] to trunk. Yay (: If I come up with another case that shouldn't be represented, I'll file a follow-up JIRA. > Add schema to snapshot manifest > --- > > Key: CASSANDRA-7190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7190 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jonathan Ellis >Assignee: Alex Petrov >Priority: Minor > Labels: client-impacting, doc-impacting, lhf > Fix For: 3.10 > > > followup from CASSANDRA-6326 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7190) Add schema to snapshot manifest
[ https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-7190: - Resolution: Fixed Fix Version/s: (was: 3.x) 3.10 Status: Resolved (was: Ready to Commit) > Add schema to snapshot manifest > --- > > Key: CASSANDRA-7190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7190 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jonathan Ellis >Assignee: Alex Petrov >Priority: Minor > Labels: client-impacting, doc-impacting, lhf > Fix For: 3.10 > > > followup from CASSANDRA-6326 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Add schema to snapshot manifest, add WITH TIMESTAMP to DROP statement
Repository: cassandra Updated Branches: refs/heads/trunk bdaa53de4 -> a123e984c Add schema to snapshot manifest, add WITH TIMESTAMP to DROP statement Patch by Alex Petrov; reviewed by Aleksey Yeschenko for CASSANDRA-7190 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a123e984 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a123e984 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a123e984 Branch: refs/heads/trunk Commit: a123e984c3236b2a188411cad5c29f16e662c369 Parents: bdaa53d Author: Alex PetrovAuthored: Wed Apr 20 14:57:52 2016 +0200 Committer: Aleksey Yeschenko Committed: Fri Jul 29 16:39:03 2016 +0100 -- CHANGES.txt | 1 + src/antlr/Parser.g | 10 +- .../org/apache/cassandra/config/CFMetaData.java | 12 +- .../cql3/statements/AlterTableStatement.java| 26 +- .../apache/cassandra/db/ColumnFamilyStore.java | 22 + .../db/ColumnFamilyStoreCQLHelper.java | 442 .../org/apache/cassandra/db/Directories.java| 6 + .../unit/org/apache/cassandra/SchemaLoader.java | 5 + .../cql3/validation/operations/AlterTest.java | 70 ++ .../db/ColumnFamilyStoreCQLHelperTest.java | 683 +++ .../schema/LegacySchemaMigratorTest.java| 3 +- 11 files changed, 1259 insertions(+), 21 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a123e984/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 27655d2..80063c8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.10 + * Add schema to snapshot manifest, add USING TIMESTAMP clause to ALTER TABLE statements (CASSANDRA-7190) * Add beta protocol flag for v5 native protocol (CASSANDRA-12142) * Support filtering on non-PRIMARY KEY columns in the CREATE MATERIALIZED VIEW statement's WHERE clause (CASSANDRA-10368) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a123e984/src/antlr/Parser.g -- diff --git a/src/antlr/Parser.g b/src/antlr/Parser.g index f00f9d0..e762bde 100644 --- a/src/antlr/Parser.g +++ 
b/src/antlr/Parser.g @@ -777,22 +777,24 @@ alterTableStatement returns [AlterTableStatement expr] TableAttributes attrs = new TableAttributes(); Map renames = new HashMap (); List colNameList = new ArrayList(); +Long deleteTimestamp = null; } : K_ALTER K_COLUMNFAMILY cf=columnFamilyName ( K_ALTER id=cident K_TYPE v=comparatorType { type = AlterTableStatement.Type.ALTER; } { colNameList.add(new AlterTableStatementColumn(id,v)); } | K_ADD ((id=cident v=comparatorType b1=cfisStatic { colNameList.add(new AlterTableStatementColumn(id,v,b1)); }) | ('(' id1=cident v1=comparatorType b1=cfisStatic { colNameList.add(new AlterTableStatementColumn(id1,v1,b1)); } ( ',' idn=cident vn=comparatorType bn=cfisStatic { colNameList.add(new AlterTableStatementColumn(idn,vn,bn)); } )* ')' ) ) { type = AlterTableStatement.Type.ADD; } - | K_DROP ( id=cident { colNameList.add(new AlterTableStatementColumn(id)); } - | ('(' id1=cident { colNameList.add(new AlterTableStatementColumn(id1)); } - ( ',' idn=cident { colNameList.add(new AlterTableStatementColumn(idn)); } )* ')') ) { type = AlterTableStatement.Type.DROP; } + | K_DROP ( ( id=cident { colNameList.add(new AlterTableStatementColumn(id)); } + | ('(' id1=cident { colNameList.add(new AlterTableStatementColumn(id1)); } +( ',' idn=cident { colNameList.add(new AlterTableStatementColumn(idn)); } )* ')') ) + ( K_USING K_TIMESTAMP t=INTEGER { deleteTimestamp = Long.parseLong(Constants.Literal.integer($t.text).getText()); })? ) { type = AlterTableStatement.Type.DROP; } | K_WITH properties[attrs] { type = AlterTableStatement.Type.OPTS; } | K_RENAME { type = AlterTableStatement.Type.RENAME; } id1=cident K_TO toId1=cident { renames.put(id1, toId1); } ( K_AND idn=cident K_TO toIdn=cident { renames.put(idn, toIdn); } )* ) { -$expr = new AlterTableStatement(cf, type, colNameList, attrs, renames); +$expr = new AlterTableStatement(cf, type, colNameList, attrs, renames, deleteTimestamp); } ;
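From the grammar change above, {{K_DROP}} now accepts an optional {{USING TIMESTAMP}} clause after either a single dropped column or a parenthesized list. A hedged example of the resulting syntax (keyspace, table, and column names are made up; the integer is a microsecond writetime for the drop tombstone):

```cql
-- Hypothetical table; the USING TIMESTAMP clause is the new part.
ALTER TABLE ks.users DROP lastlogin USING TIMESTAMP 1469804343000000;
ALTER TABLE ks.users DROP (blacklisted, password) USING TIMESTAMP 1469804343000000;
```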
[jira] [Created] (CASSANDRA-12341) dtest failure in hintedhandoff_test.TestHintedHandoffConfig.hintedhandoff_enabled_test
Sean McCarthy created CASSANDRA-12341: - Summary: dtest failure in hintedhandoff_test.TestHintedHandoffConfig.hintedhandoff_enabled_test Key: CASSANDRA-12341 URL: https://issues.apache.org/jira/browse/CASSANDRA-12341 Project: Cassandra Issue Type: Test Reporter: Sean McCarthy Assignee: DS Test Eng Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, node2_debug.log, node2_gc.log example failure: http://cassci.datastax.com/job/trunk_novnode_dtest/440/testReport/hintedhandoff_test/TestHintedHandoffConfig/hintedhandoff_enabled_test {code} Error Message 29 Jul 2016 00:56:17 [node1] Missing: ['Finished hinted']: INFO [HANDSHAKE-/127.0.0.2] 2016-07-29 00:54:14,4. See system.log for remainder {code} {code} Stacktrace File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/automaton/cassandra-dtest/hintedhandoff_test.py", line 125, in hintedhandoff_enabled_test self._do_hinted_handoff(node1, node2, True) File "/home/automaton/cassandra-dtest/hintedhandoff_test.py", line 61, in _do_hinted_handoff node1.watch_log_for(["Finished hinted"], from_mark=log_mark, timeout=120) File "/home/automaton/ccm/ccmlib/node.py", line 449, in watch_log_for raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + reads[:50] + ".\nSee {} for remainder".format(filename)) "29 Jul 2016 00:56:17 [node1] Missing: ['Finished hinted']:\nINFO [HANDSHAKE-/127.0.0.2] 2016-07-29 00:54:14,4.\nSee system.log for remainder {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12340) dtest failure in upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test
Sean McCarthy created CASSANDRA-12340: - Summary: dtest failure in upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test Key: CASSANDRA-12340 URL: https://issues.apache.org/jira/browse/CASSANDRA-12340 Project: Cassandra Issue Type: Test Reporter: Sean McCarthy Assignee: DS Test Eng Attachments: node1.log, node2.log, node3.log example failure: http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/249/testReport/upgrade_supercolumns_test/TestSCUpgrade/upgrade_with_counters_test {code} Standard Output Unexpected error in node3 log, error: ERROR [CompactionExecutor:1] 2016-07-28 15:34:19,533 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:1,1,main] java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@5fb8b2bf rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1ae851ad[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632) at org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:65) at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:976) at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:383) at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:348) at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:342) at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:245) at 
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:995) at org.apache.cassandra.db.compaction.CompactionTask.replaceCompactedSSTables(CompactionTask.java:270) at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230) at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Witschey resolved CASSANDRA-12339. -- Resolution: Invalid > dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test > -- > > Key: CASSANDRA-12339 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12339 > Project: Cassandra > Issue Type: Test >Reporter: Craig Kodman >Assignee: DS Test Eng > Labels: dtest > Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, > node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399457#comment-15399457 ] Jim Witschey commented on CASSANDRA-12339: -- This failure seems to have happened on an earlier commit than the one that closed [CASSANDRA-11465|https://issues.apache.org/jira/browse/CASSANDRA-11465?focusedCommentId=1539=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1539]. Closing. > dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test > -- > > Key: CASSANDRA-12339 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12339 > Project: Cassandra > Issue Type: Test >Reporter: Craig Kodman >Assignee: DS Test Eng > Labels: dtest > Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, > node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399436#comment-15399436 ] Sylvain Lebresne commented on CASSANDRA-12311: -- I'm wondering if it's worth adding a totally new exception. I agree {{ReadFailureException}} is currently a bit too imprecise regarding it's details, but it's not necessarily only true of {{TombstoneOverwhelmingException}}, and the latter is still a read failure exception. I think I'd have a preference for adding a (potentially optional) {{cause}} to the existing {{ReadFailureException}}. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 4.x > > Attachments: 12311-trunk.txt > > > Right now if a data node fails to perform a read because it ran into a > {{TombstoneOverwhelmingException}}, it only responds back to the coordinator > node with a generic failure. Under this scheme, the coordinator won't be able > to know exactly why the request failed and subsequently the client only gets > a generic {{ReadFailureException}}. It would be useful to inform the client > that their read failed because we read too many tombstones. We should have > the data nodes reply with a failure type so the coordinator can pass this > information to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
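The optional {{cause}} suggested here amounts to attaching an enum to the existing exception rather than introducing a new exception type. A hypothetical sketch — the enum constants and class shape are illustrative only, not the committed Cassandra API:

```java
public class ReadFailureDemo {
    // Illustrative reason codes; not the names that ended up in Cassandra
    enum FailureReason { UNKNOWN, TOO_MANY_TOMBSTONES }

    static class ReadFailureException extends RuntimeException {
        final FailureReason reason;
        ReadFailureException(FailureReason reason) {
            super("Read failed: " + reason);
            this.reason = reason;
        }
    }

    public static void main(String[] args) {
        try {
            // A replica that trips the tombstone threshold would report this
            // reason back to the coordinator, which forwards it to the client
            throw new ReadFailureException(FailureReason.TOO_MANY_TOMBSTONES);
        } catch (ReadFailureException e) {
            System.out.println(e.getMessage()); // prints: Read failed: TOO_MANY_TOMBSTONES
        }
    }
}
```

A client catching the exception can then branch on the reason (e.g. back off and re-model the query) instead of treating every read failure identically.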
[jira] [Updated] (CASSANDRA-12312) Restore JVM metric export for metric reporters
[ https://issues.apache.org/jira/browse/CASSANDRA-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12312: - Labels: lhf (was: ) > Restore JVM metric export for metric reporters > -- > > Key: CASSANDRA-12312 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12312 > Project: Cassandra > Issue Type: Bug >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > Labels: lhf > Fix For: 2.2.x, 3.0.x, 3.x > > Attachments: 12312-2.2.patch, 12312-3.0.patch, 12312-trunk.patch, > metrics-jvm-3.1.0.jar.asc > > > JVM instrumentation as part of dropwizard metrics has been moved to a > separate {{metrics-jvm}} artifact in metrics-v3.0. After CASSANDRA-5657, no > jvm related metrics will be exported to any reporter configured via > {{metrics-reporter-config}}, as this isn't part of {{metrics-core}} anymore. > As memory and GC stats are essential for monitoring Cassandra, this turns out > to be a blocker for us for upgrading to 2.2. > I've included a patch that would add the now separate {{metrics-jvm}} package > and enables some of the provided metrics on startup in case a metrics > reporter is used ({{-Dcassandra.metricsReporterConfigFile}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12270) Nodetool toppartitions - add metrics of latency and payload
[ https://issues.apache.org/jira/browse/CASSANDRA-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-12270: -- Component/s: Observability > Nodetool toppartitions - add metrics of latency and payload > --- > > Key: CASSANDRA-12270 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12270 > Project: Cassandra > Issue Type: Improvement > Components: Observability >Reporter: Rich Rein > > It's painful to diagnose complex 300-table clusters in production. > Extending toppartitions to record based on latency and payload size would > greatly simplify this and lower the time, cost, and drama of hot partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping
[ https://issues.apache.org/jira/browse/CASSANDRA-12281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12281: - Assignee: Joel Knighton > Gossip blocks on startup when another node is bootstrapping > --- > > Key: CASSANDRA-12281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12281 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Evans >Assignee: Joel Knighton >Priority: Minor > Attachments: restbase1015-a_jstack.txt > > > In our cluster, normal node startup times (after a drain on shutdown) are > less than 1 minute. However, when another node in the cluster is > bootstrapping, the same node startup takes nearly 30 minutes to complete, the > apparent result of gossip blocking on pending range calculations. > {noformat} > $ nodetool-a tpstats > Pool NameActive Pending Completed Blocked All > time blocked > MutationStage 0 0 1840 0 > 0 > ReadStage 0 0 2350 0 > 0 > RequestResponseStage 0 0 53 0 > 0 > ReadRepairStage 0 0 1 0 > 0 > CounterMutationStage 0 0 0 0 > 0 > HintedHandoff 0 0 44 0 > 0 > MiscStage 0 0 0 0 > 0 > CompactionExecutor3 3395 0 > 0 > MemtableReclaimMemory 0 0 30 0 > 0 > PendingRangeCalculator1 2 29 0 > 0 > GossipStage 1 5602164 0 > 0 > MigrationStage0 0 0 0 > 0 > MemtablePostFlush 0 0111 0 > 0 > ValidationExecutor0 0 0 0 > 0 > Sampler 0 0 0 0 > 0 > MemtableFlushWriter 0 0 30 0 > 0 > InternalResponseStage 0 0 0 0 > 0 > AntiEntropyStage 0 0 0 0 > 0 > CacheCleanupExecutor 0 0 0 0 > 0 > Message type Dropped > READ 0 > RANGE_SLICE 0 > _TRACE 0 > MUTATION 0 > COUNTER_MUTATION 0 > REQUEST_RESPONSE 0 > PAGED_RANGE 0 > READ_REPAIR 0 > {noformat} > A full thread dump is attached, but the relevant bit seems to be here: > {noformat} > [ ... 
] > "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x7fe4cd54b000 > nid=0xea9 waiting on condition [0x7fddcf883000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0004c1e922c0> (a > java.util.concurrent.locks.ReentrantReadWriteLock$FairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174) > at > org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160) > at > org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023) > at > org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682) > at > org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182) > at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165) > at > org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128) > at > org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58) >
[jira] [Updated] (CASSANDRA-12279) nodetool repair hangs on non-existent table
[ https://issues.apache.org/jira/browse/CASSANDRA-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12279: - Labels: lhf (was: ) > nodetool repair hangs on non-existent table > --- > > Key: CASSANDRA-12279 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12279 > Project: Cassandra > Issue Type: Bug > Environment: Linux Ubuntu, Openjdk >Reporter: Benjamin Roth >Priority: Minor > Labels: lhf > > If nodetool repair is called with a table that does not exist, it hangs > infinitely without any error message or logs. > E.g. > nodetool repair foo bar > Keyspace foo exists but table bar does not -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12270) Nodetool toppartitions - add metrics of latency and payload
[ https://issues.apache.org/jira/browse/CASSANDRA-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12270: - Issue Type: Improvement (was: Bug) > Nodetool toppartitions - add metrics of latency and payload > --- > > Key: CASSANDRA-12270 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12270 > Project: Cassandra > Issue Type: Improvement > Components: Observability >Reporter: Rich Rein > > It's painful to diagnose complex 300-table clusters in production. > Extending toppartitions to record based on latency and payload size would > greatly simplify this and lower the time, cost, and drama of hot partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399380#comment-15399380 ] Evan Prothro commented on CASSANDRA-12336: -- Confirmed that patch [12336-3.0|https://github.com/pcmanus/cassandra/commits/12336-3.0] fixes this issue for us. Both manual and triggered-from-read compaction work without error. > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. > Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > 
org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12273) Cassandra stress graph: option to create directory for graph if it doesn't exist
[ https://issues.apache.org/jira/browse/CASSANDRA-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12273: - Labels: lhf (was: ) > Cassandra stress graph: option to create directory for graph if it doesn't exist > -- > > Key: CASSANDRA-12273 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12273 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Christopher Batey >Assignee: Christopher Batey >Priority: Minor > Labels: lhf > > I am running it in CI with ephemeral workspace / build dirs. It would be > nice if CS would create the directory so my build tool doesn't have to -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7190) Add schema to snapshot manifest
[ https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-7190: --- Status: Ready to Commit (was: Patch Available) > Add schema to snapshot manifest > --- > > Key: CASSANDRA-7190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7190 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jonathan Ellis >Assignee: Alex Petrov >Priority: Minor > Labels: client-impacting, doc-impacting, lhf > Fix For: 3.x > > > followup from CASSANDRA-6326 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
svn commit: r1754523 - /cassandra/site/src/.htaccess
Author: slebresne Date: Fri Jul 29 13:31:08 2016 New Revision: 1754523 URL: http://svn.apache.org/viewvc?rev=1754523&view=rev Log: Fix .htaccess in source too Modified: cassandra/site/src/.htaccess Modified: cassandra/site/src/.htaccess URL: http://svn.apache.org/viewvc/cassandra/site/src/.htaccess?rev=1754523&r1=1754522&r2=1754523&view=diff == --- cassandra/site/src/.htaccess (original) +++ cassandra/site/src/.htaccess Fri Jul 29 13:31:08 2016 @@ -1,3 +1,3 @@ RewriteEngine On -RewriteRule /doc/ /doc/latest/ [NC, L] +RewriteRule /doc/ /doc/latest/ [NC,L]
svn commit: r1754519 - /cassandra/site/publish/.htaccess
Author: slebresne Date: Fri Jul 29 13:21:42 2016 New Revision: 1754519 URL: http://svn.apache.org/viewvc?rev=1754519&view=rev Log: Fix .htaccess Modified: cassandra/site/publish/.htaccess Modified: cassandra/site/publish/.htaccess URL: http://svn.apache.org/viewvc/cassandra/site/publish/.htaccess?rev=1754519&r1=1754518&r2=1754519&view=diff == --- cassandra/site/publish/.htaccess (original) +++ cassandra/site/publish/.htaccess Fri Jul 29 13:21:42 2016 @@ -1,3 +1,3 @@ RewriteEngine On -RewriteRule /doc/ /doc/latest/ [NC, L] +RewriteRule /doc/ /doc/latest/ [NC,L]
[jira] [Resolved] (CASSANDRA-12338) Upgrading 2.1.0 / 2.1.9->3.0.2/ 3.7 failed
[ https://issues.apache.org/jira/browse/CASSANDRA-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-12338. --- Resolution: Duplicate This is a duplicate of CASSANDRA-11900, caused by CASSANDRA-11877 most likely. There is a good chance it's been fixed by CASSANDRA-12144 already, which will be released soon in 3.0.9/3.9. Is there any way you can build off latest cassandra-3.0 branch HEAD and try starting up Cassandra again? If that doesn't work, feel free to reopen this JIRA. Thanks. > Upgrading 2.1.0 / 2.1.9->3.0.2/ 3.7 failed > -- > > Key: CASSANDRA-12338 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12338 > Project: Cassandra > Issue Type: Bug > Environment: Windows / Linux >Reporter: Samraj > Labels: upgrade > > I am trying to upgrade the cassandra from 2.1.0 to 3.7. As per the > recommendation, i just migrated from 2.1.0 to 2.1.9 and then i tried to > migrate over 3.0.2 or 3.7. But i am getting the below exception while > startup. How to skip this or overcome this. > INFO [WrapperSimpleAppMain] 2016-07-29 11:33:36,684 SystemKeyspace.java:1283 > - Detected version upgrade from 2.1.9 to 3.0.8, snapshotting system keyspace > WARN [WrapperSimpleAppMain] 2016-07-29 11:33:38,565 > CompressionParams.java:382 - The sstable_compression option has been > deprecated. 
You should use class instead > ERROR [WrapperSimpleAppMain] 2016-07-29 11:33:40,984 CassandraDaemon.java:698 > - Exception encountered during startup > java.lang.IllegalStateException: One row required, 2 found > at > org.apache.cassandra.cql3.UntypedResultSet$FromResultSet.one(UntypedResultSet.java:84) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTableTimestamp(LegacySchemaMigrator.java:253) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:243) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_60] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_60] > at > org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557) > ~[apache-cassandra-3.0.8.jar:3.0.8] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
Craig Kodman created CASSANDRA-12339: Summary: dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test Key: CASSANDRA-12339 URL: https://issues.apache.org/jira/browse/CASSANDRA-12339 Project: Cassandra Issue Type: Test Reporter: Craig Kodman Assignee: DS Test Eng Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log example failure: http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
svn commit: r1754516 - /cassandra/site/publish/index.html
Author: slebresne Date: Fri Jul 29 13:14:43 2016 New Revision: 1754516 URL: http://svn.apache.org/viewvc?rev=1754516&view=rev Log: 2nd attempt to waking up svnpubsub Modified: cassandra/site/publish/index.html Modified: cassandra/site/publish/index.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1754516&r1=1754515&r2=1754516&view=diff == --- cassandra/site/publish/index.html (original) +++ cassandra/site/publish/index.html Fri Jul 29 13:14:43 2016 @@ -1,8 +1,5 @@ - - -
svn commit: r1754509 - /cassandra/site/src/README
Author: slebresne Date: Fri Jul 29 12:56:02 2016 New Revision: 1754509 URL: http://svn.apache.org/viewvc?rev=1754509&view=rev Log: Trying to wake up svnpubsub Modified: cassandra/site/src/README Modified: cassandra/site/src/README URL: http://svn.apache.org/viewvc/cassandra/site/src/README?rev=1754509&r1=1754508&r2=1754509&view=diff == --- cassandra/site/src/README (original) +++ cassandra/site/src/README Fri Jul 29 12:56:02 2016 @@ -51,3 +51,4 @@ The rest of the layout is standard to Je * `_sass/` is to `css/` what `_includes` is to `_layout`; it contains sass fragments imported by the main css files (currently only the pygments theme for syntax highligthing in the documentation). * `_plugins/` contains a tiny plugin that make it easier to input download links in the `download.md` file. +
[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest
[ https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399193#comment-15399193 ] Alex Petrov commented on CASSANDRA-7190: I've fixed the {{DynamicCompositeType}} support. As regards the second clause, it refers to the [python driver|https://github.com/datastax/python-driver/blob/master/cassandra/metadata.py#L1061-L1067], although I could not construct a table that would fail in this case. bq. some cases of mixed dynamic/static Thrift CFs I've tested several permutations of dynamic/static thrift CFs that were also worked on during [CASSANDRA-10857] and tried constructing other cases, and so far could not find other mismatches. It might be that Thrift is hiding some, or that they're being dumped without compact storage. For example: {code} CfDef cfDef = new CfDef().setDefault_validation_class(Int32Type.instance.toString()) .setKey_validation_class(AsciiType.instance.toString()) .setComparator_type(CompositeType.getInstance(AsciiType.instance, AsciiType.instance).toString()) .setColumn_metadata(Arrays.asList(new ColumnDef(CompositeType.build(ByteBufferUtil.bytes("col1"), ByteBufferUtil.bytes("col1")), AsciiType.instance.toString()), new ColumnDef(CompositeType.build(ByteBufferUtil.bytes("col2"), ByteBufferUtil.bytes("col2")), AsciiType.instance.toString()) )) .setKeyspace(KEYSPACE) .setName(TABLE); {code} is represented as {code} CREATE TABLE IF NOT EXISTS thrift_created_table_test_ks.test_table_1 ( key ascii, column1 ascii, column2 ascii, "col1:col1" ascii static, "col2:col2" ascii static, value int, PRIMARY KEY (key, column1, column2)) WITH ID = d1e70820-5581-11e6-9b6d-53f9c6c224e8 AND CLUSTERING ORDER BY (column1 ASC, column2 ASC) {code} And {{isThriftCompatible}} yields {{false}}, since it's not a dense table. 
> Add schema to snapshot manifest > --- > > Key: CASSANDRA-7190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7190 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jonathan Ellis >Assignee: Alex Petrov >Priority: Minor > Labels: client-impacting, doc-impacting, lhf > Fix For: 3.x > > > followup from CASSANDRA-6326 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-12335) Super columns are broken after upgrading to 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko reassigned CASSANDRA-12335: - Assignee: Aleksey Yeschenko > Super columns are broken after upgrading to 3.0 > --- > > Key: CASSANDRA-12335 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12335 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeremiah Jordan >Assignee: Aleksey Yeschenko >Priority: Blocker > Fix For: 3.0.x, 3.x > > > Super Columns are broken after upgrading to cassandra-3.0 HEAD. The below > script shows this. > 2.1 cli output for get: > {code} > [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8; > => (name=name, value=Bob, timestamp=1469724504357000) > {code} > cqlsh: > {code} > [default@test] > key | blobAsText(column1) > --+- > 0x53696d6f6e |attr > 0x426f62 |attr > {code} > 3.0 cli: > {code} > [default@unknown] use test; > unconfigured table schema_columnfamilies > [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8; > null > [default@test] > {code} > cqlsh: > {code} > key | system.blobastext(column1) > --+-- > 0x53696d6f6e | \x00\x04attr\x00\x00\x04name\x00 > 0x426f62 | \x00\x04attr\x00\x00\x04name\x00 > {code} > Run this from a directory with cassandra-3.0 checked out and compiled > {code} > ccm create -n 2 -v 2.1.14 testsuper > echo "### Starting 2.1 ###" > ccm start > MYFILE=`mktemp` > echo "create keyspace test with placement_strategy = > 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = > {replication_factor:2}; > use test; > create column family Sites with column_type = 'Super' and comparator = > 'BytesType' and subcomparator='UTF8Type'; > set Sites[utf8('Simon')][utf8('attr')]['name'] = utf8('Simon'); > set Sites[utf8('Bob')][utf8('attr')]['name'] = utf8('Bob'); > get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE > ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE > rm $MYFILE > 
~/.ccm/repository/2.1.14/bin/nodetool -p 7100 flush > ~/.ccm/repository/2.1.14/bin/nodetool -p 7200 flush > ccm stop > # run from cassandra-3.0 checked out and compiled > ccm setdir > echo "### Starting Current Directory > ###" > ccm start > ./bin/nodetool -p 7100 upgradesstables > ./bin/nodetool -p 7200 upgradesstables > ./bin/nodetool -p 7100 enablethrift > ./bin/nodetool -p 7200 enablethrift > MYFILE=`mktemp` > echo "use test; > get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE > ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE > rm $MYFILE > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12335) Super columns are broken after upgrading to 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-12335: -- Priority: Major (was: Blocker) > Super columns are broken after upgrading to 3.0 > --- > > Key: CASSANDRA-12335 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12335 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeremiah Jordan >Assignee: Aleksey Yeschenko > Fix For: 3.0.x, 3.x > > > Super Columns are broken after upgrading to cassandra-3.0 HEAD. The below > script shows this. > 2.1 cli output for get: > {code} > [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8; > => (name=name, value=Bob, timestamp=1469724504357000) > {code} > cqlsh: > {code} > [default@test] > key | blobAsText(column1) > --+- > 0x53696d6f6e |attr > 0x426f62 |attr > {code} > 3.0 cli: > {code} > [default@unknown] use test; > unconfigured table schema_columnfamilies > [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8; > null > [default@test] > {code} > cqlsh: > {code} > key | system.blobastext(column1) > --+-- > 0x53696d6f6e | \x00\x04attr\x00\x00\x04name\x00 > 0x426f62 | \x00\x04attr\x00\x00\x04name\x00 > {code} > Run this from a directory with cassandra-3.0 checked out and compiled > {code} > ccm create -n 2 -v 2.1.14 testsuper > echo "### Starting 2.1 ###" > ccm start > MYFILE=`mktemp` > echo "create keyspace test with placement_strategy = > 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = > {replication_factor:2}; > use test; > create column family Sites with column_type = 'Super' and comparator = > 'BytesType' and subcomparator='UTF8Type'; > set Sites[utf8('Simon')][utf8('attr')]['name'] = utf8('Simon'); > set Sites[utf8('Bob')][utf8('attr')]['name'] = utf8('Bob'); > get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE > ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE > rm $MYFILE > ~/.ccm/repository/2.1.14/bin/nodetool -p 
7100 flush > ~/.ccm/repository/2.1.14/bin/nodetool -p 7200 flush > ccm stop > # run from cassandra-3.0 checked out and compiled > ccm setdir > echo "### Starting Current Directory > ###" > ccm start > ./bin/nodetool -p 7100 upgradesstables > ./bin/nodetool -p 7200 upgradesstables > ./bin/nodetool -p 7100 enablethrift > ./bin/nodetool -p 7200 enablethrift > MYFILE=`mktemp` > echo "use test; > get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE > ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE > rm $MYFILE > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12008) Make decommission operations resumable
[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399170#comment-15399170 ] Kaide Mu commented on CASSANDRA-12008: -- bq. It seems getStreamedRanges is querying the AVAILABLE_RANGES table instead of STREAMED_RANGES, that's why is generating the Undefined column name operation error. Unbelievable, but yes, that was the error; it's already fixed, thanks! bq. Maybe it's not working because of the previous error? Perhaps it would help to add a unit test on StreamStateStoreTest to verify that updateStreamedRanges and getStreamedRanges is being populated correctly and working as expected. You can also add debug logs to troubleshoot. Another silly mistake: I wasn't adding {{StreamTransferTask}} to {{SessionCompleteEvent}}; fixed. bq. SystemKeyspace.getStreamedRanges is being called from inside a for-loop what may be inefficient, it's maybe better to retrieve it before and re-use it inside the loop. I've added a new strategy; please let me know what you think about it. Some additional modifications: we no longer pass the description to the {{StreamTransferTask}} constructor. Doing so would raise an error, because {{StreamResultFuture}} is not initialized yet when the task is created, so {{StreamSession.description()}} would return a null value at creation time. Instead, we obtain the {{StreamSession}} from {{StreamTransferTask.getSession()}} when each {{StreamTransferTask}} completes, i.e. when {{StreamStateStore.handleStreamEvent}} is invoked. All this means we only need to pass the responsible keyspace. 
Some minor details: I don't know whether there's a problem with the current implementation or something odd in the set-up, but it skips the same range twice: {quote} DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,301 StorageService.java:4556 - Range (3074457345618258602,-9223372036854775808] already in /127.0.0.3, skipping DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,301 StorageService.java:4556 - Range (3074457345618258602,-9223372036854775808] already in /127.0.0.3, skipping {quote} I think it's the set-up itself, since {{StorageService.getChangedRangesForLeaving}} is also returning the same range twice: {quote} DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,289 StorageService.java:2526 - Range (3074457345618258602,-9223372036854775808] will be responsibility of /127.0.0.3 DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,294 StorageService.java:2526 - Range (3074457345618258602,-9223372036854775808] will be responsibility of /127.0.0.3 {quote} You can find the latest working patch via: https://github.com/apache/cassandra/compare/trunk...kdmu:trunk-12008?expand=1 > Make decommission operations resumable > -- > > Key: CASSANDRA-12008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12008 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Tom van der Woerdt >Assignee: Kaide Mu >Priority: Minor > > We're dealing with large data sets (multiple terabytes per node) and > sometimes we need to add or remove nodes. These operations are very dependent > on the entire cluster being up, so while we're joining a new node (which > sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases > something does. > It would be great if the ability to retry streams was implemented. 
> Example to illustrate the problem : > {code} > 03:18 PM ~ $ nodetool decommission > error: Stream failed > -- StackTrace -- > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486) > at >
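The review suggestion quoted in the comment above — hoisting {{SystemKeyspace.getStreamedRanges}} out of the for-loop — can be sketched as follows. This is an illustrative Java sketch with made-up names, not Cassandra's actual API: the point is simply that the system-table read happens once, before the loop, and its result is reused inside it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of "retrieve it before and re-use it inside the loop".
public class HoistedLookup {
    static int queries = 0; // counts simulated system-table reads

    // Stands in for a read of the STREAMED_RANGES system table.
    static Set<String> getStreamedRanges() {
        queries++;
        return new HashSet<>(Arrays.asList("(0,100]", "(100,200]"));
    }

    // Hoisted version: one query total, regardless of candidate count.
    static List<String> filterAlreadyStreamed(List<String> candidates) {
        Set<String> streamed = getStreamedRanges(); // fetched once, outside the loop
        List<String> toStream = new ArrayList<>();
        for (String range : candidates) {
            if (!streamed.contains(range)) {
                toStream.add(range); // only ranges not yet streamed remain
            }
        }
        return toStream;
    }

    public static void main(String[] args) {
        List<String> result =
            filterAlreadyStreamed(Arrays.asList("(0,100]", "(200,300]"));
        System.out.println(result);  // [(200,300]]
        System.out.println(queries); // 1
    }
}
```

With the lookup hoisted, the number of system-table queries stays constant no matter how many candidate ranges the loop examines.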
[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI
[ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399144#comment-15399144 ] Sylvain Lebresne commented on CASSANDRA-11990: -- I'm not familiar enough with the SASI code at this point to have much of an opinion on the specifics of the best implementation choices. But it does sound like supporting other partitioners is fairly orthogonal to the original ticket's intent, so it should likely be left to a follow-up ticket (if only so we can focus on proper testing separately). And in general, the more incrementally we can do things, the better. > Address rows rather than partitions in SASI > --- > > Key: CASSANDRA-11990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11990 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Alex Petrov >Assignee: Alex Petrov > Attachments: perf.pdf, size_comparison.png > > > Currently, the lookup in SASI index would return the key position of the > partition. After the partition lookup, the rows are iterated and the > operators are applied in order to filter out ones that do not match. > bq. TokenTree which accepts variable size keys (such would enable different > partitioners, collections support, primary key indexing etc.), -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI
[ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399120#comment-15399120 ] Alex Petrov commented on CASSANDRA-11990: - During several discussions it's been proposed to evaluate support for different partitioners, since it'd help with wider SASI adoption and remove the current limitation to Long tokens. I've evaluated the support, and can conclude that supporting constant-size tokens can be included in the patch without large overhead. The patch was adjusted accordingly. There are still several failing tests, although they'll be fixed shortly. Support for variable-size tokens (for partitioners such as {{ByteOrderedPartitioner}}) requires a much larger time investment. My personal suggestion is to encode them with their size and avoid on-disk format changes. This will result in a more complex iteration process for variable-size tokens, since we'll have to skip tokens depending on their size and won't be able to use simple multiplication for offset calculation. I've made a small patch / proof of concept for variable-size tokens by adding a `serializedSize` method to the token tree nodes; currently (for the sake of the POC and in order to save some time) it was done by reusing the `serialize` function and passing a throwaway byte buffer, and calculating offsets by iterating and reading integers with the token size. It worked just fine for simple cases. I'll mention that the SASI code is written very well and the offset calculation methods are very well isolated. That said, I'd suggest leaving the "algorithmic" heavy lifting (variable token offset calculation) for a separate ticket to reduce the scope of the current one. Since it's not going to require on-disk format changes, we can safely postpone this work. Another thing that's been mentioned is to include the column offset in the clustering offset long. I'll be evaluating this proposal in terms of performance today. 
It seems that we can avoid increasing the size of the {{long[]}} array that holds offsets, and this change can help avoid post-filtering altogether. An additional optimisation (which, once again, could be left for a follow-up patch) is to avoid the second seek within the data file in cases when we are only querying columns that are indexed. This can be a significant performance improvement, although it'd be good to discuss whether such queries are widely used. cc [~slebresne] [~iamaleksey] [~jbellis] [~beobal] > Address rows rather than partitions in SASI > --- > > Key: CASSANDRA-11990 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11990 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Alex Petrov >Assignee: Alex Petrov > Attachments: perf.pdf, size_comparison.png > > > Currently, the lookup in SASI index would return the key position of the > partition. After the partition lookup, the rows are iterated and the > operators are applied in order to filter out ones that do not match. > bq. TokenTree which accepts variable size keys (such would enable different > partitioners, collections support, primary key indexing etc.), -- This message was sent by Atlassian JIRA (v6.3.4#6332)
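To make the offset-calculation trade-off discussed above concrete, here is a small illustrative Java sketch (not SASI's actual TokenTree code — the layout and names are assumptions): with constant-size tokens, the position of the i-th entry is a single multiplication, while size-prefixed variable-size tokens force a linear skip over all preceding entries.

```java
// Hypothetical sketch of fixed-size vs. variable-size token offset lookup.
public class TokenOffsets {
    // Constant-size tokens: the i-th entry starts at i * tokenSize.
    static long fixedOffset(int index, int tokenSize) {
        return (long) index * tokenSize;
    }

    // Variable-size tokens encoded as [int size][payload]...: to find the
    // i-th entry we must walk and skip each preceding size-prefixed entry.
    static long variableOffset(int[] payloadSizes, int index) {
        long offset = 0;
        for (int i = 0; i < index; i++) {
            offset += Integer.BYTES + payloadSizes[i]; // size prefix + payload
        }
        return offset;
    }

    public static void main(String[] args) {
        System.out.println(fixedOffset(3, 8));                      // 24
        System.out.println(variableOffset(new int[]{4, 10, 6}, 2)); // (4+4)+(4+10) = 22
    }
}
```

The fixed-size path is O(1) per lookup; the size-prefixed path is O(n) unless an auxiliary offset index is kept, which is the extra "algorithmic" work the comment proposes deferring to a separate ticket.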
[jira] [Updated] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-12336: - Reviewer: Carl Yeksigian Status: Patch Available (was: Open) The patch for CASSANDRA-11988 wasn't thorough enough. While the iterator won't return {{null}} static rows, we may still have {{BaseRows.staticRow == null}} after a transformation, and since transformations are "chained", we can end up passing a {{null}} to an {{applyToStatic()}} call, which shouldn't really be done. Anyway, it's pretty easy to fix by guarding the call to {{applyToStatic()}} (much like the call to {{applyToRow()}} is, for that matter). I was surprised the test added by {{CASSANDRA-11988}} didn't catch it, but it appears to be a timing issue: even though the test sets {{gc_grace == 0}}, it should wait at least 1 second to make sure stuff gets purged, and I can reproduce consistently with that additional wait (which is included in the patch below). | [12336-3.0|https://github.com/pcmanus/cassandra/commits/12336-3.0] | [utests|http://cassci.datastax.com/job/pcmanus-12336-3.0-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12336-3.0-dtest] | | [12336-3.9|https://github.com/pcmanus/cassandra/commits/12336-3.9] | [utests|http://cassci.datastax.com/job/pcmanus-12336-3.9-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12336-3.9-dtest] | > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at >
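The guard described in the status update above — skipping the {{applyToStatic()}} call when the static row is absent, just as {{applyToRow()}} calls are already guarded — can be pictured with a simplified Java sketch. The types here are stand-ins for illustration, not the real org.apache.cassandra.db.transform classes.

```java
// Hypothetical sketch: guard applyToStatic() so a null static row is
// passed through untouched instead of reaching a transformation that
// assumes it is non-null.
public class GuardedTransform {
    interface Row {}

    // A chained transformation step that, like the metric-recording one in
    // the stack trace, fails if handed a null static row.
    static Row applyToStatic(Row staticRow) {
        if (staticRow == null) {
            throw new NullPointerException("unexpected null static row");
        }
        return staticRow;
    }

    // The guarded call site: skip the transformation for an absent static row.
    static Row addStatic(Row staticRow) {
        return staticRow == null ? null : applyToStatic(staticRow);
    }

    public static void main(String[] args) {
        System.out.println(addStatic(null)); // null passes through, no NPE
        Row r = new Row() {};
        System.out.println(addStatic(r) == r); // non-null rows are transformed
    }
}
```

The fix lives entirely at the call site, so each transformation in the chain can keep assuming its static-row input is non-null.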
[jira] [Commented] (CASSANDRA-11960) Hints are not seekable
[ https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399102#comment-15399102 ] Branimir Lambov commented on CASSANDRA-11960: - I pushed an update to your branch [here|https://github.com/blambov/cassandra/tree/spodkowinski/WIP-11960] which replaces the single {{long}} position we stored with a mark object that includes compressed and uncompressed positions and can be used to seek quickly and accurately. This should be able to solve the problem for all hint file variations (plain, compressed, encrypted); we must test all of them though. I tried to go back to the old code in {{HintsDispatcher}} but had to turn off {{RETRY}} to make the tests work. I must admit I haven't yet looked closely at all the changes you made and don't understand why this should be necessary. > Hints are not seekable > -- > > Key: CASSANDRA-11960 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11960 > Project: Cassandra > Issue Type: Bug >Reporter: Robert Stupp >Assignee: Stefan Podkowinski > > Got the following error message on trunk. No idea how to reproduce. But the > only thing the (not overridden) seek method does is throwing this exception. > {code} > ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - > Exception in thread Thread[HintsDispatcher:2,1,main] > java.lang.UnsupportedOperationException: Hints are not seekable. 
> at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_91] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_91] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
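The mark object described in the comment above — pairing a compressed on-disk position with an uncompressed offset so a reader over any hint file variant (plain, compressed, encrypted) can seek accurately — can be pictured with a minimal Java sketch. The field names and the plain-file convention are assumptions for illustration, not Cassandra's actual hints classes.

```java
// Hypothetical sketch of a seek mark for (possibly compressed) hint files.
public class HintMark {
    final long chunkPosition;  // start of the compressed chunk on disk
    final long offsetInChunk;  // uncompressed offset within that chunk

    HintMark(long chunkPosition, long offsetInChunk) {
        this.chunkPosition = chunkPosition;
        this.offsetInChunk = offsetInChunk;
    }

    // For a plain (uncompressed) file, the chunk position is just the
    // absolute offset, so both variants share one mark type.
    static HintMark plain(long position) {
        return new HintMark(position, 0);
    }

    // Two marks inside the same chunk can be reached with a single
    // disk seek followed by an in-memory skip.
    boolean sameChunk(HintMark other) {
        return chunkPosition == other.chunkPosition;
    }

    // Absolute uncompressed position, given where this chunk starts in
    // the uncompressed stream.
    long uncompressedPosition(long chunkUncompressedStart) {
        return chunkUncompressedStart + offsetInChunk;
    }

    public static void main(String[] args) {
        HintMark mark = new HintMark(4096, 120);
        System.out.println(mark.uncompressedPosition(65536)); // 65656
        System.out.println(mark.sameChunk(new HintMark(4096, 300))); // true
    }
}
```

A single {{long}} cannot express "inside the third compressed chunk, 120 bytes in", which is why the comment replaces it with a two-part mark.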
[jira] [Commented] (CASSANDRA-12215) NullPointerException during Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-12215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399099#comment-15399099 ] Sylvain Lebresne commented on CASSANDRA-12215: -- It's the same trace as in CASSANDRA-12336, so let's follow up there (but I have all the information I need to fix the problem now). > NullPointerException during Compaction > -- > > Key: CASSANDRA-12215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12215 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: Cassandra 3.0.8, cqlsh 5.0.1 >Reporter: Hau Phan > Fix For: 3.0.x > > > Running 3.0.8 on a single standalone node with cqlsh 5.0.1, the keyspace RF = > 1 and class SimpleStrategy. > Attempting to run a 'select * from ' and receiving this error: > ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation > failed - received 0 responses and 1 failures" info={'failures': 1, > 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} > Cassandra system.log prints this: > {code} > ERROR [CompactionExecutor:5] 2016-07-15 13:42:13,219 CassandraDaemon.java:201 > - Exception in thread Thread[CompactionExecutor:5,1,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:58) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177) > 
~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) > ~[apache-cassandra-3.0.8.jar:3.0.8] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_65] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_65] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_65] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_65] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65] > {code} > Doing a sstabledump -d shows a few rows with the column value of > "", telling me compaction doesn't seem to be working correctly. 
> # nodetool compactionstats > pending tasks: 1 > attempting to run a compaction gets: > # nodetool compact > error: null > -- StackTrace -- > java.lang.NullPointerException > at > org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:58) > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64) > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > at > org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:606) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at >
[jira] [Assigned] (CASSANDRA-12336) NullPointerException during compaction on table with static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reassigned CASSANDRA-12336: Assignee: Sylvain Lebresne > NullPointerException during compaction on table with static columns > --- > > Key: CASSANDRA-12336 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12336 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: cqlsh 5.0.1 > Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0) >Reporter: Evan Prothro >Assignee: Sylvain Lebresne > Fix For: 3.0.9 > > > After being affected by > https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. > Compaction still fails with the following trace: > {code} > WARN [SharedPool-Worker-2] 2016-07-28 10:51:56,111 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_72] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[main/:na] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [main/:na] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72] > Caused by: java.lang.NullPointerException: null > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460) > ~[main/:na] > at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) > ~[main/:na] > at > 
org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) > ~[main/:na] > at > org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438) > ~[main/:na] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) > ~[main/:na] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134) > ~[main/:na] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) > ~[main/:na] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) > ~[main/:na] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) > ~[main/:na] > ... 5 common frames omitted > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12282) SSTablesIteratedTest.testDeletionOnIndexedSSTableASC-compression failure
[ https://issues.apache.org/jira/browse/CASSANDRA-12282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399098#comment-15399098 ] Stefania commented on CASSANDRA-12282: -- When this test fails it's because there is an unexpected sstable, and this is caused by a flush operation that is triggered by a schema change. The clean-up tasks in CQLTester.afterTest() are causing this schema change, and they are currently asynchronous: {code} DEBUG [OptionalTasks:1] 2016-07-29 09:52:30,992 java.lang.Thread.getStackTrace(Thread.java:1552) org.apache.cassandra.db.ColumnFamilyStore.getCurrentStackTrace(ColumnFamilyStore.java:866) org.apache.cassandra.db.ColumnFamilyStore.logFlush(ColumnFamilyStore.java:896) org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:854) org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:838) org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:921) org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.flushDataFrom(AbstractCommitLogSegmentManager.java:452) org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:314) org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:220) org.apache.cassandra.config.Schema.dropTable(Schema.java:692) org.apache.cassandra.schema.SchemaKeyspace.lambda$updateKeyspace$376(SchemaKeyspace.java:1343) org.apache.cassandra.schema.SchemaKeyspace$$Lambda$162/1250499735.accept(Unknown Source) java.util.HashMap$Values.forEach(HashMap.java:972) java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080) org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1343) org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1313) org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:512) 
org.apache.cassandra.service.MigrationManager.announceColumnFamilyDrop(MigrationManager.java:466) org.apache.cassandra.cql3.statements.DropTableStatement.announceMigration(DropTableStatement.java:93) org.apache.cassandra.cql3.statements.SchemaAlteringStatement.executeInternal(SchemaAlteringStatement.java:120) org.apache.cassandra.cql3.CQLTester.schemaChange(CQLTester.java:669) org.apache.cassandra.cql3.CQLTester$2.run(CQLTester.java:294) java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) java.util.concurrent.FutureTask.run(FutureTask.java:266) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} Note the trace at CQLTester.java line 294. The cleanup operations need to be asynchronous to reduce the runtime of CQL tests, see CASSANDRA-7327. However, this specific test doesn't need to drop all previous tables every time a single test is run. So I'm thinking of adding an opt-out mechanism to the cleanup done after each test, in which case we would only clean up after the entire test suite has executed. 
> SSTablesIteratedTest.testDeletionOnIndexedSSTableASC-compression failure > > > Key: CASSANDRA-12282 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12282 > Project: Cassandra > Issue Type: Test >Reporter: Joshua McKenzie >Assignee: Stefania > Labels: unittest > > Error Message > expected:<3> but was:<4> > Stacktrace > junit.framework.AssertionFailedError: expected:<3> but was:<4> > at > org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.executeAndCheck(SSTablesIteratedTest.java:45) > at > org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.testDeletionOnIndexedSSTableASC(SSTablesIteratedTest.java:348) > at > org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.testDeletionOnIndexedSSTableASC(SSTablesIteratedTest.java:312) > [Failure|http://cassci.datastax.com/job/cassandra-3.9_testall/lastCompletedBuild/testReport/org.apache.cassandra.cql3.validation.miscellaneous/SSTablesIteratedTest/testDeletionOnIndexedSSTableASC_compression/] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
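The opt-out cleanup idea proposed above could be sketched roughly as follows. This is an illustrative model, not the actual CQLTester code: the names (CleanupHarness, perTestCleanup, afterTest, afterSuite) are hypothetical, chosen only to show the shape of "skip per-test table drops, defer them to the end of the suite" so the schema change (and the flush it triggers) doesn't interfere with individual tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an opt-out cleanup mechanism for a test harness
// that normally drops tables after every test. A suite that opts out keeps
// its tables until the whole suite has run, avoiding mid-suite schema
// changes (and the flushes they trigger).
class CleanupHarness {
    private final boolean perTestCleanup;                    // opt-out flag
    private final List<String> pendingTables = new ArrayList<>();
    final List<String> droppedTables = new ArrayList<>();    // visible for the example

    CleanupHarness(boolean perTestCleanup) {
        this.perTestCleanup = perTestCleanup;
    }

    void createTable(String name) {
        pendingTables.add(name);
    }

    // Called after each test: drops tables only when per-test cleanup is on.
    void afterTest() {
        if (perTestCleanup) {
            dropAll();
        }
    }

    // Called once after the whole suite: always drops whatever is left.
    void afterSuite() {
        dropAll();
    }

    private void dropAll() {
        droppedTables.addAll(pendingTables);
        pendingTables.clear();
    }
}
```

With `perTestCleanup = false`, a table created in a test survives `afterTest()` untouched and is only dropped by `afterSuite()`, so no unexpected sstable appears mid-suite.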
[jira] [Commented] (CASSANDRA-12236) RTE from new CDC column breaks in flight queries.
[ https://issues.apache.org/jira/browse/CASSANDRA-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399095#comment-15399095 ] Sylvain Lebresne commented on CASSANDRA-12236: -- The upgrade test has finished, but something went wrong. Looking at the console output, at least some tests "seems" to have run successfully, but there is some timeout during the "POST BUILD TASK". I'm not entirely sure what to make of that so I'll restart the job in case that was a temporary env issue. > RTE from new CDC column breaks in flight queries. > - > > Key: CASSANDRA-12236 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12236 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Sylvain Lebresne > Fix For: 3.x > > Attachments: 12236.txt > > > This RTE is not harmless. It will cause the internode connection to break > which will cause all in flight requests between these nodes to die/timeout. > {noformat} > - Due to changes in schema migration handling and the storage format > after 3.0, you will > see error messages such as: > "java.lang.RuntimeException: Unknown column cdc during > deserialization" > in your system logs on a mixed-version cluster during upgrades. This > error message > is harmless and due to the 3.8 nodes having cdc added to their schema > tables while > the <3.8 nodes do not. This message should cease once all nodes are > upgraded to 3.8. > As always, refrain from schema changes during cluster upgrades. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
svn commit: r1754493 - in /cassandra/site: publish/ publish/community/ publish/css/ publish/doc/ publish/doc/3.7/ publish/doc/3.7/architecture/ publish/doc/3.7/configuration/ publish/doc/3.7/cql/ publ
Author: slebresne Date: Fri Jul 29 09:59:38 2016 New Revision: 1754493 URL: http://svn.apache.org/viewvc?rev=1754493=rev Log: New website (that includes new documentation) [This commit notification would consist of 78 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398930#comment-15398930 ] Stefania commented on CASSANDRA-9318: - It's in pretty good shape now! :) Let's rebase and start the testing. > Bound the number of in-flight requests at the coordinator > - > > Key: CASSANDRA-9318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9318 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths, Streaming and Messaging >Reporter: Ariel Weisberg >Assignee: Sergio Bossa > Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, > limit.btm, no_backpressure.png > > > It's possible to somewhat bound the amount of load accepted into the cluster > by bounding the number of in-flight requests and request bytes. > An implementation might do something like track the number of outstanding > bytes and requests and if it reaches a high watermark disable read on client > connections until it goes back below some low watermark. > Need to make sure that disabling read on the client connection won't > introduce other issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
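The high/low watermark scheme described in the issue could be sketched as below. This is not Cassandra code; the class and method names (InflightLimiter, onRequestStart, onRequestDone) are hypothetical, and it only illustrates the hysteresis idea: stop reading from client connections once outstanding request bytes cross a high watermark, and resume only after they fall back below a lower one.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of bounding in-flight request bytes at a coordinator.
// Above the high watermark, reads from client connections are disabled;
// they are re-enabled only once the total drops below the low watermark,
// so the limiter doesn't flap around a single threshold.
class InflightLimiter {
    private final long lowWatermark;
    private final long highWatermark;
    private final AtomicLong outstandingBytes = new AtomicLong();
    private volatile boolean readsEnabled = true;

    InflightLimiter(long lowWatermark, long highWatermark) {
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
    }

    // Called when a request is admitted from a client connection.
    void onRequestStart(long bytes) {
        if (outstandingBytes.addAndGet(bytes) >= highWatermark) {
            readsEnabled = false;   // stop accepting new work
        }
    }

    // Called when the request completes (response sent or failed).
    void onRequestDone(long bytes) {
        if (outstandingBytes.addAndGet(-bytes) <= lowWatermark) {
            readsEnabled = true;    // safe to read from clients again
        }
    }

    boolean readsEnabled() {
        return readsEnabled;
    }
}
```

The gap between the two watermarks is what prevents rapid enable/disable cycling; the remaining question raised in the issue — whether disabling read on the client connection introduces other problems — is outside the scope of this sketch.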
[jira] [Commented] (CASSANDRA-12236) RTE from new CDC column breaks in flight queries.
[ https://issues.apache.org/jira/browse/CASSANDRA-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398914#comment-15398914 ] Sylvain Lebresne commented on CASSANDRA-12236: -- I fixed the unit test failures (mainly due to various minor errors in the change to {{RowUpdateBuilder}} I did in the tests) in a new commit. There are 2 dtest failures for cqlsh DESCRIBE, but that's just because a consequence of this patch is that the {{cdc}} table property won't be displayed by DESCRIBE by default, which is imo fine. I have a trivial fix for that locally that I'll push on commit. The rest of the dtest failures "seems" unrelated. I seem to have been able to start upgrade tests on that last branch which I include below, but they haven't finished at the time of this writing so we'll see the results. | [12236-trunk|https://github.com/pcmanus/cassandra/commits/12236-trunk] | [utests|http://cassci.datastax.com/job/pcmanus-12236-trunk-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12236-trunk-dtest] | [upgrade tests|http://cassci.datastax.com/view/Dev/view/pcmanus/job/pcmanus-upgrade_12236-upgrade/] | > RTE from new CDC column breaks in flight queries. > - > > Key: CASSANDRA-12236 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12236 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremiah Jordan >Assignee: Sylvain Lebresne > Fix For: 3.x > > Attachments: 12236.txt > > > This RTE is not harmless. It will cause the internode connection to break > which will cause all in flight requests between these nodes to die/timeout. > {noformat} > - Due to changes in schema migration handling and the storage format > after 3.0, you will > see error messages such as: > "java.lang.RuntimeException: Unknown column cdc during > deserialization" > in your system logs on a mixed-version cluster during upgrades. This > error message > is harmless and due to the 3.8 nodes having cdc added to their schema > tables while > the <3.8 nodes do not. 
This message should cease once all nodes are > upgraded to 3.8. > As always, refrain from schema changes during cluster upgrades. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11465) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-11465: - Component/s: Observability > dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test > -- > > Key: CASSANDRA-11465 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11465 > Project: Cassandra > Issue Type: Bug > Components: Observability >Reporter: Philip Thompson >Assignee: Stefania > Labels: dtest > Fix For: 2.2.8, 3.0.9, 3.9 > > > Failing on the following assert, on trunk only: > {{self.assertEqual(len(errs[0]), 1)}} > Is not failing consistently. > example failure: > http://cassci.datastax.com/job/trunk_dtest/1087/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test > Failed on CassCI build trunk_dtest #1087 -- This message was sent by Atlassian JIRA (v6.3.4#6332)