[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client

2016-07-29 Thread Geoffrey Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Yu updated CASSANDRA-12311:

Attachment: 12311-trunk-v2.txt

I've attached an updated patch that removes the new exception and instead adds 
a new {{reason}} field within {{ReadFailureException}} that can be used to 
indicate why the read query failed.
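As a rough illustration of the idea (the reason codes, class names, and aggregation logic below are hypothetical sketches, not Cassandra's actual internals), the coordinator could collect per-replica failure reasons and attach the dominant one to the exception returned to the client:

```python
# Hypothetical sketch only: constants and classes are illustrative,
# not the real Cassandra implementation.
READ_TOO_MANY_TOMBSTONES = 0x0001
UNKNOWN_REASON = 0x0000

class ReadFailure(Exception):
    """Client-facing read failure carrying a machine-readable reason."""
    def __init__(self, received, required, reason):
        super().__init__(f"Read failed ({received}/{required} responses, "
                         f"reason=0x{reason:04x})")
        self.reason = reason

def coordinate_read(replica_reasons, received, required):
    """Raise ReadFailure carrying the most common reason replicas reported."""
    if received >= required:
        return "ok"
    known = [r for r in replica_reasons if r != UNKNOWN_REASON]
    reason = max(set(known), key=known.count) if known else UNKNOWN_REASON
    raise ReadFailure(received, required, reason)
```

With a scheme like this the client can distinguish "too many tombstones" from a generic replica failure without a new exception type.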

> Propagate TombstoneOverwhelmingException to the client
> --
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Geoffrey Yu
>Assignee: Geoffrey Yu
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk-v2.txt, 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a 
> {{TombstoneOverwhelmingException}}, it only responds back to the coordinator 
> node with a generic failure. Under this scheme, the coordinator won't be able 
> to know exactly why the request failed and subsequently the client only gets 
> a generic {{ReadFailureException}}. It would be useful to inform the client 
> that their read failed because we read too many tombstones. We should have 
> the data nodes reply with a failure type so the coordinator can pass this 
> information to the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12349) Adding some new features to cqlsh

2016-07-29 Thread vin01 (JIRA)
vin01 created CASSANDRA-12349:
-

 Summary: Adding some new features to cqlsh
 Key: CASSANDRA-12349
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12349
 Project: Cassandra
  Issue Type: New Feature
 Environment: All
Reporter: vin01
Priority: Minor


I would like to have the following features in cqlsh; I have made a patch to 
enable them as well.

1. Aliases.
2. Safe mode (prompt on delete, update, truncate, drop if safe_mode is true).
3. Press q to exit.

It's also shared here -> 
https://github.com/vineet01/cassandra/blob/trunk/new_features.txt

Example for aliases :-

cassandra@cqlsh> show <TAB>
;        ALIASES  HOST     SESSION  VERSION
cassandra@cqlsh> show ALIASES ;
Aliases :> {'dk': 'desc keyspaces;', 'sl': 'select * from'}

Now if you type dk and press <TAB>, it will autocomplete it to "desc keyspaces;".

Adding an alias from shell :-

cassandra@cqlsh> alias slu=select * from login.user ;
Alias added successfully - slu : select * from login.user ;
cassandra@cqlsh> show ALIASES ;
Aliases :> {'slu': 'select * from login.user ;', 'dk': 'desc keyspaces;', 'sl': 
'select * from'}
cassandra@cqlsh> slu
Expanded alias to> select * from login.user ;
 username | blacklisted | lastlogin | password   


Adding an alias directly in file :-

Aliases will be kept in the same cqlshrc file:
[aliases]
dk = desc keyspaces;
sl = select * from
sle = select * from login.user ;

Now if we type just "sle", it will autocomplete the rest of it and show the next options.
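A minimal sketch of how such alias expansion could work (illustrative Python only, not the actual patch): read the [aliases] section with configparser and substitute the first token of the input line when it matches an alias.

```python
import configparser

# Sample cqlshrc content mirroring the [aliases] section shown above.
SAMPLE_CQLSHRC = """
[aliases]
dk = desc keyspaces;
sl = select * from
sle = select * from login.user ;
"""

def load_aliases(text):
    """Parse the [aliases] section of a cqlshrc-style INI file."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    return dict(cp["aliases"]) if cp.has_section("aliases") else {}

def expand(line, aliases):
    """Expand the line only when its first whitespace-separated token is an alias."""
    head, _, rest = line.strip().partition(" ")
    if head in aliases:
        return (aliases[head] + (" " + rest if rest else "")).strip()
    return line

aliases = load_aliases(SAMPLE_CQLSHRC)
print(expand("sle", aliases))  # -> select * from login.user ;
```

Keeping expansion keyed on the first token means ordinary statements like "select 1;" pass through untouched.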


Example of safe mode :-

cassandra@cqlsh> truncate login.user ;
Are you sure you want to do this? (y/n) > n
Not performing any action.

cassandra@cqlsh> update login.user set password=null;
Are you sure you want to do this? (y/n) > 
Not performing any action.
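The confirmation logic could look roughly like this (an illustrative sketch under the assumption that safe mode guards exactly the four verbs listed above; not the actual patch):

```python
# Destructive statement prefixes guarded by safe mode (assumption from the
# feature list above: delete, update, truncate, drop).
UNSAFE_PREFIXES = ("delete", "update", "truncate", "drop")

def needs_confirmation(statement, safe_mode):
    """True when safe mode is on and the statement starts with a destructive verb."""
    return safe_mode and statement.strip().lower().startswith(UNSAFE_PREFIXES)

def run_statement(statement, confirm, safe_mode=True):
    """confirm is a callable taking the prompt and returning the user's answer."""
    if needs_confirmation(statement, safe_mode):
        if confirm("Are you sure you want to do this? (y/n) > ") != "y":
            return "Not performing any action."
    return "executed"
```

Injecting the prompt as a callable keeps the check testable without an interactive terminal.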

Initial commit :- 
https://github.com/vineet01/cassandra/commit/0bfce2ccfc610021a74a1f82ed24aa63e1b72fec

Current version :- https://github.com/vineet01/cassandra/blob/trunk/bin/cqlsh.py

Please review and suggest any improvements.





[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: no_vnodes.jpg
256_vnodes.jpg

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>
> This ticket is following on from the 2015 NGCC.  This ticket is designed to 
> be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra.  That is, a path 
> optimized to grab a large portion of the data for a table (potentially all of 
> it).  This is a core element in the Spark integration with Cassandra, and the 
> speed at which Cassandra can deliver bulk data to Spark is limiting the 
> performance of Spark-plus-Cassandra operations.  This is especially of 
> importance as Cassandra will (likely) leverage Spark for internal operations 
> (for example CASSANDRA-8234).
> The core CQL to consider is the following:
> SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
> Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps 
> considering the primary range of a node without vnodes, for example).  This 
> query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
> operations via Spark (or other processing frameworks - ETL, etc).  There are 
> a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered.  First, we consider a 
> new "Streaming Compaction" approach.  The key observation here is that a bulk 
> read from Cassandra is a lot like a major compaction, though instead of 
> outputting a new SSTable we would output CQL rows to a stream/socket/etc.  
> This would be similar to a CompactionTask, but would strip out some 
> unnecessary things in there (e.g., some of the indexing, etc). Predicates and 
> projections could also be encapsulated in this new "StreamingCompactionTask", 
> for example.
> Another approach would be an alternate storage format.  For example, we might 
> employ Parquet (just as an example) to store the same data as in the primary 
> Cassandra storage (aka SSTables).  This is akin to Global Indexes (an 
> alternate storage of the same data optimized for a particular query).  Then, 
> Cassandra can choose to leverage this alternate storage for particular CQL 
> queries (e.g., range scans).
> These are just 2 suggestions to get the conversation going.
> One thing to note is that it will be useful to have this storage segregated 
> by token range so that when you extract via these mechanisms you do not get 
> replications-factor numbers of copies of the data.  That will certainly be an 
> issue for some Spark operations (e.g., counting).  Thus, we will want 
> per-token-range storage (even for single disks), so this will likely leverage 
> CASSANDRA-6696 (though, we'll want to also consider the single disk case).
> It is also worth discussing what the success criteria is here.  It is 
> unlikely to be as fast as EDW or HDFS performance (though, that is still a 
> good goal), but being within some percentage of that performance should be 
> set as success.  For example, 2x as long as doing bulk operations on HDFS 
> with similar node count/size/etc.
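For concreteness, the kind of token-range scan described above can be sketched as follows (illustrative Python: it assumes the Murmur3Partitioner token space, and `partitionKey` is a placeholder column name from the example query):

```python
MIN_TOKEN = -(2 ** 63)      # Murmur3Partitioner tokens span [-2^63, 2^63 - 1]
MAX_TOKEN = 2 ** 63 - 1

def split_token_ring(n):
    """Split the full ring into n contiguous half-open (X, Y] ranges.

    Note: the single token equal to MIN_TOKEN is excluded by the strict
    lower bound; a real implementation would handle that edge case.
    """
    width = (MAX_TOKEN - MIN_TOKEN) // n
    bounds = [MIN_TOKEN + i * width for i in range(n)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))

def bulk_read_queries(keyspace, table, columns, n):
    """One SELECT per range; together the ranges tile the whole ring."""
    return [
        f"SELECT {', '.join(columns)} FROM {keyspace}.{table} "
        f"WHERE Token(partitionKey) > {x} AND Token(partitionKey) <= {y}"
        for x, y in split_token_ring(n)
    ]
```

Issuing one such query per primary range (rather than per replica) is what avoids reading replication-factor copies of the same data.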





[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: 256_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: before_after.jpg, bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> spark_benchmark_raw_data.zip
>
>





[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: no_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: before_after.jpg, bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> spark_benchmark_raw_data.zip
>
>





[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: before_after.jpg

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>





[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: before_after.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>





[jira] [Updated] (CASSANDRA-11521) Implement streaming for bulk read requests

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-11521:
-
Status: Patch Available  (was: In Progress)

> Implement streaming for bulk read requests
> --
>
> Key: CASSANDRA-11521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local Write-Read Paths
>Reporter: Stefania
>Assignee: Stefania
>  Labels: client-impacting, protocolv5
> Fix For: 3.x
>
> Attachments: final-patch-jfr-profiles-1.zip
>
>
> Allow clients to stream data from a C* host, bypassing the coordination layer 
> and eliminating the need to query individual pages one by one.





[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests

2016-07-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400395#comment-15400395
 ] 

Stefania commented on CASSANDRA-11521:
--

The patch is ready for review:

||trunk|[patch|https://github.com/stef1927/cassandra/commits/11521]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11521-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11521-dtest/]|

There are also the [driver 
patch|https://github.com/stef1927/java-driver/commits/11521] and the [spark 
connector 
patch|https://github.com/stef1927/spark-cassandra-connector/commits/11521]. For 
these I plan to create tickets for the respective projects once the native 
protocol changes have been finalized.

A [design 
document|https://docs.google.com/document/d/1YqKGSU1P8EJIfMrO--29VaSoCy5mUu-ePfAiIOLsY7o/edit]
 is also available.

The Spark benchmark results are available in [this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-9259?focusedCommentId=15400394=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400394]
 on the parent ticket. The final patch is slightly better than the 
proof-of-concept, and the asynchronous paging mechanism significantly 
outperforms the existing mechanism for large data sets.

I've also repeated some cstar_perf tests to rule out performance regressions 
with ordinary queries, which are not in the optimized path:

* Single partition queries (default cassandra-stress read command) at 
CL.LOCAL_ONE (the cassandra-stress default): [first 
run|http://cstar.datastax.com/graph?command=one_job=8b1f1d54-53e4-11e6-85af-0256e416528f=99th_latency=2_read=1_aggregates=true=0=276.98=0=22.33],
 [second run with swapped revision's 
order|http://cstar.datastax.com/graph?command=one_job=1abd3fe4-545e-11e6-8920-0256e416528f=op_rate=2_read=1_aggregates=true=0=277.86=0=243951.4],
 [an old 
run|http://cstar.datastax.com/graph?command=one_job=16cef080-53dc-11e6-b967-0256e416528f=op_rate=2_read=1_aggregates=true=0=282.92=0=249571.3]
 done before enabling token aware routing in cassandra stress.

* Single partition queries at CL.ALL: [unique 
run|http://cstar.datastax.com/graph?command=one_job=e2155410-5462-11e6-9cd7-0256e416528f=op_rate=2_read=1_aggregates=true=0=277.75=0=246123.9]

There is a gap of 3.6K ops/second without token-aware routing and 1K with 
CL=ALL; with token-aware routing the patch is instead 1K ops/second faster. 
These differences must arise from the refactoring in the select statement code. 
They are very small differences (the test error seems to be around 0.5K), but I 
can look into it further if there are concerns. 

> Implement streaming for bulk read requests
> --
>
> Key: CASSANDRA-11521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local Write-Read Paths
>Reporter: Stefania
>Assignee: Stefania
>  Labels: client-impacting, protocolv5
> Fix For: 3.x
>
> Attachments: final-patch-jfr-profiles-1.zip
>
>





[jira] [Commented] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400394#comment-15400394
 ] 

Stefania commented on CASSANDRA-9259:
-

Now that CASSANDRA-11521 is ready for review, I've repeated the Spark benchmark 
defined by CASSANDRA-11542 using schema 1:

{code}
CREATE TABLE ks.schema1 (id TEXT, timestamp BIGINT, val1 INT, val2 INT, val3 
INT, val4 INT, val5 INT, val6 INT, val7 INT, val8 INT, val9 INT, val10 INT, 
PRIMARY KEY (id, timestamp))
{code}

and schema 3:

{code}
CREATE TABLE ks.schema3 (id TEXT, timestamp BIGINT, data TEXT, PRIMARY KEY (id, 
timestamp))
{code}

The benchmark measures how many seconds it takes to count rows and to find the 
maximum of two columns for each row, where rows are retrieved either via Spark 
RDDs or Data Frames (DFs). The most significant difference between RDD and DF 
tests is that in the DF tests only the two columns of interest to the Spark 
query are retrieved, whilst in the RDD tests the entire data set is retrieved. 
The data is either stored in Cassandra or in HDFS using CSV or Parquet files.

More details on the benchmark are available 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11542?focusedCommentId=15249213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15249213]
 and the code is available [here|https://github.com/stef1927/spark-load-perf].

Here is the comparison with the results of the benchmark that was run on 6th 
May with 15M rows, as described in [this 
comment|https://issues.apache.org/jira/browse/CASSANDRA-11542?focusedCommentId=15273884=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15273884].
 We can see that the final results are consistent with the proof of concept, 
which was presented at the Cassandra NGCC conference.

!before_after.jpg!

* C* old: trunk with no optimizations (at 
c662d876b95d67a911dfe549d8a0d38ee6fbb904), and the Spark Connector without 
SPARKC-383
* C* POC: the [proof-of-concept 
patch|https://github.com/stef1927/cassandra/commits/9259], and the Spark 
Connector with an [earlier 
version|https://github.com/stef1927/spark-cassandra-connector/commits/9259] of 
SPARKC-383 
* C* async: the CASSANDRA-11521 patch, with results delivered to the client 
via the new asynchronous paging mechanism
* C* sync: the CASSANDRA-11521 patch, with results delivered to the client 
via the existing synchronous paging mechanism

Here are the results run over several incremental data sets at 15M, 30M, 60M 
and 120M rows with 256 VNODES:

!256_vnodes.jpg!

Below are the results run over several incremental data sets at 15M, 30M, 60M 
and 120M rows without VNODES:

!no_vnodes.jpg!


The raw data is attached [^spark_benchmark_raw_data.zip].

h5. Conclusions

* Overall the duration of the 15M row test was improved by 65% (from about 40 
to 14 seconds) for schema 1 and by 56% (from 23 to 10 seconds) for schema 3.

* The new asynchronous paging mechanism significantly outperforms the existing 
mechanism with large data sets; for example, for schema 1 and 120M rows, it is 
approximately 30% faster. To achieve this, however, the driver must reserve a 
connection per asynchronous paging request: sharing connections degrades 
performance significantly and makes it no better than the existing mechanism.

* CSV still outperforms C* for the schema 1 RDD tests. However, for the DF tests 
and the schema 3 RDD tests, C* is now on par with or faster than CSV. This 
indicates that the number of cells in CQL rows continues to impact performance.

* Parquet is in a league of its own due to its efficient columnar format. It 
should however be noted that it may be storing the row count in metadata. A 
more meaningful benchmark could have been obtained had we excluded the row 
count from the time measurements.
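As a toy model of the difference between the two paging mechanisms (illustrative only; the real protocol-level behaviour is in the patch and design document): synchronous paging requests the next page only after the current one is consumed, whereas asynchronous paging lets the server keep pushing pages on a dedicated connection while the client consumes them concurrently.

```python
import queue
import threading

def sync_paging(fetch_page):
    """Existing mechanism: fetch_page(state) -> (page, next_state or None);
    the next request is issued only after the current page is consumed."""
    rows, state = [], None
    while True:
        page, state = fetch_page(state)
        rows.extend(page)
        if state is None:
            return rows

def async_paging(fetch_page):
    """Sketch of the new mechanism: a producer thread stands in for the
    server pushing pages continuously on a dedicated connection."""
    q = queue.Queue()
    def producer():
        state = None
        while True:
            page, state = fetch_page(state)
            q.put(page)
            if state is None:
                q.put(None)  # sentinel: no more pages
                return
    threading.Thread(target=producer, daemon=True).start()
    rows = []
    while (page := q.get()) is not None:
        rows.extend(page)
    return rows
```

Both return the same rows; the win in the real system comes from overlapping server-side reads with client-side consumption instead of alternating them.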

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>
> This ticket is following on from the 2015 NGCC.  This ticket is designed to 
> be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra.  That is, a path 
> optimized to grab a large portion of the data for a table (potentially all of 
> it).  This is a core element in the Spark integration with Cassandra, and the 
> speed at which Cassandra can deliver bulk data to Spark is limiting the 
> performance of Spark-plus-Cassandra operations.

[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: no_vnodes.jpg
256_vnodes.jpg

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>
> This ticket is following on from the 2015 NGCC.  This ticket is designed to 
> be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra.  That is, a path 
> optimized to grab a large portion of the data for a table (potentially all of 
> it).  This is a core element in the Spark integration with Cassandra, and the 
> speed at which Cassandra can deliver bulk data to Spark is limiting the 
> performance of Spark-plus-Cassandra operations.  This is especially of 
> importance as Cassandra will (likely) leverage Spark for internal operations 
> (for example CASSANDRA-8234).
> The core CQL to consider is the following:
> SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
> Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps 
> considering the primary range of a node without vnodes, for example).  This 
> query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
> operations via Spark (or other processing frameworks - ETL, etc).  There are 
> a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered.  First, we consider a 
> new "Streaming Compaction" approach.  The key observation here is that a bulk 
> read from Cassandra is a lot like a major compaction, though instead of 
> outputting a new SSTable we would output CQL rows to a stream/socket/etc.  
> This would be similar to a CompactionTask, but would strip out some 
> unnecessary things in there (e.g., some of the indexing, etc). Predicates and 
> projections could also be encapsulated in this new "StreamingCompactionTask", 
> for example.
> Another approach would be an alternate storage format.  For example, we might 
> employ Parquet (just as an example) to store the same data as in the primary 
> Cassandra storage (aka SSTables).  This is akin to Global Indexes (an 
> alternate storage of the same data optimized for a particular query).  Then, 
> Cassandra can choose to leverage this alternate storage for particular CQL 
> queries (e.g., range scans).
> These are just 2 suggestions to get the conversation going.
> One thing to note is that it will be useful to have this storage segregated 
> by token range so that when you extract via these mechanisms you do not get 
> replications-factor numbers of copies of the data.  That will certainly be an 
> issue for some Spark operations (e.g., counting).  Thus, we will want 
> per-token-range storage (even for single disks), so this will likely leverage 
> CASSANDRA-6696 (though, we'll want to also consider the single disk case).
> It is also worth discussing what the success criteria is here.  It is 
> unlikely to be as fast as EDW or HDFS performance (though, that is still a 
> good goal), but being within some percentage of that performance should be 
> set as success.  For example, 2x as long as doing bulk operations on HDFS 
> with similar node count/size/etc.
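The bulk-read pattern in the quoted description — covering the whole ring with non-overlapping `Token(partitionKey)` ranges — can be sketched as follows. This is an illustrative splitter over the Murmur3Partitioner token space, not code from any patch; the table and column names are taken from the example query above, and a real implementation would also handle the minimum token and vnode boundaries.

```python
# Split the Murmur3Partitioner token space (-2**63 .. 2**63 - 1) into
# contiguous, non-overlapping (X, Y] ranges and emit one bulk-read CQL
# query per range, mirroring the example query in the ticket.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

def token_ranges(num_splits):
    """Yield (start, end] token ranges covering the full ring."""
    step = (MAX_TOKEN - MIN_TOKEN) // num_splits
    start = MIN_TOKEN
    for i in range(num_splits):
        # The last range absorbs any remainder so the ring is fully covered.
        end = MAX_TOKEN if i == num_splits - 1 else start + step
        yield (start, end)
        start = end

def range_queries(num_splits):
    return [
        f"SELECT a, b, c FROM myKs.myTable "
        f"WHERE Token(partitionKey) > {x} AND Token(partitionKey) <= {y}"
        for x, y in token_ranges(num_splits)
    ]

queries = range_queries(4)  # four non-overlapping scans covering the ring
```

Each query can then be issued against a replica owning that range, which is essentially what the Spark connector does to avoid reading replication-factor copies of the data.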



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: 256_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: before_after.jpg, bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: no_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: before_after.jpg, bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: no_vnodes.jpg
before_after.jpg
256_vnodes.jpg

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: no_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: before_after.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: (was: 256_vnodes.jpg)

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz, 
> no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-9259:

Attachment: no_vnodes.jpg
before_after.jpg
256_vnodes.jpg
spark_benchmark_raw_data.zip

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 256_vnodes.jpg, before_after.jpg, 
> bulk-read-benchmark.1.html, bulk-read-jfr-profiles.1.tar.gz, 
> bulk-read-jfr-profiles.2.tar.gz, no_vnodes.jpg, spark_benchmark_raw_data.zip
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0

2016-07-29 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400361#comment-15400361
 ] 

Ariel Weisberg commented on CASSANDRA-12228:


I think there is also an issue, which I am still working on nailing down, where 
memory accounting releases memory pinned by memtables too early, or is simply 
off by too much, causing the heap to fill up with memtables that are waiting 
for the post-flush executor. In a heap dump I can see the heap growing to 
double the limit, and things fall apart server side.
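The failure mode described here — memory credited back before the post-flush executor has actually released the memtable — can be illustrated with a toy accounting model. The class and method names below are hypothetical; this is not the Cassandra implementation, only a sketch of why early release lets real retained memory exceed the tracked limit.

```python
# Toy model: if accounting credits a memtable's bytes back when its flush
# *starts* rather than when post-flush completes, the tracked total stays
# under the limit while real retained memory climbs past it.
class Accounting:
    def __init__(self, limit):
        self.limit = limit
        self.tracked = 0    # what the allocator believes is live
        self.retained = []  # memtables actually still pinned on the heap

    def allocate(self, size):
        # The limit check only sees the (under-counted) tracked total.
        assert self.tracked + size <= self.limit, "would block for memory"
        self.tracked += size
        self.retained.append(size)

    def begin_flush_early_release(self, size):
        # The bug being described: memory is credited back at flush start,
        # but the memtable stays retained until the post-flush executor runs.
        self.tracked -= size

acct = Accounting(limit=100)
for _ in range(4):
    acct.allocate(50)                   # passes the limit check every time...
    acct.begin_flush_early_release(50)  # ...because memory was released early

real_heap = sum(acct.retained)  # 200 bytes pinned: double the 100-byte limit
```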

> Write performance regression in 3.x vs 3.0
> --
>
> Key: CASSANDRA-12228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12228
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.9
>
>
> I've been tracking down a performance issue in trunk vs cassandra-3.0 branch.
> I think I've found it.  CASSANDRA-6696 changed the default memtable flush 
> writers to 1 vs the min of 2 in cassandra-3.0.
> I don't see any technical reason for this and we should add back the min of 2 
> sstable flushers per disk.





[jira] [Created] (CASSANDRA-12348) Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest

2016-07-29 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-12348:
-

 Summary: Flaky failures in 
SSTableRewriterTest.basicTest2/getPositionsTest
 Key: CASSANDRA-12348
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12348
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Joel Knighton
 Fix For: 3.x


Example failures:
http://cassci.datastax.com/job/cassandra-3.9_testall/45/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/basicTest2/

http://cassci.datastax.com/job/cassandra-3.9_testall/37/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/

http://cassci.datastax.com/job/trunk_testall/1054/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/

All failures look like the test is finding more files than expected after a 
rewrite.





[jira] [Commented] (CASSANDRA-11687) dtest failure in rebuild_test.TestRebuild.simple_rebuild_test

2016-07-29 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400348#comment-15400348
 ] 

Joel Knighton commented on CASSANDRA-11687:
---

I commented on the dtest PR to remove the known_failure annotation - after this 
PR was merged, I saw a new failure on this test as part of the [daily 3.9 dtest 
run|http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/rebuild_test/TestRebuild/simple_rebuild_test/].
 This also reproduces fairly easily on my local machine; not sure why the 
multiplexer didn't hit it. I think the PRed fix is still prone to races.

> dtest failure in rebuild_test.TestRebuild.simple_rebuild_test
> -
>
> Key: CASSANDRA-11687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Yuki Morishita
>  Labels: dtest
>
> single failure on most recent run (3.0 no-vnode)
> {noformat}
> concurrent rebuild should not be allowed, but one rebuild command should have 
> succeeded.
> {noformat}
> http://cassci.datastax.com/job/cassandra-3.0_novnode_dtest/217/testReport/rebuild_test/TestRebuild/simple_rebuild_test
> Failed on CassCI build cassandra-3.0_novnode_dtest #217





[jira] [Commented] (CASSANDRA-12008) Make decommission operations resumable

2016-07-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400338#comment-15400338
 ] 

Paulo Motta commented on CASSANDRA-12008:
-

Thanks for the update! See follow-up below:

bq. I've added a new strategy, please let me know what do you think about it.

Instead of building a {{streamedRangesPerEndpoints Map}} manually, maybe it's better to modify {{getStreamedRanges}} 
to take a description and keyspace as arguments and return a {{Map}} by querying {{"SELECT * FROM system.streamed_ranges WHERE 
operation = ? AND keyspace_name = ?"}}.

This way you can query {{getStreamedRanges}} directly to filter already 
transferred ranges when iterating {{rangesToStreamByKeyspace}}.
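The filtering step this describes can be sketched with plain collections; the row layout and method shape below are illustrative stand-ins for the real {{system.streamed_ranges}} query, not the actual patch:

```java
import java.util.*;
import java.util.stream.*;

public class StreamedRangesFilter {
    // Illustrative stand-in for system.streamed_ranges rows: each row is
    // {operation, keyspace_name, range}. In Cassandra these would come from
    // "SELECT * FROM system.streamed_ranges WHERE operation = ? AND keyspace_name = ?".
    static Set<String> getStreamedRanges(List<String[]> rows, String operation, String keyspace) {
        return rows.stream()
                   .filter(r -> r[0].equals(operation) && r[1].equals(keyspace))
                   .map(r -> r[2])
                   .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"Unbootstrap", "ks1", "(0,100]"},
                new String[] {"Unbootstrap", "ks2", "(0,100]"});
        // Ranges still to stream = planned ranges minus already-streamed ones.
        Set<String> planned = new HashSet<>(Arrays.asList("(0,100]", "(100,200]"));
        planned.removeAll(getStreamedRanges(rows, "Unbootstrap", "ks1"));
        System.out.println(planned); // only (100,200] remains
    }
}
```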

bq. So instead we will obtain StreamSession from 
StreamTransferTask.getSession() when each StreamTransferTask is complete i.e 
when StreamStateStore.handleStreamEvent is invoked. All these means that we are 
going to only pass its responsible keyspace.

I think we can simplify that and instead of adding {{transferTasks}} to 
{{SessionCompleteEvent}} we can simply add the session description and 
{{transferredRangesPerKeyspace}}, and that's all we will need to populate the 
streamed ranges on {{StreamStateStore}}.

A minor nit is that the transferred ranges are always being overridden in 
{{addTransferRanges}}, while you should append to the existing set if it's 
already present in {{transferredRangesPerKeyspace}}.
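The append-instead-of-overwrite fix amounts to merging into any existing set; a minimal sketch (types simplified, names only mirroring the patch):

```java
import java.util.*;

public class TransferredRanges {
    // Simplified stand-in for transferredRangesPerKeyspace (keyspace -> ranges);
    // the real field and addTransferRanges live in Cassandra's streaming code.
    private final Map<String, Set<String>> transferredRangesPerKeyspace = new HashMap<>();

    // Append to the existing set rather than replacing it, so ranges recorded
    // by earlier transfers for the same keyspace are not lost.
    public void addTransferRanges(String keyspace, Collection<String> ranges) {
        transferredRangesPerKeyspace
                .computeIfAbsent(keyspace, k -> new HashSet<>())
                .addAll(ranges);
    }

    public Set<String> get(String keyspace) {
        return transferredRangesPerKeyspace.getOrDefault(keyspace, Collections.emptySet());
    }

    public static void main(String[] args) {
        TransferredRanges t = new TransferredRanges();
        t.addTransferRanges("ks1", Arrays.asList("(0,100]"));
        t.addTransferRanges("ks1", Arrays.asList("(100,200]"));
        // Both ranges survive; a plain put() would keep only the second set.
        System.out.println(t.get("ks1").size()); // prints 2
    }
}
```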

bq. Don't know if there's some problem with current implementation or there's 
something weird in the set-up, but it skips twice the same range:

This is for different keyspaces, so you should add the keyspace name to the 
log message so it's not confusing.

bq. I think it's the set-up itself since 
StorageService.getChangedRangesForLeaving is also returning the same range twice

That's probably for the same reason as above; maybe it would be a good idea to 
add the keyspace name in that log as well.

> Make decommission operations resumable
> --
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Tom van der Woerdt
>Assignee: Kaide Mu
>Priority: Minor
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
> at java.lang.Thread.run(Thread.java:745)
> 08:04 PM   ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to 
> become normal or restart
> See 'nodetool help' or 'nodetool help '.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - 
> [Stream #] Streaming error occurred
> java.net.SocketTimeoutException: null
> at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) 
> ~[na:1.8.0_77]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
> ~[na:1.8.0_77]
> 

[jira] [Comment Edited] (CASSANDRA-12251) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test

2016-07-29 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400277#comment-15400277
 ] 

Joel Knighton edited comment on CASSANDRA-12251 at 7/30/16 12:10 AM:
-

This looks to me like in drain/StorageServiceShutdownHook, the schema stage is 
not shutdown, so if you have a task submitted to that executor that doesn't 
execute until after the postflush executor has been terminated in drain, you'll 
hit this exception. This is a C* fix for sure.


was (Author: jkni):
This looks to me like in drain/StorageShutdownHook, the schema stage is not 
shutdown, so if you have a task submitted to that executor that doesn't execute 
until after the postflush executor has been terminated in drain, you'll hit 
this exception. This is a C* fix for sure.

> dtest failure in 
> upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test
> --
>
> Key: CASSANDRA-12251
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12251
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.8_dtest_upgrade/1/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x/whole_list_conditional_test
> Failed on CassCI build cassandra-3.8_dtest_upgrade #1
> Relevant error in logs is
> {code}
> Unexpected error in node1 log, error: 
> ERROR [InternalResponseStage:2] 2016-07-20 04:58:45,876 
> CassandraDaemon.java:217 - Exception in thread 
> Thread[InternalResponseStage:2,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>   at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:61)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) 
> ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) 
> ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:165)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
>  ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:842)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:822)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:891)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$1(SchemaKeyspace.java:279)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace$$Lambda$200/1129213153.accept(Unknown
>  Source) ~[na:na]
>   at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:279) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1271)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1253)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_51]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_51]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> This is on a mixed 3.0.8, 3.8-tentative cluster





[jira] [Commented] (CASSANDRA-12251) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test

2016-07-29 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400277#comment-15400277
 ] 

Joel Knighton commented on CASSANDRA-12251:
---

This looks to me like in drain/StorageShutdownHook, the schema stage is not 
shutdown, so if you have a task submitted to that executor that doesn't execute 
until after the postflush executor has been terminated in drain, you'll hit 
this exception. This is a C* fix for sure.
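The exception itself is standard executor behavior and easy to reproduce in isolation: once an {{ExecutorService}} has been shut down, a late submission is rejected rather than queued. A minimal sketch with plain JDK executors (not the actual drain code):

```java
import java.util.concurrent.*;

public class ShutdownRejection {
    public static void main(String[] args) {
        // Stand-in for the post-flush executor terminated during drain.
        ExecutorService postFlush = Executors.newSingleThreadExecutor();
        postFlush.shutdown();
        try {
            // A task arriving after shutdown hits the default AbortPolicy
            // and throws, like the failure in the node1 log above.
            postFlush.submit(() -> System.out.println("flush"));
        } catch (RejectedExecutionException e) {
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
    }
}
```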

> dtest failure in 
> upgrade_tests.cql_tests.TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x.whole_list_conditional_test
> --
>
> Key: CASSANDRA-12251
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12251
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Alex Petrov
>  Labels: dtest
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.8_dtest_upgrade/1/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x/whole_list_conditional_test
> Failed on CassCI build cassandra-3.8_dtest_upgrade #1
> Relevant error in logs is
> {code}
> Unexpected error in node1 log, error: 
> ERROR [InternalResponseStage:2] 2016-07-20 04:58:45,876 
> CassandraDaemon.java:217 - Exception in thread 
> Thread[InternalResponseStage:2,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>   at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:61)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) 
> ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) 
> ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.execute(DebuggableThreadPoolExecutor.java:165)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
>  ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:842)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:822)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:891)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$1(SchemaKeyspace.java:279)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace$$Lambda$200/1129213153.accept(Unknown
>  Source) ~[na:na]
>   at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_51]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:279) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1271)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1253)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_51]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_51]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_51]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> This is on a mixed 3.0.8, 3.8-tentative cluster





[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client

2016-07-29 Thread Geoffrey Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400265#comment-15400265
 ] 

Geoffrey Yu commented on CASSANDRA-12311:
-

Sure, that sounds reasonable to me. I'll make the changes and update the patch.

> Propagate TombstoneOverwhelmingException to the client
> --
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Geoffrey Yu
>Assignee: Geoffrey Yu
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a 
> {{TombstoneOverwhelmingException}}, it only responds back to the coordinator 
> node with a generic failure. Under this scheme, the coordinator won't be able 
> to know exactly why the request failed and subsequently the client only gets 
> a generic {{ReadFailureException}}. It would be useful to inform the client 
> that their read failed because we read too many tombstones. We should have 
> the data nodes reply with a failure type so the coordinator can pass this 
> information to the client.





[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400145#comment-15400145
 ] 

Russ Hatch commented on CASSANDRA-12092:


[~Stefania] any idea what could be causing this test to fail (intermittently) 
but on the same key when it does?

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters
> Failed on CassCI build cassandra-2.1_dtest #484
> {code}
> Standard Error
> Traceback (most recent call last):
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run
> valid_fcn(v)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in 
> validate_counters
> check_all_sessions(s, n, c)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in 
> check_all_sessions
> "value of %s at key %d, instead got these values: %s" % (write_nodes, 
> val, n, results)
> AssertionError: Failed to read value from sufficient number of nodes, 
> required 2 nodes to have a counter value of 1 at key 200, instead got these 
> values: [0, 0, 1]
> {code}





[jira] [Comment Edited] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0

2016-07-29 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400133#comment-15400133
 ] 

Ariel Weisberg edited comment on CASSANDRA-12228 at 7/29/16 10:28 PM:
--

There are some remaining issues with thread pool sizes. See 
[CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086].

You still can't get multiple threads if you have a single disk, due to TPE not 
spinning up additional threads when you are using an unbounded queue. Seems 
like this would be a good place to address the related issue. I also don't 
think this is minor; it's pretty crippling for performance, and you can't work 
around it by changing configuration values.
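The TPE behavior referenced here is standard {{java.util.concurrent}} semantics: extra threads beyond the core size are only created when the queue refuses a task, which an unbounded queue never does. A minimal demonstration:

```java
import java.util.concurrent.*;

public class UnboundedQueueDemo {
    public static void main(String[] args) throws Exception {
        // core=1, max=4, unbounded queue: the pool can never reach max=4,
        // because LinkedBlockingQueue accepts every offered task and TPE only
        // spawns past-core threads when an offer is rejected.
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
                1, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        for (int i = 0; i < 100; i++)
            tpe.submit(() -> {
                try { Thread.sleep(10); } catch (InterruptedException ignored) {}
            });
        Thread.sleep(200);
        System.out.println("pool size: " + tpe.getPoolSize()); // stays at 1
        tpe.shutdownNow();
    }
}
```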


was (Author: aweisberg):
There are some remaining issues with thread pool sizes. See 
[CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086].

You still can't get multiple threads if you have a single disk.  Seems like 
this would be a good place to address the related issue. I also don't think 
this is minor it's pretty crippling for performance and you can't work around 
it by changing configuration values.

> Write performance regression in 3.x vs 3.0
> --
>
> Key: CASSANDRA-12228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12228
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.9
>
>
> I've been tracking down a performance issue in trunk vs cassandra-3.0 branch.
> I think I've found it.  CASSANDRA-6696 changed the default memtable flush 
> writers to 1 vs the min of 2 in cassandra-3.0.
> I don't see any technical reason for this and we should add back the min of 2 
> sstable flushers per disk.





[jira] [Commented] (CASSANDRA-12228) Write performance regression in 3.x vs 3.0

2016-07-29 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400133#comment-15400133
 ] 

Ariel Weisberg commented on CASSANDRA-12228:


There are some remaining issues with thread pool sizes. See 
[CASSANDRA-12071|https://issues.apache.org/jira/browse/CASSANDRA-12071?focusedCommentId=15400086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400086].

You still can't get multiple threads if you have a single disk.  Seems like 
this would be a good place to address the related issue. I also don't think 
this is minor; it's pretty crippling for performance, and you can't work 
around it by changing configuration values.

> Write performance regression in 3.x vs 3.0
> --
>
> Key: CASSANDRA-12228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12228
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.9
>
>
> I've been tracking down a performance issue in trunk vs cassandra-3.0 branch.
> I think I've found it.  CASSANDRA-6696 changed the default memtable flush 
> writers to 1 vs the min of 2 in cassandra-3.0.
> I don't see any technical reason for this and we should add back the min of 2 
> sstable flushers per disk.





[jira] [Resolved] (CASSANDRA-12071) Regression in flushing throughput under load after CASSANDRA-6696

2016-07-29 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg resolved CASSANDRA-12071.

Resolution: Fixed

Going to bring this up in CASSANDRA-12228 since this was resolved and released 
already.

> Regression in flushing throughput under load after CASSANDRA-6696
> -
>
> Key: CASSANDRA-12071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
>Assignee: Marcus Eriksson
> Fix For: 3.8
>
>
> The way flushing used to work is that a ColumnFamilyStore could have multiple 
> Memtables flushing at once and multiple ColumnFamilyStores could flush at the 
> same time. The way it works now there can be only a single flush of any 
> ColumnFamilyStore & Memtable running in the C* process, and the number of 
> threads applied to that flush is bounded by the number of disks in JBOD.
> This works ok most of the time but occasionally flushing will be a little 
> slower and ingest will outstrip it and then block on available memory. At 
> this point you see several second stalls that cause timeouts.
> This is a problem for reasonable configurations that don't use JBOD but have 
> access to a fast disk that can handle some IO queuing (RAID, SSD).
> You can reproduce on beefy hardware (12 cores 24 threads, 64 gigs of RAM, 
> SSD) if you unthrottle compaction or set it to something like 64 
> megabytes/second and run with 8 compaction threads and stress with the 
> default write workload and a reasonable number of threads. I tested with 96.
> It started happening after about 60 gigabytes of data was loaded.





[jira] [Commented] (CASSANDRA-12346) Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views

2016-07-29 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400097#comment-15400097
 ] 

Jason Brown commented on CASSANDRA-12346:
-

Here is the branch for an implementation of hyparview:

||hyparview||
|[branch|https://github.com/jasobrown/cassandra/tree/broadcast_hyparview]|
|[dtest|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-broadcast_hyparview-dtest/]|
|[testall|http://cassci.datastax.com/view/Dev/view/jasobrown/job/jasobrown-broadcast_hyparview-testall/]|

I've documented it rather extensively, so I hope that aids the reviewer. There 
are still a couple of (very) minor things left to clean up, including the 
simulator (implemented as a long test), but that should not hinder any review, 
I think.


> Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views
> 
>
> Key: CASSANDRA-12346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jason Brown
>Assignee: Jason Brown
>  Labels: gossip
>
> A [Peer Sampling 
> Service|infoscience.epfl.ch/record/83409/files/neg--1184036295all.pdf] is a 
> module that provides a partial view of a cluster to dependent modules. A 
> node's partial view, combined with all other nodes' partial views, combine to 
> create a fully-connected mesh over the cluster. This way, a given node does 
> not need to have direct connections to every other node in the cluster, and 
> can be much more efficient in terms of resource management as well as 
> information dissemination. Peer Sampling Services by their nature must be 
> self-healing and self-balancing to maintain the fully-connected mesh.
> I propose we use an algorithm based on 
> [HyParView|http://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf], which is a concrete 
> algorithm for a Peer Sampling Service. HyParView has a clearly defined 
> protocol, and is reasonably simple to implement.
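As a toy illustration of the partial-view idea only (the real HyParView protocol, with its active/passive views and join/shuffle messages, is in the paper and the branch linked above), a bounded view stays small no matter how large the cluster grows:

```java
import java.util.*;

public class PartialView {
    private final int fanout;
    private final List<String> activeView = new ArrayList<>();
    private final Random random = new Random();

    public PartialView(int fanout) { this.fanout = fanout; }

    // Adding a peer to a full view evicts a random member, so the view stays
    // bounded regardless of cluster size.
    public void addPeer(String peer) {
        if (activeView.contains(peer))
            return;
        if (activeView.size() >= fanout)
            activeView.remove(random.nextInt(activeView.size()));
        activeView.add(peer);
    }

    public List<String> peers() { return new ArrayList<>(activeView); }

    public static void main(String[] args) {
        PartialView view = new PartialView(3);
        for (int i = 0; i < 10; i++)
            view.addPeer("node" + i);
        System.out.println(view.peers().size()); // bounded at 3
    }
}
```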





[jira] [Updated] (CASSANDRA-12347) Gossip 2.0 - broadcast tree for data dissemination

2016-07-29 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12347:
--
Description: 
Description: A broadcast tree (spanning tree) allows an originating node to 
efficiently send out updates to all of the peers in the cluster by constructing 
a balanced, self-healing tree based upon the view it gets from the peer 
sampling service (CASSANDRA-12346). 

I propose we use an algorithm based on the [Thicket 
paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which 
describes a dynamic, self-healing broadcast tree. When a given node needs to 
send out a message, it dynamically builds a tree for each node in the cluster; 
thus giving us a unique tree for every node in the cluster (a tree rooted at 
every cluster node). The trees, of course, would be reusable until the cluster 
configurations changes or failures are detected (by the mechanism described in 
the paper). Additionally, Thicket includes a mechanism for load-balancing the 
trees such that nodes spread out the work amongst themselves.


  was:
Description: A broadcast tree (spanning tree) allows an originating node to 
efficiently send out updates to all of the peers in the cluster by constructing 
a balanced, self-healing tree based upon the view it gets from the peer 
sampling service (CASSANDRA-12346). 

I propose we use an algorithm based on the [Thicket 
paper|www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which describes a 
dynamic, self-healing broadcast tree. When a given node needs to send out a 
message, it dynamically builds a tree for each node in the cluster; thus giving 
us a unique tree for every node in the cluster (a tree rooted at every cluster 
node). The trees, of course, would be reusable until the cluster configurations 
changes or failures are detected (by the mechanism described in the paper). 
Additionally, Thicket includes a mechanism for load-balancing the trees such 
that nodes spread out the work amongst themselves.



> Gossip 2.0 - broadcast tree for data dissemination
> --
>
> Key: CASSANDRA-12347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12347
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jason Brown
>
> Description: A broadcast tree (spanning tree) allows an originating node to 
> efficiently send out updates to all of the peers in the cluster by 
> constructing a balanced, self-healing tree based upon the view it gets from 
> the peer sampling service (CASSANDRA-12346). 
> I propose we use an algorithm based on the [Thicket 
> paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which 
> describes a dynamic, self-healing broadcast tree. When a given node needs to 
> send out a message, it dynamically builds a tree for each node in the 
> cluster; thus giving us a unique tree for every node in the cluster (a tree 
> rooted at every cluster node). The trees, of course, would be reusable until 
> the cluster configurations changes or failures are detected (by the mechanism 
> described in the paper). Additionally, Thicket includes a mechanism for 
> load-balancing the trees such that nodes spread out the work amongst 
> themselves.
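The "a tree rooted at every cluster node" shape can be sketched as a BFS over each node's partial view; this only illustrates the resulting structure, not Thicket's dynamic construction, load balancing, or repair:

```java
import java.util.*;

public class BroadcastTree {
    // Build a spanning tree rooted at the originating node by BFS over each
    // node's partial view (its peer-sampling neighbors). Returns child -> parent
    // links; the root maps to null.
    static Map<String, String> treeRootedAt(String root, Map<String, List<String>> partialViews) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(root, null);
        queue.add(root);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            for (String peer : partialViews.getOrDefault(node, Collections.emptyList()))
                if (!parent.containsKey(peer)) {
                    parent.put(peer, node);
                    queue.add(peer);
                }
        }
        return parent;
    }

    public static void main(String[] args) {
        Map<String, List<String>> views = new HashMap<>();
        views.put("a", Arrays.asList("b"));
        views.put("b", Arrays.asList("a", "c"));
        views.put("c", Arrays.asList("b"));
        // Rooted at "a": a -> b -> c; rooting at "c" would yield a different tree.
        System.out.println(treeRootedAt("a", views));
    }
}
```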





[jira] [Updated] (CASSANDRA-12345) Gossip 2.0

2016-07-29 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-12345:

Description: 
This is a parent ticket covering changes to the dissemination aspects of the 
current gossip subsystem. (Changes to the actual data being exchanged by the 
current gossip (the cluster metadata) will be handled elsewhere, but the 
current primary ticket covering that work is CASSANDRA-9667.)


This work requires several components, which largely need to completed in this 
order:
- a peer sampling service to create partial cluster views (CASSANDRA-12346). 
This forms the basis of the next two components
- a broadcast tree, which creates dynamic spanning trees given the partial 
views provided by the peer sampling service (CASSANDRA-12347)
- an anti-entropy component, which is similar to the pair-wise exchange and 
reconciliation of the existing gossip implementation (CASSANDRA-???)

These base components (primarily the broadcast and anti-entropy) can allow for 
generic consumers to simply and effectively share a body of data across an 
entire cluster. The most obvious consumer will be a cluster metadata component, 
which can replace the existing gossip system, but also other components like 
CASSANDRA-12106.


  was:
This is a parent ticket covering changes to the dissemination aspects of the 
current gossip subsystem. (Changes to the actual data being exchanged by the 
current gossip (the cluster metadata) will be handled elsewhere, but the 
current primary ticket covering that work is CASSANDRA-9667.)


This work requires several components, which largely need to be completed in this 
order:
- a peer sampling service to create partial cluster views (CASSANDRA-12346). 
This forms the basis of the next two components
- a broadcast tree, which creates dynamic spanning trees given the partial 
views provided by the peer sampling service (CASSANDRA-???)
- an anti-entropy component, which is similar to the pair-wise exchange and 
reconciliation of the existing gossip implementation (CASSANDRA-???)

These base components (primarily the broadcast and anti-entropy) can allow for 
generic consumers to simply and effectively share a body of data across an 
entire cluster. The most obvious consumer will be a cluster metadata component, 
which can replace the existing gossip system, but also other components like 
CASSANDRA-12106.



> Gossip 2.0
> --
>
> Key: CASSANDRA-12345
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12345
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jason Brown
>Assignee: Jason Brown
>  Labels: gossip
>
> This is a parent ticket covering changes to the dissemination aspects of the 
> current gossip subsystem. (Changes to the actual data being exchanged by the 
> current gossip (the cluster metadata) will be handled elsewhere, but the 
> current primary ticket covering that work is CASSANDRA-9667.)
> This work requires several components, which largely need to be completed in 
> this order:
> - a peer sampling service to create partial cluster views (CASSANDRA-12346). 
> This forms the basis of the next two components
> - a broadcast tree, which creates dynamic spanning trees given the partial 
> views provided by the peer sampling service (CASSANDRA-12347)
> - an anti-entropy component, which is similar to the pair-wise exchange and 
> reconciliation of the existing gossip implementation (CASSANDRA-???)
> These base components (primarily the broadcast and anti-entropy) can allow 
> for generic consumers to simply and effectively share a body of data across 
> an entire cluster. The most obvious consumer will be a cluster metadata 
> component, which can replace the existing gossip system, but also other 
> components like CASSANDRA-12106.





[jira] [Created] (CASSANDRA-12347) Gossip 2.0 - broadcast tree for data dissemination

2016-07-29 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-12347:
---

 Summary: Gossip 2.0 - broadcast tree for data dissemination
 Key: CASSANDRA-12347
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12347
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jason Brown


Description: A broadcast tree (spanning tree) allows an originating node to 
efficiently send out updates to all of the peers in the cluster by constructing 
a balanced, self-healing tree based upon the view it gets from the peer 
sampling service (CASSANDRA-12346). 

I propose we use an algorithm based on the [Thicket 
paper|http://www.gsd.inesc-id.pt/%7Ejleitao/pdf/srds10-mario.pdf], which describes a 
dynamic, self-healing broadcast tree. When a given node needs to send out a 
message, it dynamically builds a tree for each node in the cluster; thus giving 
us a unique tree for every node in the cluster (a tree rooted at every cluster 
node). The trees, of course, would be reusable until the cluster configuration 
changes or failures are detected (by the mechanism described in the paper). 
Additionally, Thicket includes a mechanism for load-balancing the trees such 
that nodes spread out the work amongst themselves.
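As an illustrative sketch only (the class and method names below are hypothetical, and Thicket itself adds the load-balancing and tree-repair machinery on top), the per-root tree construction can be pictured as a breadth-first traversal over the overlay defined by each node's partial view:

```java
import java.util.*;

// Hypothetical sketch: derive one spanning tree per root from the partial
// views supplied by the peer sampling service, via breadth-first traversal.
// This is only the core per-root tree idea, not the Thicket protocol.
public class TreePerRoot
{
    /** Returns child -> parent links; a broadcast flows from parent to child. */
    public static Map<String, String> treeFor(String root, Map<String, List<String>> partialViews)
    {
        Map<String, String> parent = new HashMap<>();
        Deque<String> frontier = new ArrayDeque<>();
        parent.put(root, root); // the root is its own parent
        frontier.add(root);
        while (!frontier.isEmpty())
        {
            String node = frontier.poll();
            for (String peer : partialViews.getOrDefault(node, Collections.emptyList()))
            {
                if (!parent.containsKey(peer)) // first discovered path wins, so no cycles
                {
                    parent.put(peer, node);
                    frontier.add(peer);
                }
            }
        }
        return parent;
    }
}
```

Running the same construction with each cluster node as the root yields a different tree over the same overlay, which is what gives every originator its own dissemination tree.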






[jira] [Updated] (CASSANDRA-12345) Gossip 2.0

2016-07-29 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-12345:

Description: 
This is a parent ticket covering changes to the dissemination aspects of the 
current gossip subsystem. (Changes to the actual data being exchanged by the 
current gossip (the cluster metadata) will be handled elsewhere, but the 
current primary ticket covering that work is CASSANDRA-9667.)


This work requires several components, which largely need to be completed in this 
order:
- a peer sampling service to create partial cluster views (CASSANDRA-12346). 
This forms the basis of the next two components
- a broadcast tree, which creates dynamic spanning trees given the partial 
views provided by the peer sampling service (CASSANDRA-???)
- an anti-entropy component, which is similar to the pair-wise exchange and 
reconciliation of the existing gossip implementation (CASSANDRA-???)

These base components (primarily the broadcast and anti-entropy) can allow for 
generic consumers to simply and effectively share a body of data across an 
entire cluster. The most obvious consumer will be a cluster metadata component, 
which can replace the existing gossip system, but also other components like 
CASSANDRA-12106.


  was:
This is a parent ticket covering changes to the dissemination aspects of the 
current gossip subsystem. (Changes to the actual data being exchanged by the 
current gossip (the cluster metadata) will be handled elsewhere, but the 
current primary ticket covering that work is CASSANDRA-9667.)


This work requires several components, which largely need to be completed in this 
order:
- a peer sampling service to create partial cluster views (CASSANDRA-). 
This forms the basis of the next two components
- a broadcast tree, which creates dynamic spanning trees given the partial 
views provided by the peer sampling service (CASSANDRA-???)
- an anti-entropy component, which is similar to the pair-wise exchange and 
reconciliation of the existing gossip implementation (CASSANDRA-???)

These base components (primarily the broadcast and anti-entropy) can allow for 
generic consumers to simply and effectively share a body of data across an 
entire cluster. The most obvious consumer will be a cluster metadata component, 
which can replace the existing gossip system, but also other components like 
CASSANDRA-12106.



> Gossip 2.0
> --
>
> Key: CASSANDRA-12345
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12345
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jason Brown
>Assignee: Jason Brown
>  Labels: gossip
>
> This is a parent ticket covering changes to the dissemination aspects of the 
> current gossip subsystem. (Changes to the actual data being exchanged by the 
> current gossip (the cluster metadata) will be handled elsewhere, but the 
> current primary ticket covering that work is CASSANDRA-9667.)
> This work requires several components, which largely need to be completed in 
> this order:
> - a peer sampling service to create partial cluster views (CASSANDRA-12346). 
> This forms the basis of the next two components
> - a broadcast tree, which creates dynamic spanning trees given the partial 
> views provided by the peer sampling service (CASSANDRA-???)
> - an anti-entropy component, which is similar to the pair-wise exchange and 
> reconciliation of the existing gossip implementation (CASSANDRA-???)
> These base components (primarily the broadcast and anti-entropy) can allow 
> for generic consumers to simply and effectively share a body of data across 
> an entire cluster. The most obvious consumer will be a cluster metadata 
> component, which can replace the existing gossip system, but also other 
> components like CASSANDRA-12106.





[jira] [Reopened] (CASSANDRA-12071) Regression in flushing throughput under load after CASSANDRA-6696

2016-07-29 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg reopened CASSANDRA-12071:


Turns out this is still a problem: because an unbounded LinkedBlockingQueue is 
used, the ThreadPoolExecutor (TPE) will never actually spin up additional 
threads beyond its core size.

You can see that this was necessary for CASSANDRA-2178 as well.
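The queueing behaviour can be demonstrated in isolation (a standalone sketch, not Cassandra code; the class and method names are made up): ThreadPoolExecutor only creates threads beyond corePoolSize when the work queue rejects an offer, and an unbounded LinkedBlockingQueue never rejects, so the pool is pinned at its core size no matter how far behind it falls.

```java
import java.util.concurrent.*;

// Standalone sketch of the behaviour described above: with an unbounded
// LinkedBlockingQueue, every offer() succeeds, so the executor never grows
// past corePoolSize even though maxPoolSize is larger and tasks pile up.
public class TpeQueueDemo
{
    /** Returns { pool size, queued tasks } observed while the pool is saturated. */
    public static int[] run() throws InterruptedException
    {
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(1, 8, 60, TimeUnit.SECONDS,
                                                        new LinkedBlockingQueue<Runnable>());
        CountDownLatch gate = new CountDownLatch(1);
        for (int i = 0; i < 100; i++)
        {
            tpe.execute(() -> {
                try { gate.await(); } catch (InterruptedException e) { }
            });
        }
        // Despite 99 waiting tasks and maxPoolSize = 8, only the single core
        // thread exists, because the unbounded queue accepted every task.
        int[] observed = { tpe.getPoolSize(), tpe.getQueue().size() };
        gate.countDown();
        tpe.shutdown();
        tpe.awaitTermination(10, TimeUnit.SECONDS);
        return observed;
    }

    public static void main(String[] args) throws InterruptedException
    {
        int[] r = run();
        System.out.println("pool size = " + r[0] + ", queued = " + r[1]);
    }
}
```

The usual remedies are a bounded queue (so offers fail and extra workers get created) or simply setting the core size to the intended maximum.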

> Regression in flushing throughput under load after CASSANDRA-6696
> -
>
> Key: CASSANDRA-12071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Ariel Weisberg
>Assignee: Marcus Eriksson
> Fix For: 3.8
>
>
> The way flushing used to work is that a ColumnFamilyStore could have multiple 
> Memtables flushing at once and multiple ColumnFamilyStores could flush at the 
> same time. The way it works now there can be only a single flush of any 
> ColumnFamilyStore & Memtable running in the C* process, and the number of 
> threads applied to that flush is bounded by the number of disks in JBOD.
> This works ok most of the time but occasionally flushing will be a little 
> slower and ingest will outstrip it and then block on available memory. At 
> this point you see several second stalls that cause timeouts.
> This is a problem for reasonable configurations that don't use JBOD but have 
> access to a fast disk that can handle some IO queuing (RAID, SSD).
> You can reproduce on beefy hardware (12 cores 24 threads, 64 gigs of RAM, 
> SSD) if you unthrottle compaction or set it to something like 64 
> megabytes/second and run with 8 compaction threads and stress with the 
> default write workload and a reasonable number of threads. I tested with 96.
> It started happening after about 60 gigabytes of data was loaded.





[jira] [Created] (CASSANDRA-12346) Gossip 2.0 - introduce a Peer Sampling Service for partial cluster views

2016-07-29 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-12346:
---

 Summary: Gossip 2.0 - introduce a Peer Sampling Service for 
partial cluster views
 Key: CASSANDRA-12346
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12346
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jason Brown
Assignee: Jason Brown


A [Peer Sampling 
Service|http://infoscience.epfl.ch/record/83409/files/neg--1184036295all.pdf] is a 
module that provides a partial view of a cluster to dependent modules. A node's 
partial view, combined with all other nodes' partial views, creates a 
fully-connected mesh over the cluster. This way, a given node does not need to 
have direct connections to every other node in the cluster, and can be much 
more efficient in terms of resource management as well as information 
dissemination. Peer Sampling Services by their nature must be self-healing and 
self-balancing to maintain the fully-connected mesh.

I propose we use an algorithm based on 
[HyParView|http://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf], which is a concrete 
algorithm for a Peer Sampling Service. HyParView has a clearly defined 
protocol, and is reasonably simple to implement.
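A minimal sketch of HyParView-style node state (all names and view sizes below are illustrative assumptions, not the paper's exact protocol or constants): each node keeps a small active view that forms the mesh and a larger passive view used to heal it when an active peer fails.

```java
import java.util.*;

// Illustrative sketch of HyParView-style partial views. The active view
// holds the peers a node is directly connected to; the passive view holds
// standby peers. Eviction and promotion here are simplified stand-ins for
// the paper's join/shuffle/repair messages.
public class PeerViews
{
    public final int activeMax, passiveMax;
    public final Set<String> active = new LinkedHashSet<>();
    public final Set<String> passive = new LinkedHashSet<>();
    private final Random rng = new Random(42); // fixed seed for reproducibility

    public PeerViews(int activeMax, int passiveMax)
    {
        this.activeMax = activeMax;
        this.passiveMax = passiveMax;
    }

    public void addToActive(String peer)
    {
        if (active.contains(peer))
            return;
        if (active.size() >= activeMax)
        {
            // make room by demoting a random active peer to the passive view
            List<String> members = new ArrayList<>(active);
            String evicted = members.get(rng.nextInt(members.size()));
            active.remove(evicted);
            addToPassive(evicted);
        }
        active.add(peer);
        passive.remove(peer);
    }

    public void addToPassive(String peer)
    {
        if (active.contains(peer) || passive.contains(peer))
            return;
        if (passive.size() >= passiveMax)
        {
            List<String> members = new ArrayList<>(passive);
            passive.remove(members.get(rng.nextInt(members.size())));
        }
        passive.add(peer);
    }

    /** Self-healing: on failure of an active peer, promote one from the passive view. */
    public void onFailure(String peer)
    {
        active.remove(peer);
        Iterator<String> it = passive.iterator();
        if (it.hasNext())
        {
            active.add(it.next());
            it.remove();
        }
    }
}
```

The key property this models is that the active view stays small and full: membership churn shuffles peers between the two views rather than tearing holes in the mesh.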







[jira] [Commented] (CASSANDRA-12340) dtest failure in upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test

2016-07-29 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400057#comment-15400057
 ] 

Joel Knighton commented on CASSANDRA-12340:
---

This is almost certainly because when we're stopping 2.0.17 for the upgrade, 
the StorageServiceShutdownHook doesn't stop compactions and a compaction is 
attempting to schedule a task on the miscellaneous tasks executor after that 
executor has been stopped. 

This isn't going to get fixed in 2.0 (or likely 2.1 for that matter), so the 
best option is to just ignore this error in the test if possible. 

> dtest failure in 
> upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test
> ---
>
> Key: CASSANDRA-12340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12340
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: DS Test Eng
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/249/testReport/upgrade_supercolumns_test/TestSCUpgrade/upgrade_with_counters_test
> {code}
> Standard Output
> Unexpected error in node3 log, error: 
> ERROR [CompactionExecutor:1] 2016-07-28 15:34:19,533 CassandraDaemon.java 
> (line 191) Exception in thread Thread[CompactionExecutor:1,1,main]
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@5fb8b2bf 
> rejected from 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1ae851ad[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)
>   at 
> org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:65)
>   at 
> org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:976)
>   at 
> org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:383)
>   at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:348)
>   at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:342)
>   at 
> org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:245)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:995)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.replaceCompactedSSTables(CompactionTask.java:270)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
>   at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Created] (CASSANDRA-12345) Gossip 2.0

2016-07-29 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-12345:
---

 Summary: Gossip 2.0
 Key: CASSANDRA-12345
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12345
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jason Brown
Assignee: Jason Brown


This is a parent ticket covering changes to the dissemination aspects of the 
current gossip subsystem. (Changes to the actual data being exchanged by the 
current gossip (the cluster metadata) will be handled elsewhere, but the 
current primary ticket covering that work is CASSANDRA-9667.)


This work requires several components, which largely need to be completed in this 
order:
- a peer sampling service to create partial cluster views (CASSANDRA-). 
This forms the basis of the next two components
- a broadcast tree, which creates dynamic spanning trees given the partial 
views provided by the peer sampling service (CASSANDRA-???)
- an anti-entropy component, which is similar to the pair-wise exchange and 
reconciliation of the existing gossip implementation (CASSANDRA-???)

These base components (primarily the broadcast and anti-entropy) can allow for 
generic consumers to simply and effectively share a body of data across an 
entire cluster. The most obvious consumer will be a cluster metadata component, 
which can replace the existing gossip system, but also other components like 
CASSANDRA-12106.







[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1547#comment-1547
 ] 

Russ Hatch commented on CASSANDRA-12092:


Interestingly both failures shown on this ticket happened at 'key 200', which 
looks to be writing at quorum, reading back at one, with serial unset. For a 
random-looking failure, the same key of 200 is a suspicious value.

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters
> Failed on CassCI build cassandra-2.1_dtest #484
> {code}
> Standard Error
> Traceback (most recent call last):
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run
> valid_fcn(v)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in 
> validate_counters
> check_all_sessions(s, n, c)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in 
> check_all_sessions
> "value of %s at key %d, instead got these values: %s" % (write_nodes, 
> val, n, results)
> AssertionError: Failed to read value from sufficient number of nodes, 
> required 2 nodes to have a counter value of 1 at key 200, instead got these 
> values: [0, 0, 1]
> {code}





[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1533#comment-1533
 ] 

Russ Hatch commented on CASSANDRA-12092:


failure from recent multiplex: 
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/195/testReport/node_1_iter_059.consistency_test/TestAccuracy/test_simple_strategy_counters/

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters
> Failed on CassCI build cassandra-2.1_dtest #484
> {code}
> Standard Error
> Traceback (most recent call last):
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run
> valid_fcn(v)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in 
> validate_counters
> check_all_sessions(s, n, c)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in 
> check_all_sessions
> "value of %s at key %d, instead got these values: %s" % (write_nodes, 
> val, n, results)
> AssertionError: Failed to read value from sufficient number of nodes, 
> required 2 nodes to have a counter value of 1 at key 200, instead got these 
> values: [0, 0, 1]
> {code}





[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1532#comment-1532
 ] 

Russ Hatch commented on CASSANDRA-12092:


1 failure in 200 iterations. Either the test is bad or there's a bug here.

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters
> Failed on CassCI build cassandra-2.1_dtest #484
> {code}
> Standard Error
> Traceback (most recent call last):
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run
> valid_fcn(v)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in 
> validate_counters
> check_all_sessions(s, n, c)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in 
> check_all_sessions
> "value of %s at key %d, instead got these values: %s" % (write_nodes, 
> val, n, results)
> AssertionError: Failed to read value from sufficient number of nodes, 
> required 2 nodes to have a counter value of 1 at key 200, instead got these 
> values: [0, 0, 1]
> {code}





[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-07-29 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399949#comment-15399949
 ] 

Richard Low commented on CASSANDRA-8523:


I'll review the 3.9 version. I'm very much in favour of putting this in 2.2 and 
3.0 as this hurts us badly and no doubt others suffer too.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.





[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-07-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399941#comment-15399941
 ] 

Paulo Motta commented on CASSANDRA-8523:


Thanks for taking a look! Created CASSANDRA-12344 to follow-up with support for 
this when the replacement node has the same address as the original node.

Rebased patch and dtests as well as merged up to 3.0+. All patches and CI 
results available below:

||2.2||3.0||3.9||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.9...pauloricardomg:3.9-8523]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-dtest/lastCompletedBuild/testReport/]|

There were some minor merge conflicts on 3.0, and a slightly larger conflict on 
3.9 due to CASSANDRA-10134, so I did some refactoring in the 3.9+ version to 
move most of the logic to {{prepareForReplacement}}. Can you take another look 
[~rlow]?

Dtest PR created [here|https://github.com/riptano/cassandra-dtest/pull/1155].

While this is marked as an improvement and would theoretically only go to trunk, 
this limitation is pretty counter-intuitive and probably hurts many users in 
the wild, and since the changeset is relatively small and self-contained, I 
think it could be interpreted as a bugfix and perhaps go into 2.2+ or maybe 3.0+. 
WDYT [~brandon.williams] [~jkni] ?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.





[jira] [Commented] (CASSANDRA-12160) dtest failure in counter_tests.TestCounters.upgrade_test

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399932#comment-15399932
 ] 

Russ Hatch commented on CASSANDRA-12160:


single flap in recent history, going to try a multiplex (x100) here: 
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/196/

> dtest failure in counter_tests.TestCounters.upgrade_test
> 
>
> Key: CASSANDRA-12160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12160
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/493/testReport/counter_tests/TestCounters/upgrade_test
> Failed on CassCI build cassandra-2.1_dtest #493
> {code}
> Error Message
> 07 Jul 2016 21:07:28 [node1] Missing: ['127.0.0.2.* now UP']:
> .
> See system.log for remainder
> {code}
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/counter_tests.py", line 101, in 
> upgrade_test
> rolling_restart()
>   File "/home/automaton/cassandra-dtest/counter_tests.py", line 96, in 
> rolling_restart
> nodes[i].start(wait_other_notice=True, wait_for_binary_proto=True)
>   File "/home/automaton/ccm/ccmlib/node.py", line 634, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File "/home/automaton/ccm/ccmlib/node.py", line 481, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File "/home/automaton/ccm/ccmlib/node.py", line 449, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> {code}





[jira] [Assigned] (CASSANDRA-12160) dtest failure in counter_tests.TestCounters.upgrade_test

2016-07-29 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch reassigned CASSANDRA-12160:
--

Assignee: Russ Hatch  (was: DS Test Eng)

> dtest failure in counter_tests.TestCounters.upgrade_test
> 
>
> Key: CASSANDRA-12160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12160
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log
>
>





[jira] [Created] (CASSANDRA-12344) Forward writes to replacement node with same address during replace

2016-07-29 Thread Paulo Motta (JIRA)
Paulo Motta created CASSANDRA-12344:
---

 Summary: Forward writes to replacement node with same address 
during replace
 Key: CASSANDRA-12344
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12344
 Project: Cassandra
  Issue Type: Improvement
  Components: Coordination, Distributed Metadata
Reporter: Paulo Motta


CASSANDRA-8523 added support for forwarding writes to a replacement node via a 
new gossip state {{BOOTSTRAPPING_REPLACE}}.

Currently this is limited to replacement nodes with a different address from 
the original node, because if a replacement node with the same address as a 
normal endpoint joins gossip in a non-dead state, it becomes alive in the 
Failure Detector and reads are forwarded to it before the node is ready to 
serve them.

This ticket is to add support for forwarding writes to replacement nodes with 
the same IP address as the original node.

The initial idea is to allow marking a node as unavailable for reads in 
{{TokenMetadata}}, which will let a replacement node with the same IP join 
gossip without having reads forwarded to it. This will be enabled by 
CASSANDRA-11559.
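The idea above (a node joins gossip and receives forwarded writes while being excluded from the read path) can be sketched roughly as follows. This is an illustrative Python sketch; `ReplacementTracker` and its method names are hypothetical placeholders, not Cassandra's actual {{TokenMetadata}} API:

```python
# Hypothetical sketch of tracking replacement nodes that should receive
# writes but must not serve reads until streaming completes.
class ReplacementTracker:
    def __init__(self):
        self._unavailable_for_reads = set()  # endpoints joining via replace

    def mark_replacing(self, endpoint):
        # Called when a BOOTSTRAPPING_REPLACE gossip state is observed.
        self._unavailable_for_reads.add(endpoint)

    def mark_normal(self, endpoint):
        # Called once the replacement finishes and becomes a normal member.
        self._unavailable_for_reads.discard(endpoint)

    def read_replicas(self, alive_replicas):
        # Writes still go to all alive replicas; reads skip replacing nodes.
        return [e for e in alive_replicas
                if e not in self._unavailable_for_reads]

tracker = ReplacementTracker()
tracker.mark_replacing("127.0.0.1")
print(tracker.read_replicas(["127.0.0.1", "127.0.0.2"]))  # ['127.0.0.2']
tracker.mark_normal("127.0.0.1")
print(tracker.read_replicas(["127.0.0.1", "127.0.0.2"]))
# ['127.0.0.1', '127.0.0.2']
```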





[jira] [Commented] (CASSANDRA-9054) Break DatabaseDescriptor up into multiple classes.

2016-07-29 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399828#comment-15399828
 ] 

Robert Stupp commented on CASSANDRA-9054:
-

Alright - pushed a couple of commits to the branch to address the review 
comments and also fix some things.
utests + dtests look good now. The latest dtest run has 0 errors; the utest 
run had a couple of timeouts (triggered another run).

Worked in all the review comments. Just removed the weird comments in Config + 
DD. Their intention was to warn contributors to be careful not to introduce 
new "magic" class dependencies that start up "everything" by accessing DD.

> Break DatabaseDescriptor up into multiple classes.
> --
>
> Key: CASSANDRA-9054
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9054
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeremiah Jordan
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> Right now to get at Config stuff you go through DatabaseDescriptor.  But when 
> you instantiate DatabaseDescriptor it actually opens system tables and such, 
> which triggers commit log replays, and other things if the right flags aren't 
> set ahead of time.  This makes getting at config stuff from tools annoying, 
> as you have to be very careful about instantiation orders.
> It would be nice if we could break DatabaseDescriptor up into multiple 
> classes, so that getting at config stuff from tools wasn't such a pain.





[jira] [Assigned] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch reassigned CASSANDRA-12092:
--

Assignee: Russ Hatch  (was: DS Test Eng)

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: Russ Hatch
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/484/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters
> Failed on CassCI build cassandra-2.1_dtest #484
> {code}
> Standard Error
> Traceback (most recent call last):
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 514, in run
> valid_fcn(v)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 497, in 
> validate_counters
> check_all_sessions(s, n, c)
>   File "/home/automaton/cassandra-dtest/consistency_test.py", line 490, in 
> check_all_sessions
> "value of %s at key %d, instead got these values: %s" % (write_nodes, 
> val, n, results)
> AssertionError: Failed to read value from sufficient number of nodes, 
> required 2 nodes to have a counter value of 1 at key 200, instead got these 
> values: [0, 0, 1]
> {code}
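The assertion quoted above amounts to counting how many replicas returned the expected counter value and requiring a minimum number of hits. A paraphrased sketch of that check (names only loosely follow the dtest helper):

```python
# Paraphrased sketch of the failing consistency check: a counter read
# succeeds only if at least `required` nodes return the expected value.
def check_counter_reads(values_read, expected, required):
    hits = sum(1 for v in values_read if v == expected)
    assert hits >= required, (
        "Failed to read value from sufficient number of nodes, required "
        "%d nodes to have a counter value of %d, instead got these values: %s"
        % (required, expected, values_read))

check_counter_reads([1, 1, 0], expected=1, required=2)  # passes
# check_counter_reads([0, 0, 1], 1, 2) would raise, matching the failure above
```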





[jira] [Commented] (CASSANDRA-12092) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399825#comment-15399825
 ] 

Russ Hatch commented on CASSANDRA-12092:


Since this is one isolated flap in recent history, testing with multiplex (200 
iterations) here:

http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/195/

> dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
> 
>
> Key: CASSANDRA-12092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12092
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sean McCarthy
>Assignee: DS Test Eng
>  Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
>





[jira] [Comment Edited] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Evan Prothro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399808#comment-15399808
 ] 

Evan Prothro edited comment on CASSANDRA-12336 at 7/29/16 6:23 PM:
---

[~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? 

http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump


was (Author: eprothro):
[~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? 

http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump

It might help if you explain what you are doing and what exception you are 
seeing where and when.

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>
> After being affected by 
> https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace:
> {code}
> WARN  [SharedPool-Worker-2] 2016-07-28 10:51:56,111 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-2,5,main]: {}
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_72]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460)
>  ~[main/:na]
>   at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449)
>  ~[main/:na]
>   ... 5 common frames omitted
> {code}





[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Evan Prothro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399808#comment-15399808
 ] 

Evan Prothro commented on CASSANDRA-12336:
--

[~dannyantonetti] have you tried using {{sstabledump}} to inspect your data? 

http://www.datastax.com/dev/blog/debugging-sstables-in-3-0-with-sstabledump

It might help if you explain what you are doing and what exception you are 
seeing where and when.

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>





[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest

2016-07-29 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399792#comment-15399792
 ] 

sankalp kohli commented on CASSANDRA-7190:
--

Can we commit this to 3.0, or is it too late?

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, lhf
> Fix For: 3.10
>
>
> followup from CASSANDRA-6326





[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Daniel Antonetti (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399763#comment-15399763
 ] 

Daniel Antonetti commented on CASSANDRA-12336:
--

Looking a little bit into this, these seem to be bad rows in our database.
Is there a way to find them (maybe by adding a logging statement on the primary 
key), so that we can identify the records, investigate further, and see how 
many records like this we have?

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>





[jira] [Commented] (CASSANDRA-10643) Implement compaction for a specific token range

2016-07-29 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399751#comment-15399751
 ] 

sankalp kohli commented on CASSANDRA-10643:
---

[~krummas] Can you please review this? We already run this internally on 2.0 and 2.1.

> Implement compaction for a specific token range
> ---
>
> Key: CASSANDRA-10643
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10643
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Vishy Kasar
>Assignee: Vishy Kasar
>  Labels: lcs
> Attachments: 10643-trunk-REV01.txt, 10643-trunk-REV02.txt
>
>
> We see repeated cases in production (using LCS) where small number of users 
> generate a large number repeated updates or tombstones. Reading data of such 
> users brings in large amounts of data in to java process. Apart from the read 
> itself being slow for the user, the excessive GC affects other users as well. 
> Our solution so far is to move from LCS to SCS and back. This takes long and 
> is an over kill if the number of outliers is small. For such cases, we can 
> implement the point compaction of a token range. We make the nodetool compact 
> take a starting and ending token range and compact all the SSTables that fall 
> with in that range. We can refuse to compact if the number of sstables is 
> beyond a max_limit.
> Example: 
> nodetool -st 3948291562518219268 -et 3948291562518219269 compact keyspace 
> table
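The selection step described above, compacting only the sstables whose token span overlaps the requested {{-st}}/{{-et}} range and refusing past a {{max_limit}}, could look roughly like this. This is an illustrative sketch under assumed data shapes, not the attached patch's actual implementation:

```python
# Illustrative sketch: pick sstables whose [first_token, last_token] interval
# overlaps the requested range, refusing when too many would be compacted.
def sstables_for_range(sstables, start_token, end_token, max_limit=32):
    # sstables: list of (name, first_token, last_token) tuples
    selected = [s for s in sstables
                if not (s[2] < start_token or s[1] > end_token)]
    if len(selected) > max_limit:
        raise ValueError("refusing to compact %d sstables (limit %d)"
                         % (len(selected), max_limit))
    return selected

tables = [("a", 0, 100), ("b", 150, 300), ("c", 90, 160)]
print([s[0] for s in sstables_for_range(tables, 95, 140)])  # ['a', 'c']
```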





[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Daniel Antonetti (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399760#comment-15399760
 ] 

Daniel Antonetti commented on CASSANDRA-12336:
--

This patch does seem to fix the issue that we saw before.

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>





[jira] [Assigned] (CASSANDRA-12337) dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub

2016-07-29 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton reassigned CASSANDRA-12337:
-

Assignee: Joel Knighton

> dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub
> --
>
> Key: CASSANDRA-12337
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12337
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Joel Knighton
>  Labels: dtest
>
> We have an existing open ticket for this test in [CASSANDRA-11236]. That 
> ticket is for a Windows failure with a different failure mode. Since the 
> resolution will likely be different, I've opened this ticket to track the 
> most recent failure.
> example failure: 
> [http://cassci.datastax.com/job/cassandra-3.9_dtest/20/testReport/junit/scrub_test/TestScrubIndexes/test_standalone_scrub/]
> {code}
> sstablescrub failed
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-Ilk7GU
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> dtest: DEBUG: Checking sstables in 
> ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
> dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
> dtest: DEBUG: Found sstable file mb-1-big-CRC.db
> dtest: DEBUG: Found sstable file mb-1-big-Filter.db
> dtest: DEBUG: Found sstable file mb-1-big-Summary.db
> dtest: DEBUG: Found sstable file mb-1-big-Data.db
> dtest: DEBUG: Found sstable file mb-1-big-Index.db
> dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
> dtest: DEBUG: Checking sstables in 
> ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
> dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
> dtest: DEBUG: Found sstable file mb-1-big-CRC.db
> dtest: DEBUG: Found sstable file mb-1-big-Filter.db
> dtest: DEBUG: Found sstable file mb-1-big-Summary.db
> dtest: DEBUG: Found sstable file mb-1-big-Data.db
> dtest: DEBUG: Found sstable file mb-1-big-Index.db
> dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
> dtest: DEBUG: Checking sstables in 
> ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
> dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
> dtest: DEBUG: Found sstable file mb-1-big-CRC.db
> dtest: DEBUG: Found sstable file mb-1-big-Filter.db
> dtest: DEBUG: Found sstable file mb-1-big-Summary.db
> dtest: DEBUG: Found sstable file mb-1-big-Data.db
> dtest: DEBUG: Found sstable file mb-1-big-Index.db
> dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
> dtest: DEBUG: Checking sstables in 
> ['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
> dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
> dtest: DEBUG: Found sstable file mb-1-big-CRC.db
> dtest: DEBUG: Found sstable file mb-1-big-Filter.db
> dtest: DEBUG: Found sstable file mb-1-big-Summary.db
> dtest: DEBUG: Found sstable file mb-1-big-Data.db
> dtest: DEBUG: Found sstable file mb-1-big-Index.db
> dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
> dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
> dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
> pre-scrub-1469677957710
> Scrubbing 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db')
>  (0.317KiB)
> Scrub of 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db')
>  complete; looks like all 0 rows were tombstoned
> dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
> dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
> pre-scrub-1469677962057
> Scrubbing 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db')
>  (0.176KiB)
> Scrub of 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db')
>  complete; looks like all 0 rows were tombstoned
> dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
> dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
> pre-scrub-1469677966308
> Scrubbing 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db')
>  (0.178KiB)
> Scrub of 
> BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db')
>  

[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI

2016-07-29 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399643#comment-15399643
 ] 

Russ Hatch commented on CASSANDRA-10848:


I've had similar difficulty trying to get the test error to repro, even with 500 
iterations. But the CI results still stand, so I'm not sure how we can repro and 
fix the test issue.

> Upgrade paging dtests involving deletion flap on CassCI
> ---
>
> Key: CASSANDRA-10848
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: DS Test Eng
>  Labels: dtest
> Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve 
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look 
> at separately. Here are some examples of tests flapping in this way:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12337) dtest failure in scrub_test.TestScrubIndexes.test_standalone_scrub

2016-07-29 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12337:
--
Description: 
We have an existing open ticket for this test in [CASSANDRA-11236]. That ticket 
is for a Windows failure with a different failure mode. Since the resolution 
will likely be different, I've opened this ticket to track the most recent 
failure.

example failure: 

[http://cassci.datastax.com/job/cassandra-3.9_dtest/20/testReport/junit/scrub_test/TestScrubIndexes/test_standalone_scrub/]

{code}
sstablescrub failed
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /tmp/dtest-Ilk7GU
dtest: DEBUG: Done setting configuration options:
{   'initial_token': None,
'num_tokens': '32',
'phi_convict_threshold': 5,
'range_request_timeout_in_ms': 1,
'read_request_timeout_in_ms': 1,
'request_timeout_in_ms': 1,
'truncate_request_timeout_in_ms': 1,
'write_request_timeout_in_ms': 1}
dtest: DEBUG: Checking sstables in 
['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
dtest: DEBUG: Found sstable file mb-1-big-CRC.db
dtest: DEBUG: Found sstable file mb-1-big-Filter.db
dtest: DEBUG: Found sstable file mb-1-big-Summary.db
dtest: DEBUG: Found sstable file mb-1-big-Data.db
dtest: DEBUG: Found sstable file mb-1-big-Index.db
dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
dtest: DEBUG: Checking sstables in 
['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
dtest: DEBUG: Found sstable file mb-1-big-CRC.db
dtest: DEBUG: Found sstable file mb-1-big-Filter.db
dtest: DEBUG: Found sstable file mb-1-big-Summary.db
dtest: DEBUG: Found sstable file mb-1-big-Data.db
dtest: DEBUG: Found sstable file mb-1-big-Index.db
dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
dtest: DEBUG: Checking sstables in 
['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
dtest: DEBUG: Found sstable file mb-1-big-CRC.db
dtest: DEBUG: Found sstable file mb-1-big-Filter.db
dtest: DEBUG: Found sstable file mb-1-big-Summary.db
dtest: DEBUG: Found sstable file mb-1-big-Data.db
dtest: DEBUG: Found sstable file mb-1-big-Index.db
dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
dtest: DEBUG: Checking sstables in 
['/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83']
dtest: DEBUG: Found sstable file mb-1-big-Statistics.db
dtest: DEBUG: Found sstable file mb-1-big-CRC.db
dtest: DEBUG: Found sstable file mb-1-big-Filter.db
dtest: DEBUG: Found sstable file mb-1-big-Summary.db
dtest: DEBUG: Found sstable file mb-1-big-Data.db
dtest: DEBUG: Found sstable file mb-1-big-Index.db
dtest: DEBUG: Found sstable file mb-1-big-TOC.txt
dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
pre-scrub-1469677957710
Scrubbing 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db')
 (0.317KiB)
Scrub of 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/mb-1-big-Data.db')
 complete; looks like all 0 rows were tombstoned

dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
pre-scrub-1469677962057
Scrubbing 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db')
 (0.176KiB)
Scrub of 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.gender_idx/mb-1-big-Data.db')
 complete; looks like all 0 rows were tombstoned

dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
pre-scrub-1469677966308
Scrubbing 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db')
 (0.178KiB)
Scrub of 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.state_idx/mb-1-big-Data.db')
 complete; looks like all 0 rows were tombstoned

dtest: DEBUG: /home/automaton/cassandra/bin/sstablescrub
dtest: DEBUG: Pre-scrub sstables snapshotted into snapshot 
pre-scrub-1469677970549
Scrubbing 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.birth_year_idx/mb-1-big-Data.db')
 (0.189KiB)
Scrub of 
BigTableReader(path='/tmp/dtest-Ilk7GU/test/node1/data0/ks/users-b1956190547611e683f7630547833e83/.birth_year_idx/mb-1-big-Data.db')
 complete; looks like all 0 rows were tombstoned

dtest: DEBUG: ERROR 03:52:50 Error in ThreadPoolExecutor

[jira] [Commented] (CASSANDRA-12151) Audit logging for database activity

2016-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399584#comment-15399584
 ] 

Jonathan Ellis commented on CASSANDRA-12151:


Remember that people almost always use Cassandra to drive applications at 
scale, not to do interactive analytics.  I can't see that logging 100,000 ops 
per second of the same ten queries is going to add much value.  I don't want to 
load that gun for people to blow their feet off with...

Generally auditing is most useful to see "who *changed* what" not "who *asked 
for* what."  (Again, the "who" for most of the latter is going to be "the 
application server.")  And again, it's not super useful to know that the app 
server inserted 10,000 new user accounts today, but it IS useful to know when 
Jonathan added a new column to the users table.  

(I would also include user logins as an interesting event.  This will be 
dominated by app servers still but much much less noise than logging every 
query or update.)

Besides changes over CQL, this could also include JMX changes, although there 
are so many entry points to JMX mbeans that this would be ugly to do by hand.  
Perhaps we could inject this with byteman?

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
> Fix For: 3.x
>
> Attachments: 12151.txt
>
>
> we would like a way to enable cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> it should also be able to log connection attempt and changes to the 
> user/roles.
> I was thinking of making a new keyspace and insert an entry for every 
> activity that occurs.
> Then It would be possible to query for specific activity or a query targeting 
> a specific keyspace and column family.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12343) Make 'static final boolean' easier to optimize for Hotspot

2016-07-29 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-12343:


 Summary: Make 'static final boolean' easier to optimize for Hotspot
 Key: CASSANDRA-12343
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12343
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Trivial
 Fix For: 3.x


Hotspot is able to optimize condition checks on `static final` fields. But the 
compiler can only optimize if the referenced "constant" is the first condition 
to check. (If I understood the optimization in Hotspot correctly.)

I.e. the first two {{if}} blocks can be "eliminated" whereas the third cannot:
{code}
class Foo {
  static final boolean CONST = /* some fragment evaluating to false */;
  
  public void doSomeStuff(boolean param) {

if (!CONST) {
  // this code block can be eliminated
}

if (!CONST && param) {
  // this code block can be eliminated
}

if (param && !CONST) {
  // this code block cannot be eliminated due to some compiler logic
}

  }
}
{code}

Linked patch changes the order in some {{if}} statements and migrates a few 
methods to static final fields.

||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:boolean-hotspot]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-boolean-hotspot-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-boolean-hotspot-dtest/lastSuccessfulBuild/]
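The "migrates a few methods to static final fields" part of the patch can be 
sketched as follows (names here are illustrative, not the actual patched call 
sites): the value is evaluated once at class initialization, so Hotspot can 
treat any condition on it as a constant and eliminate the dead branch.

```java
final class Flags
{
    // Before: a method call the JIT must first inline before it can fold
    // checks on the result.
    static boolean disabledByProperty()
    {
        return Boolean.getBoolean("demo.disabled");
    }

    // After: evaluated once at class initialization; a check such as
    // `if (!DISABLED)` then tests a compile-time constant that Hotspot
    // can eliminate entirely.
    static final boolean DISABLED = Boolean.getBoolean("demo.disabled");
}
```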



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12342) CLibrary improvements

2016-07-29 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-12342:


 Summary: CLibrary improvements
 Key: CASSANDRA-12342
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12342
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor
 Fix For: 3.x


{{CLibrary}} uses {{FBUtilities.getProtectedField}} for each invocation of 
{{getfd}} - i.e. {{Class.getDeclaredField}} + {{Field.setAccessible}}. Linked 
patch migrates these {{Field}} references to static class fields + adds 
constants for the OS. Also adds a tiny optimization for non-linux OSs in 
{{trySync}}.
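The caching pattern the patch applies can be sketched as follows (class and 
field names are illustrative, not the actual {{CLibrary}} code): the 
{{Field}} lookup and {{setAccessible}} call run once in a static initializer 
instead of on every call.

```java
import java.lang.reflect.Field;

// Stand-in for the class whose private field is read reflectively.
class Probe
{
    private final int fd;
    Probe(int fd) { this.fd = fd; }
}

final class FdAccess
{
    // getDeclaredField + setAccessible happen once, at class
    // initialization, instead of on every getfd() invocation.
    private static final Field FD_FIELD;
    static
    {
        try
        {
            FD_FIELD = Probe.class.getDeclaredField("fd");
            FD_FIELD.setAccessible(true);
        }
        catch (NoSuchFieldException e)
        {
            throw new AssertionError(e);
        }
    }

    static int getfd(Probe probe)
    {
        try
        {
            return FD_FIELD.getInt(probe);
        }
        catch (IllegalAccessException e)
        {
            return -1; // fall back to an invalid descriptor
        }
    }
}
```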

||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:CLibrary-opts]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-CLibrary-opts-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-CLibrary-opts-dtest/lastSuccessfulBuild/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11635) test-clientutil-jar unit test fails

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399531#comment-15399531
 ] 

Sylvain Lebresne commented on CASSANDRA-11635:
--

Ok, I got confused because I was running the test as part of the normal {{ant 
test}} target, and was looking at a job failure on CI, but it's actually the 
specific {{ant test-clientutil-jar}} target, which is run by CI but whose 
failure doesn't seem to be reported (as far as I can tell, and unless you 
manually check the console output).

Anyway, the original error message with Sigar actually makes sense: we did make 
sigar a dependency of UUIDGen and forgot to add it to the {{clientutil.jar}}. 
That's still the error with which the test fails on 2.2 and 3.0, and I'm 
attaching below a simple update to fix the test. I didn't bother running CI 
because it only changes the build file, and only the target related to 
{{clientutil.jar}}, which as said above is not even reported by CI. It's easy 
enough to check locally that it fixes the test though.

| [11635-2.2|https://github.com/pcmanus/cassandra/commits/11635-2.2] |
| [11635-3.0|https://github.com/pcmanus/cassandra/commits/11635-3.0] |

It is worth noting that the inclusion of sigar will make the use of 
clientutil.jar a tad more annoying, as you need to provide sigar's native 
binary for your architecture. Or more precisely, you don't _need_ to, but 
you'll get an ugly error message if you don't. I'm not entirely sure 
clientutil.jar is still in use though, and as I said, it's not actually 
mandatory, so I'm not convinced it's a problem, but still mentioning it for 
completeness.

Now, that leaves the issue on trunk (or 3.9 for that matter). And I'm not 
entirely sure why the test complains about {{IPartitioner}}. What I do know is 
that the reason it complains on 3.x and not on 3.0.x is CASSANDRA-12002. For 
some reason, the code that patch added to {{FBUtilities}} triggers the problem 
(the test passes if I revert those changes). Which is weird in the sense that 
those changes only added 2 new methods returning {{IPartitioner}}, and there 
was already one before. I'm not an expert in class loaders though.

Anyway, I'm not sure what the right fix for 3.x is. I could try to randomly 
bend the code in {{FBUtilities}} to make the test happy, or move that code 
somewhere else (even more random), or spend a few hours understanding the 
subtlety of the class loader, but as I said above, I'm not really sure 
clientutil.jar is used anywhere anymore, as the functionality it offers is 
provided by other clients, and those other clients don't use it. So I'm 
seriously wondering if the most pragmatic solution isn't to just stop providing 
that jar in 3.x (and if you really depend on that jar, you're probably better 
off sticking to an old version anyway). Any opinions?


> test-clientutil-jar unit test fails
> ---
>
> Key: CASSANDRA-11635
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11635
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Michael Shuler
>Assignee: Sylvain Lebresne
>  Labels: unittest
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> {noformat}
> test-clientutil-jar:
> [junit] Testsuite: org.apache.cassandra.serializers.ClientUtilsTest
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 0.314 sec
> [junit] 
> [junit] Testcase: test(org.apache.cassandra.serializers.ClientUtilsTest): 
>   Caused an ERROR
> [junit] org/apache/cassandra/utils/SigarLibrary
> [junit] java.lang.NoClassDefFoundError: 
> org/apache/cassandra/utils/SigarLibrary
> [junit] at org.apache.cassandra.utils.UUIDGen.hash(UUIDGen.java:328)
> [junit] at 
> org.apache.cassandra.utils.UUIDGen.makeNode(UUIDGen.java:307)
> [junit] at 
> org.apache.cassandra.utils.UUIDGen.makeClockSeqAndNode(UUIDGen.java:256)
> [junit] at 
> org.apache.cassandra.utils.UUIDGen.(UUIDGen.java:39)
> [junit] at 
> org.apache.cassandra.serializers.ClientUtilsTest.test(ClientUtilsTest.java:56)
> [junit] Caused by: java.lang.ClassNotFoundException: 
> org.apache.cassandra.utils.SigarLibrary
> [junit] at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> [junit] at 
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> [junit] 
> [junit] 
> [junit] Test org.apache.cassandra.serializers.ClientUtilsTest FAILED
> BUILD FAILED
> {noformat}
> I'll see if I can find a spot where this passes, but it appears to have been 
> failing for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest

2016-07-29 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399533#comment-15399533
 ] 

Aleksey Yeschenko commented on CASSANDRA-7190:
--

Alright. Committed as 
[a123e984c3236b2a188411cad5c29f16e662c369|https://github.com/apache/cassandra/commit/a123e984c3236b2a188411cad5c29f16e662c369]
 to trunk. Yay (:

If I come up with another case that shouldn't be represented, I'll file a 
follow-up JIRA. 

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, lhf
> Fix For: 3.10
>
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7190) Add schema to snapshot manifest

2016-07-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7190:
-
   Resolution: Fixed
Fix Version/s: 3.10 (was: 3.x)
       Status: Resolved  (was: Ready to Commit)

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, lhf
> Fix For: 3.10
>
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Add schema to snapshot manifest, add WITH TIMESTAMP to DROP statement

2016-07-29 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk bdaa53de4 -> a123e984c


Add schema to snapshot manifest, add WITH TIMESTAMP to DROP statement

Patch by Alex Petrov; reviewed by Aleksey Yeschenko for CASSANDRA-7190


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a123e984
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a123e984
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a123e984

Branch: refs/heads/trunk
Commit: a123e984c3236b2a188411cad5c29f16e662c369
Parents: bdaa53d
Author: Alex Petrov 
Authored: Wed Apr 20 14:57:52 2016 +0200
Committer: Aleksey Yeschenko 
Committed: Fri Jul 29 16:39:03 2016 +0100

--
 CHANGES.txt |   1 +
 src/antlr/Parser.g  |  10 +-
 .../org/apache/cassandra/config/CFMetaData.java |  12 +-
 .../cql3/statements/AlterTableStatement.java|  26 +-
 .../apache/cassandra/db/ColumnFamilyStore.java  |  22 +
 .../db/ColumnFamilyStoreCQLHelper.java  | 442 
 .../org/apache/cassandra/db/Directories.java|   6 +
 .../unit/org/apache/cassandra/SchemaLoader.java |   5 +
 .../cql3/validation/operations/AlterTest.java   |  70 ++
 .../db/ColumnFamilyStoreCQLHelperTest.java  | 683 +++
 .../schema/LegacySchemaMigratorTest.java|   3 +-
 11 files changed, 1259 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a123e984/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 27655d2..80063c8 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.10
+ * Add schema to snapshot manifest, add USING TIMESTAMP clause to ALTER TABLE 
statements (CASSANDRA-7190)
  * Add beta protocol flag for v5 native protocol (CASSANDRA-12142)
  * Support filtering on non-PRIMARY KEY columns in the CREATE
MATERIALIZED VIEW statement's WHERE clause (CASSANDRA-10368)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a123e984/src/antlr/Parser.g
--
diff --git a/src/antlr/Parser.g b/src/antlr/Parser.g
index f00f9d0..e762bde 100644
--- a/src/antlr/Parser.g
+++ b/src/antlr/Parser.g
@@ -777,22 +777,24 @@ alterTableStatement returns [AlterTableStatement expr]
 TableAttributes attrs = new TableAttributes();
 Map renames = new 
HashMap();
 List colNameList = new 
ArrayList();
+Long deleteTimestamp = null;
 }
 : K_ALTER K_COLUMNFAMILY cf=columnFamilyName
   ( K_ALTER id=cident  K_TYPE v=comparatorType  { type = 
AlterTableStatement.Type.ALTER; } { colNameList.add(new 
AlterTableStatementColumn(id,v)); }
   | K_ADD  ((id=cident   v=comparatorType   b1=cfisStatic { 
colNameList.add(new AlterTableStatementColumn(id,v,b1)); })
  | ('('  id1=cident  v1=comparatorType  b1=cfisStatic { 
colNameList.add(new AlterTableStatementColumn(id1,v1,b1)); }
( ',' idn=cident  vn=comparatorType  bn=cfisStatic { 
colNameList.add(new AlterTableStatementColumn(idn,vn,bn)); } )* ')' ) ) { type 
= AlterTableStatement.Type.ADD; }
-  | K_DROP ( id=cident  { colNameList.add(new 
AlterTableStatementColumn(id)); }
- | ('('  id1=cident { colNameList.add(new 
AlterTableStatementColumn(id1)); }
-   ( ',' idn=cident { colNameList.add(new 
AlterTableStatementColumn(idn)); } )* ')') ) { type = 
AlterTableStatement.Type.DROP; }
+  | K_DROP ( ( id=cident  { colNameList.add(new 
AlterTableStatementColumn(id)); }
+  | ('('  id1=cident { colNameList.add(new 
AlterTableStatementColumn(id1)); }
+( ',' idn=cident { colNameList.add(new 
AlterTableStatementColumn(idn)); } )* ')') )
+ ( K_USING K_TIMESTAMP t=INTEGER { deleteTimestamp = 
Long.parseLong(Constants.Literal.integer($t.text).getText()); })? ) { type = 
AlterTableStatement.Type.DROP; }
   | K_WITH  properties[attrs] { type = 
AlterTableStatement.Type.OPTS; }
   | K_RENAME  { type = 
AlterTableStatement.Type.RENAME; }
id1=cident K_TO toId1=cident { renames.put(id1, toId1); }
( K_AND idn=cident K_TO toIdn=cident { renames.put(idn, toIdn); 
} )*
   )
 {
-$expr = new AlterTableStatement(cf, type, colNameList, attrs, renames);
+$expr = new AlterTableStatement(cf, type, colNameList, attrs, renames, 
deleteTimestamp);
 }
 ;
 


[jira] [Created] (CASSANDRA-12341) dtest failure in hintedhandoff_test.TestHintedHandoffConfig.hintedhandoff_enabled_test

2016-07-29 Thread Sean McCarthy (JIRA)
Sean McCarthy created CASSANDRA-12341:
-

 Summary: dtest failure in 
hintedhandoff_test.TestHintedHandoffConfig.hintedhandoff_enabled_test
 Key: CASSANDRA-12341
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12341
 Project: Cassandra
  Issue Type: Test
Reporter: Sean McCarthy
Assignee: DS Test Eng
 Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
node2_debug.log, node2_gc.log

example failure:

http://cassci.datastax.com/job/trunk_novnode_dtest/440/testReport/hintedhandoff_test/TestHintedHandoffConfig/hintedhandoff_enabled_test

{code}
Error Message

29 Jul 2016 00:56:17 [node1] Missing: ['Finished hinted']:
INFO  [HANDSHAKE-/127.0.0.2] 2016-07-29 00:54:14,4.
See system.log for remainder
{code}

{code}
Stacktrace

  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/automaton/cassandra-dtest/hintedhandoff_test.py", line 125, in 
hintedhandoff_enabled_test
self._do_hinted_handoff(node1, node2, True)
  File "/home/automaton/cassandra-dtest/hintedhandoff_test.py", line 61, in 
_do_hinted_handoff
node1.watch_log_for(["Finished hinted"], from_mark=log_mark, timeout=120)
  File "/home/automaton/ccm/ccmlib/node.py", line 449, in watch_log_for
raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " [" 
+ self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
reads[:50] + ".\nSee {} for remainder".format(filename))
"29 Jul 2016 00:56:17 [node1] Missing: ['Finished hinted']:\nINFO  
[HANDSHAKE-/127.0.0.2] 2016-07-29 00:54:14,4.\nSee system.log for remainder
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12340) dtest failure in upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test

2016-07-29 Thread Sean McCarthy (JIRA)
Sean McCarthy created CASSANDRA-12340:
-

 Summary: dtest failure in 
upgrade_supercolumns_test.TestSCUpgrade.upgrade_with_counters_test
 Key: CASSANDRA-12340
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12340
 Project: Cassandra
  Issue Type: Test
Reporter: Sean McCarthy
Assignee: DS Test Eng
 Attachments: node1.log, node2.log, node3.log

example failure:

http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/249/testReport/upgrade_supercolumns_test/TestSCUpgrade/upgrade_with_counters_test

{code}
Standard Output

Unexpected error in node3 log, error: 
ERROR [CompactionExecutor:1] 2016-07-28 15:34:19,533 CassandraDaemon.java (line 
191) Exception in thread Thread[CompactionExecutor:1,1,main]
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@5fb8b2bf 
rejected from 
org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@1ae851ad[Terminated,
 pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 8]
at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at 
java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
at 
java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
at 
java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)
at 
org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:65)
at 
org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:976)
at 
org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:383)
at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:348)
at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:342)
at 
org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:245)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:995)
at 
org.apache.cassandra.db.compaction.CompactionTask.replaceCompactedSSTables(CompactionTask.java:270)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:230)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test

2016-07-29 Thread Jim Witschey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Witschey resolved CASSANDRA-12339.
--
Resolution: Invalid

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
> --
>
> Key: CASSANDRA-12339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12339
> Project: Cassandra
>  Issue Type: Test
>Reporter: Craig Kodman
>Assignee: DS Test Eng
>  Labels: dtest
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test

2016-07-29 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399457#comment-15399457
 ] 

Jim Witschey commented on CASSANDRA-12339:
--

This failure seems to have happened on an earlier commit than the one that 
closed 
[CASSANDRA-11465|https://issues.apache.org/jira/browse/CASSANDRA-11465?focusedCommentId=1539=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1539].
 Closing.

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
> --
>
> Key: CASSANDRA-12339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12339
> Project: Cassandra
>  Issue Type: Test
>Reporter: Craig Kodman
>Assignee: DS Test Eng
>  Labels: dtest
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399436#comment-15399436
 ] 

Sylvain Lebresne commented on CASSANDRA-12311:
--

I'm wondering if it's worth adding a totally new exception. I agree 
{{ReadFailureException}} is currently a bit too imprecise regarding its 
details, but that's not necessarily only true of 
{{TombstoneOverwhelmingException}}, and the latter is still a read failure. 
I think I'd have a preference for adding a (potentially optional) 
{{cause}} to the existing {{ReadFailureException}}. 
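A minimal sketch of that direction (field and enum names are hypothetical, 
not the committed API): the existing exception carries an optional reason 
describing why the replicas failed, with a default that preserves today's 
behaviour for replicas that don't report a specific cause.

```java
class ReadFailureException extends RuntimeException
{
    // Hypothetical reason codes; UNKNOWN matches today's generic failure.
    enum Reason { UNKNOWN, TOO_MANY_TOMBSTONES }

    final int received;
    final int blockFor;
    final int failures;
    final Reason reason;

    ReadFailureException(int received, int blockFor, int failures, Reason reason)
    {
        // Message mirrors the counts the coordinator already tracks, plus
        // the new reason reported by the failing replicas.
        super(String.format("Read failed - received %d responses of %d required, %d failures (reason: %s)",
                            received, blockFor, failures, reason));
        this.received = received;
        this.blockFor = blockFor;
        this.failures = failures;
        this.reason = reason;
    }
}
```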

> Propagate TombstoneOverwhelmingException to the client
> --
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Geoffrey Yu
>Assignee: Geoffrey Yu
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a 
> {{TombstoneOverwhelmingException}}, it only responds back to the coordinator 
> node with a generic failure. Under this scheme, the coordinator won't be able 
> to know exactly why the request failed and subsequently the client only gets 
> a generic {{ReadFailureException}}. It would be useful to inform the client 
> that their read failed because we read too many tombstones. We should have 
> the data nodes reply with a failure type so the coordinator can pass this 
> information to the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12312) Restore JVM metric export for metric reporters

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12312:
-
Labels: lhf  (was: )

> Restore JVM metric export for metric reporters
> --
>
> Key: CASSANDRA-12312
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12312
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>  Labels: lhf
> Fix For: 2.2.x, 3.0.x, 3.x
>
> Attachments: 12312-2.2.patch, 12312-3.0.patch, 12312-trunk.patch, 
> metrics-jvm-3.1.0.jar.asc
>
>
> JVM instrumentation as part of dropwizard metrics has been moved to a 
> separate {{metrics-jvm}} artifact in metrics-v3.0. After CASSANDRA-5657, no 
> jvm related metrics will be exported to any reporter configured via 
> {{metrics-reporter-config}}, as this isn't part of {{metrics-core}} anymore. 
> As memory and GC stats are essential for monitoring Cassandra, this turns out 
> to be a blocker for us for upgrading to 2.2.
> I've included a patch that would add the now separate {{metrics-jvm}} package 
> and enables some of the provided metrics on startup in case a metrics 
> reporter is used ({{-Dcassandra.metricsReporterConfigFile}}).
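The metrics the patch restores come from {{metrics-jvm}}, which itself wraps 
the platform MXBeans. A dependency-free sketch of the kind of figures involved 
(this is not the patch's code, just an illustration using 
{{java.lang.management}} directly):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class JvmStats
{
    // Current heap usage in bytes, as exposed by the memory MXBean that
    // metrics-jvm's MemoryUsageGaugeSet reads from.
    static long heapUsedBytes()
    {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Total GC count across all collectors, the raw figure behind
    // metrics-jvm's GarbageCollectorMetricSet counters.
    static long totalGcCount()
    {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
            count += Math.max(0, gc.getCollectionCount()); // -1 means unsupported
        return count;
    }
}
```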



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12270) Nodetool toppartitions - add metrics of latency and payload

2016-07-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12270:
--
Component/s: Observability

> Nodetool toppartitions - add metrics of latency and payload
> ---
>
> Key: CASSANDRA-12270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12270
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Rich Rein
>
> It's painful to diagnose complex 300-table clusters in production.
> Extending toppartitions to record based on latency and payload size would 
> greatly simplify this and lower the time, cost, and drama of hot partitions.





[jira] [Updated] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12281:
-
Assignee: Joel Knighton

> Gossip blocks on startup when another node is bootstrapping
> ---
>
> Key: CASSANDRA-12281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Eric Evans
>Assignee: Joel Knighton
>Priority: Minor
> Attachments: restbase1015-a_jstack.txt
>
>
> In our cluster, normal node startup times (after a drain on shutdown) are 
> less than 1 minute.  However, when another node in the cluster is 
> bootstrapping, the same node startup takes nearly 30 minutes to complete, the 
> apparent result of gossip blocking on pending range calculations.
> {noformat}
> $ nodetool-a tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0   1840 0
>  0
> ReadStage 0 0   2350 0
>  0
> RequestResponseStage  0 0 53 0
>  0
> ReadRepairStage   0 0  1 0
>  0
> CounterMutationStage  0 0  0 0
>  0
> HintedHandoff 0 0 44 0
>  0
> MiscStage 0 0  0 0
>  0
> CompactionExecutor3 3395 0
>  0
> MemtableReclaimMemory 0 0 30 0
>  0
> PendingRangeCalculator1 2 29 0
>  0
> GossipStage   1  5602164 0
>  0
> MigrationStage0 0  0 0
>  0
> MemtablePostFlush 0 0111 0
>  0
> ValidationExecutor0 0  0 0
>  0
> Sampler   0 0  0 0
>  0
> MemtableFlushWriter   0 0 30 0
>  0
> InternalResponseStage 0 0  0 0
>  0
> AntiEntropyStage  0 0  0 0
>  0
> CacheCleanupExecutor  0 0  0 0
>  0
> Message type   Dropped
> READ 0
> RANGE_SLICE  0
> _TRACE   0
> MUTATION 0
> COUNTER_MUTATION 0
> REQUEST_RESPONSE 0
> PAGED_RANGE  0
> READ_REPAIR  0
> {noformat}
> A full thread dump is attached, but the relevant bit seems to be here:
> {noformat}
> [ ... ]
> "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x7fe4cd54b000 
> nid=0xea9 waiting on condition [0x7fddcf883000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0004c1e922c0> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174)
>   at 
> org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160)
>   at 
> org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023)
>   at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682)
>   at 
> org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182)
>   at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165)
>   at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128)
>   at 
> org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58)
>   
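The contention shown in the dump above can be sketched in a few lines: while one thread holds the read side of a fair {{ReentrantReadWriteLock}} (as a long pending-range calculation would), another thread's write-lock acquisition parks, which is what the gossip thread hits in {{TokenMetadata.updateNormalTokens}}. The class and method below are illustrative stand-ins, not Cassandra's actual code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GossipLockDemo {
    // Returns true if a writer cannot acquire the lock while a reader holds it,
    // mirroring the fair ReentrantReadWriteLock in the stack trace above.
    static boolean writerBlocksWhileReaderHolds() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair, as in the dump
        CountDownLatch readerHolds = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread reader = new Thread(() -> {
            lock.readLock().lock(); // stand-in for a long pending-range calculation
            readerHolds.countDown();
            try {
                release.await();
            } catch (InterruptedException ignored) {
            } finally {
                lock.readLock().unlock();
            }
        });
        reader.start();
        try {
            readerHolds.await();
            // Stand-in for the gossip thread's write-lock acquisition: it cannot
            // proceed until the reader releases, so this timed attempt fails.
            boolean blocked = !lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
            release.countDown();
            reader.join();
            return blocked;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writerBlocksWhileReaderHolds()); // prints "true"
    }
}
```

With a fair lock, every later reader also queues behind the waiting writer, which is why a single long-held read lock can stall GossipStage for the duration of the computation.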

[jira] [Updated] (CASSANDRA-12279) nodetool repair hangs on non-existent table

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12279:
-
Labels: lhf  (was: )

> nodetool repair hangs on non-existent table
> ---
>
> Key: CASSANDRA-12279
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12279
> Project: Cassandra
>  Issue Type: Bug
> Environment: Linux Ubuntu, Openjdk
>Reporter: Benjamin Roth
>Priority: Minor
>  Labels: lhf
>
> If nodetool repair is called with a table that does not exist, it hangs 
> indefinitely without any error message or logs.
> E.g.
> nodetool repair foo bar
> Keyspace foo exists but table bar does not





[jira] [Updated] (CASSANDRA-12270) Nodetool toppartitions - add metrics of latency and payload

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12270:
-
Issue Type: Improvement  (was: Bug)

> Nodetool toppartitions - add metrics of latency and payload
> ---
>
> Key: CASSANDRA-12270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12270
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Rich Rein
>
> It's painful to diagnose complex 300-table clusters in production.
> Extending toppartitions to record based on latency and payload size would 
> greatly simplify this and lower the time, cost, and drama of hot partitions.





[jira] [Commented] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Evan Prothro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399380#comment-15399380
 ] 

Evan Prothro commented on CASSANDRA-12336:
--

Confirmed that patch 
[12336-3.0|https://github.com/pcmanus/cassandra/commits/12336-3.0] fixes this 
issue for us. Both manual and triggered-from-read compaction work without error.

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>
> After being affected by 
> https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace:
> {code}
> WARN  [SharedPool-Worker-2] 2016-07-28 10:51:56,111 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-2,5,main]: {}
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_72]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460)
>  ~[main/:na]
>   at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449)
>  ~[main/:na]
>   ... 5 common frames omitted
> {code}





[jira] [Updated] (CASSANDRA-12273) Cassandra stress graph: option to create directory for graph if it doesn't exist

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12273:
-
Labels: lhf  (was: )

> Cassandra stress graph: option to create directory for graph if it doesn't exist
> --
>
> Key: CASSANDRA-12273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12273
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Christopher Batey
>Assignee: Christopher Batey
>Priority: Minor
>  Labels: lhf
>
> I am running it in CI with ephemeral workspace / build dirs. It would be 
> nice if cassandra-stress would create the directory so my build tool doesn't have to





[jira] [Updated] (CASSANDRA-7190) Add schema to snapshot manifest

2016-07-29 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-7190:
---
Status: Ready to Commit  (was: Patch Available)

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, lhf
> Fix For: 3.x
>
>
> followup from CASSANDRA-6326





svn commit: r1754523 - /cassandra/site/src/.htaccess

2016-07-29 Thread slebresne
Author: slebresne
Date: Fri Jul 29 13:31:08 2016
New Revision: 1754523

URL: http://svn.apache.org/viewvc?rev=1754523&view=rev
Log:
Fix .htaccess in source too

Modified:
cassandra/site/src/.htaccess

Modified: cassandra/site/src/.htaccess
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/.htaccess?rev=1754523&r1=1754522&r2=1754523&view=diff
==
--- cassandra/site/src/.htaccess (original)
+++ cassandra/site/src/.htaccess Fri Jul 29 13:31:08 2016
@@ -1,3 +1,3 @@
 RewriteEngine On
 
-RewriteRule /doc/ /doc/latest/ [NC, L]
+RewriteRule /doc/ /doc/latest/ [NC,L]




svn commit: r1754519 - /cassandra/site/publish/.htaccess

2016-07-29 Thread slebresne
Author: slebresne
Date: Fri Jul 29 13:21:42 2016
New Revision: 1754519

URL: http://svn.apache.org/viewvc?rev=1754519&view=rev
Log:
Fix .htaccess

Modified:
cassandra/site/publish/.htaccess

Modified: cassandra/site/publish/.htaccess
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/.htaccess?rev=1754519&r1=1754518&r2=1754519&view=diff
==
--- cassandra/site/publish/.htaccess (original)
+++ cassandra/site/publish/.htaccess Fri Jul 29 13:21:42 2016
@@ -1,3 +1,3 @@
 RewriteEngine On
 
-RewriteRule /doc/ /doc/latest/ [NC, L]
+RewriteRule /doc/ /doc/latest/ [NC,L]




[jira] [Resolved] (CASSANDRA-12338) Upgrading 2.1.0 / 2.1.9->3.0.2/ 3.7 failed

2016-07-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-12338.
---
Resolution: Duplicate

This is a duplicate of CASSANDRA-11900, most likely caused by CASSANDRA-11877. 
There is a good chance it has already been fixed by CASSANDRA-12144, which will 
be released soon in 3.0.9/3.9. Is there any way you can build off the latest 
cassandra-3.0 branch HEAD and try starting Cassandra again?

If that doesn't work, feel free to reopen this JIRA. Thanks.

> Upgrading 2.1.0 / 2.1.9->3.0.2/ 3.7 failed
> --
>
> Key: CASSANDRA-12338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12338
> Project: Cassandra
>  Issue Type: Bug
> Environment: Windows / Linux
>Reporter: Samraj
>  Labels: upgrade
>
> I am trying to upgrade Cassandra from 2.1.0 to 3.7. As per the 
> recommendation, I first migrated from 2.1.0 to 2.1.9 and then tried to 
> migrate to 3.0.2 or 3.7. But I am getting the below exception during 
> startup. How can I skip or overcome this?
> INFO  [WrapperSimpleAppMain] 2016-07-29 11:33:36,684 SystemKeyspace.java:1283 
> - Detected version upgrade from 2.1.9 to 3.0.8, snapshotting system keyspace
> WARN  [WrapperSimpleAppMain] 2016-07-29 11:33:38,565 
> CompressionParams.java:382 - The sstable_compression option has been 
> deprecated. You should use class instead
> ERROR [WrapperSimpleAppMain] 2016-07-29 11:33:40,984 CassandraDaemon.java:698 
> - Exception encountered during startup
> java.lang.IllegalStateException: One row required, 2 found
>   at 
> org.apache.cassandra.cql3.UntypedResultSet$FromResultSet.one(UntypedResultSet.java:84)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableTimestamp(LegacySchemaMigrator.java:253)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:243)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]





[jira] [Created] (CASSANDRA-12339) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test

2016-07-29 Thread Craig Kodman (JIRA)
Craig Kodman created CASSANDRA-12339:


 Summary: dtest failure in 
cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
 Key: CASSANDRA-12339
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12339
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng
 Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log

example failure:

http://cassci.datastax.com/job/cassandra-3.9_dtest/21/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test





svn commit: r1754516 - /cassandra/site/publish/index.html

2016-07-29 Thread slebresne
Author: slebresne
Date: Fri Jul 29 13:14:43 2016
New Revision: 1754516

URL: http://svn.apache.org/viewvc?rev=1754516&view=rev
Log:
2nd attempt to waking up svnpubsub

Modified:
cassandra/site/publish/index.html

Modified: cassandra/site/publish/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1754516&r1=1754515&r2=1754516&view=diff
==
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Fri Jul 29 13:14:43 2016
@@ -1,8 +1,5 @@
 
 
-  
-
-
 
 
   




svn commit: r1754509 - /cassandra/site/src/README

2016-07-29 Thread slebresne
Author: slebresne
Date: Fri Jul 29 12:56:02 2016
New Revision: 1754509

URL: http://svn.apache.org/viewvc?rev=1754509&view=rev
Log:
Trying to wake up svnpubsub

Modified:
cassandra/site/src/README

Modified: cassandra/site/src/README
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/README?rev=1754509&r1=1754508&r2=1754509&view=diff
==
--- cassandra/site/src/README (original)
+++ cassandra/site/src/README Fri Jul 29 12:56:02 2016
@@ -51,3 +51,4 @@ The rest of the layout is standard to Je
 * `_sass/` is to `css/` what `_includes` is to `_layout`; it contains sass 
fragments imported by the main css files
   (currently only the pygments theme for syntax highligthing in the 
documentation).
 * `_plugins/` contains a tiny plugin that make it easier to input download 
links in the `download.md` file.
+




[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest

2016-07-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399193#comment-15399193
 ] 

Alex Petrov commented on CASSANDRA-7190:


I've fixed the {{DynamicCompositeType}} support. 

As regards the second clause, it refers to the [python 
driver|https://github.com/datastax/python-driver/blob/master/cassandra/metadata.py#L1061-L1067],
although I could not construct a table that would fail in this case.

bq. some cases of mixed dynamic/static Thrift CFs

I've tested several permutations of dynamic/static thrift CFs that were also 
worked on during [CASSANDRA-10857] and tried constructing other cases, and so 
far could not find other mismatches. It might be that Thrift hides some of 
them, or that they're being dumped without compact storage.

For example: 

{code}
CfDef cfDef = new 
CfDef().setDefault_validation_class(Int32Type.instance.toString())
 
.setKey_validation_class(AsciiType.instance.toString())
 
.setComparator_type(CompositeType.getInstance(AsciiType.instance, 
AsciiType.instance).toString())
 .setColumn_metadata(Arrays.asList(new 
ColumnDef(CompositeType.build(ByteBufferUtil.bytes("col1"), 
ByteBufferUtil.bytes("col1")),

 AsciiType.instance.toString()),
   new 
ColumnDef(CompositeType.build(ByteBufferUtil.bytes("col2"), 
ByteBufferUtil.bytes("col2")),

 AsciiType.instance.toString())
 ))
 .setKeyspace(KEYSPACE)
 .setName(TABLE);
{code}

is represented as 

{code}
CREATE TABLE IF NOT EXISTS thrift_created_table_test_ks.test_table_1 (
key ascii,
column1 ascii,
column2 ascii,
"col1:col1" ascii static,
"col2:col2" ascii static,
value int,
PRIMARY KEY (key, column1, column2))
WITH ID = d1e70820-5581-11e6-9b6d-53f9c6c224e8
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
{code}

And {{isThriftCompatible}} yields {{false}}, since it's not a dense table.

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, lhf
> Fix For: 3.x
>
>
> followup from CASSANDRA-6326





[jira] [Assigned] (CASSANDRA-12335) Super columns are broken after upgrading to 3.0

2016-07-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reassigned CASSANDRA-12335:
-

Assignee: Aleksey Yeschenko

> Super columns are broken after upgrading to 3.0
> ---
>
> Key: CASSANDRA-12335
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12335
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Blocker
> Fix For: 3.0.x, 3.x
>
>
> Super Columns are broken after upgrading to cassandra-3.0 HEAD.  The below 
> script shows this.
> 2.1 cli output for get:
> {code}
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> => (name=name, value=Bob, timestamp=1469724504357000)
> {code}
> cqlsh:
> {code}
> [default@test]
>  key  | blobAsText(column1)
> --+-
>  0x53696d6f6e |attr
>  0x426f62 |attr
> {code}
> 3.0 cli:
> {code}
> [default@unknown] use test;
> unconfigured table schema_columnfamilies
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> null
> [default@test]
> {code}
> cqlsh:
> {code}
>  key  | system.blobastext(column1)
> --+--
>  0x53696d6f6e | \x00\x04attr\x00\x00\x04name\x00
>  0x426f62 | \x00\x04attr\x00\x00\x04name\x00
> {code}
> Run this from a directory with cassandra-3.0 checked out and compiled
> {code}
> ccm create -n 2 -v 2.1.14 testsuper
> echo "### Starting 2.1 ###"
> ccm start
> MYFILE=`mktemp`
> echo "create keyspace test with placement_strategy = 
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = 
> {replication_factor:2};
> use test;
> create column family Sites with column_type = 'Super' and comparator = 
> 'BytesType' and subcomparator='UTF8Type';
> set Sites[utf8('Simon')][utf8('attr')]['name'] = utf8('Simon');
> set Sites[utf8('Bob')][utf8('attr')]['name'] = utf8('Bob');
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7100 flush
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7200 flush
> ccm stop
> # run from cassandra-3.0 checked out and compiled
> ccm setdir
> echo "### Starting Current Directory 
> ###"
> ccm start
> ./bin/nodetool -p 7100 upgradesstables
> ./bin/nodetool -p 7200 upgradesstables
> ./bin/nodetool -p 7100 enablethrift
> ./bin/nodetool -p 7200 enablethrift
> MYFILE=`mktemp`
> echo "use test;
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> {code}





[jira] [Updated] (CASSANDRA-12335) Super columns are broken after upgrading to 3.0

2016-07-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12335:
--
Priority: Major  (was: Blocker)

> Super columns are broken after upgrading to 3.0
> ---
>
> Key: CASSANDRA-12335
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12335
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
> Fix For: 3.0.x, 3.x
>
>
> Super Columns are broken after upgrading to cassandra-3.0 HEAD.  The below 
> script shows this.
> 2.1 cli output for get:
> {code}
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> => (name=name, value=Bob, timestamp=1469724504357000)
> {code}
> cqlsh:
> {code}
> [default@test]
>  key  | blobAsText(column1)
> --+-
>  0x53696d6f6e |attr
>  0x426f62 |attr
> {code}
> 3.0 cli:
> {code}
> [default@unknown] use test;
> unconfigured table schema_columnfamilies
> [default@test] get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;
> null
> [default@test]
> {code}
> cqlsh:
> {code}
>  key  | system.blobastext(column1)
> --+--
>  0x53696d6f6e | \x00\x04attr\x00\x00\x04name\x00
>  0x426f62 | \x00\x04attr\x00\x00\x04name\x00
> {code}
> Run this from a directory with cassandra-3.0 checked out and compiled
> {code}
> ccm create -n 2 -v 2.1.14 testsuper
> echo "### Starting 2.1 ###"
> ccm start
> MYFILE=`mktemp`
> echo "create keyspace test with placement_strategy = 
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = 
> {replication_factor:2};
> use test;
> create column family Sites with column_type = 'Super' and comparator = 
> 'BytesType' and subcomparator='UTF8Type';
> set Sites[utf8('Simon')][utf8('attr')]['name'] = utf8('Simon');
> set Sites[utf8('Bob')][utf8('attr')]['name'] = utf8('Bob');
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7100 flush
> ~/.ccm/repository/2.1.14/bin/nodetool -p 7200 flush
> ccm stop
> # run from cassandra-3.0 checked out and compiled
> ccm setdir
> echo "### Starting Current Directory 
> ###"
> ccm start
> ./bin/nodetool -p 7100 upgradesstables
> ./bin/nodetool -p 7200 upgradesstables
> ./bin/nodetool -p 7100 enablethrift
> ./bin/nodetool -p 7200 enablethrift
> MYFILE=`mktemp`
> echo "use test;
> get Sites[utf8('Bob')][utf8('attr')]['name'] as utf8;" > $MYFILE
> ~/.ccm/repository/2.1.14/bin/cassandra-cli < $MYFILE
> rm $MYFILE
> {code}





[jira] [Commented] (CASSANDRA-12008) Make decommission operations resumable

2016-07-29 Thread Kaide Mu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399170#comment-15399170
 ] 

Kaide Mu commented on CASSANDRA-12008:
--

bq. It seems getStreamedRanges is querying the AVAILABLE_RANGES table instead 
of STREAMED_RANGES, that's why it is generating the Undefined column name 
operation error.

Unbelievable, but yes, that was the error; it's already fixed, thanks!

bq. Maybe it's not working because of the previous error? Perhaps it would help 
to add a unit test in StreamStateStoreTest to verify that updateStreamedRanges 
and getStreamedRanges are being populated correctly and working as expected. You 
can also add debug logs to troubleshoot.


Another stupid error: I wasn't adding {{StreamTransferTask}} to 
{{SessionCompleteEvent}}; fixed.

bq. SystemKeyspace.getStreamedRanges is being called from inside a for-loop, 
which may be inefficient; it may be better to retrieve it beforehand and re-use 
it inside the loop.

I've added a new strategy; please let me know what you think about it.
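The hoisting suggested above can be sketched as follows. The shapes here (a map of streamed ranges keyed by peer, string ranges) are hypothetical stand-ins for illustration, not the actual {{SystemKeyspace}} API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StreamedRangesDemo {
    static int fetches = 0; // counts how many times the "table" is queried

    // Hypothetical stand-in for SystemKeyspace.getStreamedRanges(): one query
    // returning all previously streamed ranges, keyed by peer address.
    static Map<String, Set<String>> getStreamedRanges() {
        fetches++;
        Map<String, Set<String>> m = new HashMap<>();
        m.put("/127.0.0.3", new HashSet<>(Arrays.asList("(30,-92]")));
        return m;
    }

    // Hoisted version: query once before the loop, then consult the cached
    // result per range instead of re-querying inside the for-loop.
    static List<String> rangesToStream(Map<String, List<String>> rangesByPeer) {
        Map<String, Set<String>> streamed = getStreamedRanges(); // single query
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : rangesByPeer.entrySet())
            for (String range : e.getValue())
                if (!streamed.getOrDefault(e.getKey(), Collections.emptySet()).contains(range))
                    pending.add(range); // only not-yet-streamed ranges remain
        return pending;
    }

    public static void main(String[] args) {
        Map<String, List<String>> work = new HashMap<>();
        work.put("/127.0.0.3", Arrays.asList("(30,-92]", "(1,2]"));
        System.out.println(rangesToStream(work)); // the already-streamed range is skipped
        System.out.println(fetches == 1);         // prints "true": one query, not one per range
    }
}
```

The point is simply that the number of queries becomes constant instead of proportional to the number of ranges, which matters when resuming a decommission over many ranges.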

Some additional modifications: we are no longer passing the description to the 
{{StreamTransferTask}} constructor. If we did, it would raise an error, because 
{{StreamResultFuture}} is not yet initialized when the task is created, so 
{{StreamSession.description()}} would return null at creation time. Instead we 
obtain the {{StreamSession}} from {{StreamTransferTask.getSession()}} when each 
{{StreamTransferTask}} completes, i.e. when {{StreamStateStore.handleStreamEvent}} 
is invoked. This means we now only pass the keyspace the task is responsible for.

A minor detail: I don't know whether there's a problem with the current 
implementation or something odd in the set-up, but it skips the same range 
twice:
{quote}
DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,301 
StorageService.java:4556 - Range (3074457345618258602,-9223372036854775808] 
already in /127.0.0.3, skipping
DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,301 
StorageService.java:4556 - Range (3074457345618258602,-9223372036854775808] 
already in /127.0.0.3, skipping
{quote}
I think it's the set-up itself, since 
{{StorageService.getChangedRangesForLeaving}} also returns the same range 
twice:
{quote}
DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,289 
StorageService.java:2526 - Range (3074457345618258602,-9223372036854775808] 
will be responsibility of /127.0.0.3
DEBUG [RMI TCP Connection(9)-127.0.0.1] 2016-07-29 12:48:36,294 
StorageService.java:2526 - Range (3074457345618258602,-9223372036854775808] 
will be responsibility of /127.0.0.3
{quote}

You can find the latest working patch at: 
https://github.com/apache/cassandra/compare/trunk...kdmu:trunk-12008?expand=1


> Make decommission operations resumable
> --
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Tom van der Woerdt
>Assignee: Kaide Mu
>Priority: Minor
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at 
> 

[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399144#comment-15399144
 ] 

Sylvain Lebresne commented on CASSANDRA-11990:
--

I'm not familiar enough with the SASI code at this point to have a strong 
opinion on the specifics of the best implementation choices. But it does sound 
like supporting other partitioners is fairly orthogonal to the original ticket's 
intent, so it should likely be left to a follow-up ticket (if only so we can 
focus on proper testing separately). And in general, the more incrementally we 
can do things, the better.


> Address rows rather than partitions in SASI
> ---
>
> Key: CASSANDRA-11990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the 
> partition. After the partition lookup, the rows are iterated and the 
> operators are applied in order to filter out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different 
> partitioners, collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI

2016-07-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399120#comment-15399120
 ] 

Alex Petrov commented on CASSANDRA-11990:
-

During several discussions it's been proposed to evaluate support for 
different partitioners, since it'd help with wider SASI adoption and remove the 
current limitation to Long tokens. I've evaluated it and can conclude that 
support for constant-size tokens can be included in this patch without large 
overhead. The patch was adjusted accordingly. There are still several failing 
tests, although they'll be fixed shortly. 

Support for variable-size tokens (for partitioners such as 
{{ByteOrderedPartitioner}}) requires a much larger time investment. My personal 
suggestion is to encode them with their size and avoid on-disk format changes. 
This will result in a more complex iteration process for variable-size tokens, 
since we'll have to skip tokens depending on their size and won't be able to 
use simple multiplication for offset calculation. I've made a small patch / 
proof of concept for variable-size tokens by adding a {{serializedSize}} method 
to the token tree nodes; currently (for the sake of the POC and to save some 
time) it reuses the {{serialize}} function with a throwaway byte buffer, and 
calculates offsets by iterating and reading the size-prefix integers. It worked 
just fine for simple cases. I'll mention that the SASI code is written very 
well and the offset calculation methods are very well isolated. 

That said, I'd suggest leaving the "algorithmic" heavy lifting 
(variable-size token offset calculation) to a separate ticket to reduce the 
scope of the current one. Since it won't require on-disk format changes, we 
can safely postpone this work. 


Another thing that's been mentioned is to include the column offset in the 
clustering offset long. I'll be evaluating this proposal in terms of 
performance today. It seems we can avoid increasing the size of the {{long[]}} 
array that holds offsets, and this change can help avoid post-filtering 
altogether. An additional optimisation (which, once again, could be left for a 
follow-up patch) is to avoid the second seek within the data file when we are 
only querying columns that are indexed. This could be a significant 
performance improvement, although it'd be good to discuss whether such queries 
are widely used.

cc [~slebresne] [~iamaleksey] [~jbellis] [~beobal]


> Address rows rather than partitions in SASI
> ---
>
> Key: CASSANDRA-11990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the 
> partition. After the partition lookup, the rows are iterated and the 
> operators are applied in order to filter out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different 
> partitioners, collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-12336:
-
Reviewer: Carl Yeksigian
  Status: Patch Available  (was: Open)

The patch for CASSANDRA-11988 wasn't thorough enough. While the iterator won't 
return {{null}} static rows, we may still have {{BaseRows.staticRow == null}} 
after a transformation, and since transformations are "chained", we can end up 
passing a {{null}} to an {{applyToStatic()}} call, which shouldn't really be 
done. Anyway, it's pretty easy to fix by guarding the call to 
{{applyToStatic()}} (much like the call to {{applyToRow()}} already is, for 
that matter).
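The shape of such a guard can be sketched as follows, using invented stand-in 
types rather than Cassandra's actual {{Transformation}} and {{Row}} classes:

```java
// Minimal sketch of a null guard around a chained static-row callback.
// Row and Transformation here are invented stand-ins for the sketch.
public class StaticGuard
{
    interface Row {}

    interface Transformation
    {
        Row applyToStatic(Row staticRow);
    }

    // Chain-level guard: skip the callback entirely when the static row is
    // null, so individual transformations never see a null input.
    public static Row applyGuarded(Row staticRow, Transformation t)
    {
        return staticRow == null ? null : t.applyToStatic(staticRow);
    }

    public static void main(String[] args)
    {
        Transformation identity = r -> r;
        // Without the guard, implementations that dereference the row would
        // throw a NullPointerException; with it, null simply flows through.
        System.out.println(applyGuarded(null, identity));   // null
        Row r = new Row() {};
        System.out.println(applyGuarded(r, identity) == r); // true
    }
}
```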

I was surprised the test added by {{CASSANDRA-11988}} didn't catch it, but it 
appears to be a timing issue: even though the test sets {{gc_grace == 0}}, it 
still needs to wait at least 1 second to make sure stuff gets purged, and I 
can reproduce the failure consistently with that additional wait (included in 
the patch below).

| [12336-3.0|https://github.com/pcmanus/cassandra/commits/12336-3.0] | 
[utests|http://cassci.datastax.com/job/pcmanus-12336-3.0-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-12336-3.0-dtest] |
| [12336-3.9|https://github.com/pcmanus/cassandra/commits/12336-3.9] | 
[utests|http://cassci.datastax.com/job/pcmanus-12336-3.9-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-12336-3.9-dtest] |

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>
> After being affected by 
> https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace:
> {code}
> WARN  [SharedPool-Worker-2] 2016-07-28 10:51:56,111 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-2,5,main]: {}
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_72]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460)
>  ~[main/:na]
>   at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796)
>  ~[main/:na]
>   at 
> 

[jira] [Commented] (CASSANDRA-11960) Hints are not seekable

2016-07-29 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399102#comment-15399102
 ] 

Branimir Lambov commented on CASSANDRA-11960:
-

I pushed an update to your branch 
[here|https://github.com/blambov/cassandra/tree/spodkowinski/WIP-11960] which 
replaces the single {{long}} position we stored with a mark object that 
includes compressed and uncompressed positions and can be used to seek quickly 
and accurately. This should be able to solve the problem for all hint file 
variations (plain, compressed, encrypted); we must test all of them though.

I tried to go back to the old code in {{HintsDispatcher}} but had to turn off 
{{RETRY}} to make the tests work. I must admit I haven't yet looked closely at 
all the changes you made and don't understand why this should be necessary.


> Hints are not seekable
> --
>
> Key: CASSANDRA-11960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11960
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Stefan Podkowinski
>
> Got the following error message on trunk. No idea how to reproduce. But the 
> only thing the (not overridden) seek method does is throwing this exception.
> {code}
> ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - 
> Exception in thread Thread[HintsDispatcher:2,1,main]
> java.lang.UnsupportedOperationException: Hints are not seekable.
>   at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12215) NullPointerException during Compaction

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399099#comment-15399099
 ] 

Sylvain Lebresne commented on CASSANDRA-12215:
--

It's the same trace as in CASSANDRA-12336, so let's follow up there (but I 
have all the information I need to fix the problem now).

> NullPointerException during Compaction
> --
>
> Key: CASSANDRA-12215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12215
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.8, cqlsh 5.0.1
>Reporter: Hau Phan
> Fix For: 3.0.x
>
>
> Running 3.0.8 on a single standalone node with cqlsh 5.0.1, the keyspace RF = 
> 1 and class SimpleStrategy.  
> Attempting to run a 'select * from ' and receiving this error:
> ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation 
> failed - received 0 responses and 1 failures" info={'failures': 1, 
> 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> Cassandra system.log prints this:
> {code}
> ERROR [CompactionExecutor:5] 2016-07-15 13:42:13,219 CassandraDaemon.java:201 
> - Exception in thread Thread[CompactionExecutor:5,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:58)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263)
>  ~[apache-cassandra-3.0.8.jar:3.0.8]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_65]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_65]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_65]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_65]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
> {code}
> Doing a sstabledump -d shows a few rows with the column value of 
> "", telling me compaction doesn't seem to be working correctly.  
> # nodetool compactionstats 
> pending tasks: 1
> attempting to run a compaction gets:
> # nodetool compact  
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:58)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:177)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:606)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> 

[jira] [Assigned] (CASSANDRA-12336) NullPointerException during compaction on table with static columns

2016-07-29 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-12336:


Assignee: Sylvain Lebresne

> NullPointerException during compaction on table with static columns
> ---
>
> Key: CASSANDRA-12336
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12336
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: cqlsh 5.0.1
> Cassandra 3.0.8-SNAPSHOT (3.0.x dev - a5cbb0)
>Reporter: Evan Prothro
>Assignee: Sylvain Lebresne
> Fix For: 3.0.9
>
>
> After being affected by 
> https://issues.apache.org/jira/browse/CASSANDRA-11988, we built a5cbb0. 
> Compaction still fails with the following trace:
> {code}
> WARN  [SharedPool-Worker-2] 2016-07-28 10:51:56,111 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-2,5,main]: {}
> java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_72]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[main/:na]
>   at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [main/:na]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [main/:na]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToRow(ReadCommand.java:466)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToStatic(ReadCommand.java:460)
>  ~[main/:na]
>   at org.apache.cassandra.db.transform.BaseRows.add(BaseRows.java:105) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.add(UnfilteredRows.java:41) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.add(Transformation.java:156) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.Transformation.apply(Transformation.java:122)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:454)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand$1MetricRecording.applyToPartition(ReadCommand.java:438)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:145)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:138)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:134)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:320) 
> ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449)
>  ~[main/:na]
>   ... 5 common frames omitted
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12282) SSTablesIteratedTest.testDeletionOnIndexedSSTableASC-compression failure

2016-07-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399098#comment-15399098
 ] 

Stefania commented on CASSANDRA-12282:
--

When this test fails it's because there is an unexpected sstable, and this is 
caused by a flush operation that is triggered by a schema change. The clean-up 
tasks in CQLTester.afterTest() are causing this schema change, and they are 
currently asynchronous:

{code}
DEBUG [OptionalTasks:1] 2016-07-29 09:52:30,992 
java.lang.Thread.getStackTrace(Thread.java:1552)
org.apache.cassandra.db.ColumnFamilyStore.getCurrentStackTrace(ColumnFamilyStore.java:866)
org.apache.cassandra.db.ColumnFamilyStore.logFlush(ColumnFamilyStore.java:896)
org.apache.cassandra.db.ColumnFamilyStore.switchMemtable(ColumnFamilyStore.java:854)
org.apache.cassandra.db.ColumnFamilyStore.switchMemtableIfCurrent(ColumnFamilyStore.java:838)
org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:921)
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.flushDataFrom(AbstractCommitLogSegmentManager.java:452)
org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:314)
org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:220)
org.apache.cassandra.config.Schema.dropTable(Schema.java:692)
org.apache.cassandra.schema.SchemaKeyspace.lambda$updateKeyspace$376(SchemaKeyspace.java:1343)
org.apache.cassandra.schema.SchemaKeyspace$$Lambda$162/1250499735.accept(Unknown
 Source)
java.util.HashMap$Values.forEach(HashMap.java:972)
java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1343)
org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1313)
org.apache.cassandra.service.MigrationManager.announce(MigrationManager.java:512)
org.apache.cassandra.service.MigrationManager.announceColumnFamilyDrop(MigrationManager.java:466)
org.apache.cassandra.cql3.statements.DropTableStatement.announceMigration(DropTableStatement.java:93)
org.apache.cassandra.cql3.statements.SchemaAlteringStatement.executeInternal(SchemaAlteringStatement.java:120)
org.apache.cassandra.cql3.CQLTester.schemaChange(CQLTester.java:669)
org.apache.cassandra.cql3.CQLTester$2.run(CQLTester.java:294)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
{code}

Note the trace at CQLTester.java line 294.

The cleanup operations need to be asynchronous to reduce the runtime of CQL 
tests, see CASSANDRA-7327. However, this specific test doesn't need to drop all 
previous tables every time a single test is run. So I'm thinking of adding an 
opt-out mechanism to the cleanup done after each test, in which case we would 
only clean up after the entire test suite has executed.
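A toy sketch of that opt-out idea, with invented names (CQLTester's real hooks 
and lifecycle differ):

```java
// Hypothetical opt-out sketch, invented names only: tests whose assertions
// depend on sstable counts can skip per-test cleanup (avoiding the
// schema-change-triggered flush) and rely on one cleanup after the suite.
public class CleanupPolicy
{
    private boolean perTestCleanup = true;
    private int cleanups = 0;

    public void disablePerTestCleanup() { perTestCleanup = false; }

    // Would drop the tables created by the test just run.
    public void afterTest()
    {
        if (perTestCleanup)
            cleanups++;
    }

    // Always cleans up once the whole test class has executed.
    public void afterClass()
    {
        cleanups++;
    }

    public int cleanupsRun() { return cleanups; }

    public static void main(String[] args)
    {
        CleanupPolicy p = new CleanupPolicy();
        p.disablePerTestCleanup();
        p.afterTest();   // no-op: opted out, no mid-suite schema change
        p.afterTest();
        p.afterClass();  // single cleanup for the whole suite
        System.out.println(p.cleanupsRun()); // 1
    }
}
```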

> SSTablesIteratedTest.testDeletionOnIndexedSSTableASC-compression failure
> 
>
> Key: CASSANDRA-12282
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12282
> Project: Cassandra
>  Issue Type: Test
>Reporter: Joshua McKenzie
>Assignee: Stefania
>  Labels: unittest
>
> Error Message
> expected:<3> but was:<4>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<3> but was:<4>
>   at 
> org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.executeAndCheck(SSTablesIteratedTest.java:45)
>   at 
> org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.testDeletionOnIndexedSSTableASC(SSTablesIteratedTest.java:348)
>   at 
> org.apache.cassandra.cql3.validation.miscellaneous.SSTablesIteratedTest.testDeletionOnIndexedSSTableASC(SSTablesIteratedTest.java:312)
> [Failure|http://cassci.datastax.com/job/cassandra-3.9_testall/lastCompletedBuild/testReport/org.apache.cassandra.cql3.validation.miscellaneous/SSTablesIteratedTest/testDeletionOnIndexedSSTableASC_compression/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12236) RTE from new CDC column breaks in flight queries.

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399095#comment-15399095
 ] 

Sylvain Lebresne commented on CASSANDRA-12236:
--

The upgrade test has finished, but something went wrong. Looking at the console 
output, at least some tests seem to have run successfully, but there is some 
timeout during the "POST BUILD TASK". I'm not entirely sure what to make of 
that, so I'll restart the job in case it was a temporary env issue.

> RTE from new CDC column breaks in flight queries.
> -
>
> Key: CASSANDRA-12236
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12236
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Sylvain Lebresne
> Fix For: 3.x
>
> Attachments: 12236.txt
>
>
> This RTE is not harmless. It will cause the internode connection to break 
> which will cause all in flight requests between these nodes to die/timeout.
> {noformat}
> - Due to changes in schema migration handling and the storage format 
> after 3.0, you will
>   see error messages such as:
>  "java.lang.RuntimeException: Unknown column cdc during 
> deserialization"
>   in your system logs on a mixed-version cluster during upgrades. This 
> error message
>   is harmless and due to the 3.8 nodes having cdc added to their schema 
> tables while
>   the <3.8 nodes do not. This message should cease once all nodes are 
> upgraded to 3.8.
>   As always, refrain from schema changes during cluster upgrades.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


svn commit: r1754493 - in /cassandra/site: publish/ publish/community/ publish/css/ publish/doc/ publish/doc/3.7/ publish/doc/3.7/architecture/ publish/doc/3.7/configuration/ publish/doc/3.7/cql/ publ

2016-07-29 Thread slebresne
Author: slebresne
Date: Fri Jul 29 09:59:38 2016
New Revision: 1754493

URL: http://svn.apache.org/viewvc?rev=1754493=rev
Log:
New website (that includes new documentation)


[This commit notification would consist of 78 parts, 
which exceeds the limit of 50 ones, so it was shortened to the summary.]


[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2016-07-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398930#comment-15398930
 ] 

Stefania commented on CASSANDRA-9318:
-

It's in pretty good shape now! :)

Let's rebase and start the testing.

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Sergio Bossa
> Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
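The high/low watermark scheme described in the quoted ticket can be sketched 
minimally as follows; the names are invented and a boolean stands in for 
actually toggling reads on client connections:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy high/low watermark gate over in-flight request bytes; invented names,
// not Cassandra's actual implementation.
public class InflightGate
{
    private final long high, low;
    private final AtomicLong inflight = new AtomicLong();
    private volatile boolean readsPaused;

    public InflightGate(long highWatermark, long lowWatermark)
    {
        this.high = highWatermark;
        this.low = lowWatermark;
    }

    // Called when a request is admitted; pause client reads above 'high'.
    public void onRequestStart(long bytes)
    {
        if (inflight.addAndGet(bytes) >= high)
            readsPaused = true;   // stand-in for disabling channel reads
    }

    // Called on completion; resume reads once we drop below 'low'.
    public void onRequestEnd(long bytes)
    {
        if (inflight.addAndGet(-bytes) <= low)
            readsPaused = false;  // stand-in for re-enabling channel reads
    }

    public boolean readsPaused() { return readsPaused; }

    public static void main(String[] args)
    {
        InflightGate gate = new InflightGate(100, 50);
        gate.onRequestStart(60);
        gate.onRequestStart(60);                // 120 >= 100 -> pause
        System.out.println(gate.readsPaused()); // true
        gate.onRequestEnd(60);                  // 60 > 50 -> still paused
        System.out.println(gate.readsPaused()); // true
        gate.onRequestEnd(60);                  // 0 <= 50 -> resume
        System.out.println(gate.readsPaused()); // false
    }
}
```

Separating the high and low watermarks gives hysteresis, so reads don't 
flap on and off around a single threshold.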



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12236) RTE from new CDC column breaks in flight queries.

2016-07-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398914#comment-15398914
 ] 

Sylvain Lebresne commented on CASSANDRA-12236:
--

I fixed the unit test failures (mainly due to various minor errors in the 
changes to {{RowUpdateBuilder}} I made in the tests) in a new commit. There 
are 2 dtest failures for cqlsh DESCRIBE, but that's just because, as a 
consequence of this patch, the {{cdc}} table property won't be displayed by 
DESCRIBE by default, which is imo fine. I have a trivial fix for that locally 
that I'll push on commit. The rest of the dtest failures seem unrelated.

I seem to have been able to start upgrade tests on the last branch, which I 
include below, but they hadn't finished at the time of this writing, so we'll 
see the results.

| [12236-trunk|https://github.com/pcmanus/cassandra/commits/12236-trunk] | 
[utests|http://cassci.datastax.com/job/pcmanus-12236-trunk-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-12236-trunk-dtest] | [upgrade 
tests|http://cassci.datastax.com/view/Dev/view/pcmanus/job/pcmanus-upgrade_12236-upgrade/]
 |


> RTE from new CDC column breaks in flight queries.
> -
>
> Key: CASSANDRA-12236
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12236
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Sylvain Lebresne
> Fix For: 3.x
>
> Attachments: 12236.txt
>
>
> This RTE is not harmless. It will cause the internode connection to break 
> which will cause all in flight requests between these nodes to die/timeout.
> {noformat}
> - Due to changes in schema migration handling and the storage format 
> after 3.0, you will
>   see error messages such as:
>  "java.lang.RuntimeException: Unknown column cdc during 
> deserialization"
>   in your system logs on a mixed-version cluster during upgrades. This 
> error message
>   is harmless and due to the 3.8 nodes having cdc added to their schema 
> tables while
>   the <3.8 nodes do not. This message should cease once all nodes are 
> upgraded to 3.8.
>   As always, refrain from schema changes during cluster upgrades.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11465) dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test

2016-07-29 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-11465:
-
Component/s: Observability

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
> --
>
> Key: CASSANDRA-11465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11465
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
>Reporter: Philip Thompson
>Assignee: Stefania
>  Labels: dtest
> Fix For: 2.2.8, 3.0.9, 3.9
>
>
> Failing on the following assert, on trunk only: 
> {{self.assertEqual(len(errs[0]), 1)}}
> Is not failing consistently.
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1087/testReport/cql_tracing_test/TestCqlTracing/tracing_unknown_impl_test
> Failed on CassCI build trunk_dtest #1087



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

