[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8819:
---
Description:

We have two DCs, each with 7 nodes. Here is the keyspace setup:

{code}
create keyspace test
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 3, DC1 : 3}
  and durable_writes = true;
{code}

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java driver), but we see these errors in the log:

{noformat}
Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)
{noformat}

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

{noformat}
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9
{noformat}

I did a cql trace on the query and here is the trace, and it does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873" at the end. I guess that is where the client gets the error from. But the row was inserted into Cassandra correctly. I also traced a read with LOCAL_QUORUM and it behaves correctly; the reads don't go to DC2. The problem is only with writes at LOCAL_QUORUM.

{code}
Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

 activity                                    | timestamp    | source   | source_elapsed
---------------------------------------------+--------------+----------+----------------
 execute_cql3_query                          | 17:27:50,828 | 10.1.0.1 |              0
 Parsing insert into test (user_id, created,
   event_data, event_id) values (123456789,
   9eab8950-b70c-11e4-8fca-99bff9c19891,
   'test', '16');                            | 17:27:50,828 | 10.1.0.1 |             39
 Preparing statement                         | 17:27:50,828 | 10.1.0.1 |            135
 Message received from /10.1.0.1             | 17:27:50,829 | 10.1.0.5 |             25
 Sending message to /10.1.0.5                | 17:27:50,829 | 10.1.0.1 |            421
 Executing single-partition query on users   | 17:27:50,829 | 10.1.0.5 |            177
 Acquiring sstable references                | 17:27:50,829 | 10.1.0.5 |            191
 Merging memtable tombstones                 | 17:27:50,830 | 10.1.0.5 |            208
 Message received from /10.1.0.5             | 17:27:50,830 | 10.1.0.1 |           1461
{code}
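The arithmetic behind the reported numbers may explain part of the confusion: with NetworkTopologyStrategy and RF 3 in each of the two DCs, a LOCAL_QUORUM write should block for 2 replicas, while a cross-DC QUORUM would block for 4, which matches the "4 replica were required" in the error. A minimal sketch of that calculation (plain Python, not Cassandra's actual code):

```python
def quorum(rf: int) -> int:
    # A quorum is a strict majority of replicas.
    return rf // 2 + 1

def local_quorum_required(rf_by_dc: dict, local_dc: str) -> int:
    # LOCAL_QUORUM only counts replicas in the coordinator's DC.
    return quorum(rf_by_dc[local_dc])

def global_quorum_required(rf_by_dc: dict) -> int:
    # QUORUM counts replicas across every DC.
    return quorum(sum(rf_by_dc.values()))

rf = {"DC1": 3, "DC2": 3}
print(local_quorum_required(rf, "DC1"))  # 2 -- what a LOCAL_QUORUM write should need
print(global_quorum_required(rf))        # 4 -- matches the "4 replica were required" message
```

So the reported requirement of 4 is what a global QUORUM (or a per-DC quorum inflated by extra endpoints, e.g. the joining node UJ 10.2.0.6) would demand, not a local one.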
[jira] [Commented] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325331#comment-14325331 ] Philip Thompson commented on CASSANDRA-8819: [~weizhu], what exactly is the query you are executing?

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819 Project: Cassandra Issue Type: Bug Components: Core Environment: CentOS 6.6 Reporter: Wei Zhu Assignee: Tyler Hobbs Fix For: 2.0.13
[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8819: --- Reviewer: (was: Philip Thompson) Reproduced In: 2.0.8 Tester: Philip Thompson Fix Version/s: (was: 2.0.8) 2.0.13 Assignee: Tyler Hobbs

Any ideas what this could be Tyler? We discussed it on IRC today. It doesn't appear to be CASSANDRA-7947 and Wei says they aren't using LWT. All nodes agree on the schema.

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819 Project: Cassandra Issue Type: Bug Components: Core Environment: CentOS 6.6 Reporter: Wei Zhu Assignee: Tyler Hobbs Fix For: 2.0.13
[jira] [Updated] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan New updated CASSANDRA-8818:
---
Description:

Hi, I'm not sure if this is a driver or Cassandra issue, so please feel free to move it to the appropriate component. I'm using C# on Mono (Linux), and the 2.5.0 Cassandra driver for C#. We have a cluster of 3 nodes, and we noticed that when we created a keyspace and then a table in that keyspace in quick succession, it would fail frequently. I put our approximate code below. Additionally, we noticed that if we used a prepared statement instead of just executing the query, it would succeed. It also appeared that running the queries from a .cql file (outside of our C# program) would succeed as well. In this case with tracing on, we saw that it was "Preparing statement". Please let me know if you need additional details. Thanks!

{noformat}
var pooling = new PoolingOptions ()
    .SetMaxConnectionsPerHost (HostDistance.Remote, 24)
    .SetHeartBeatInterval (1000);
var queryOptions = new QueryOptions ()
    .SetConsistencyLevel(ConsistencyLevel.ALL);
var builder = Cluster.Builder ()
    .AddContactPoints (contactPoints)
    .WithPort (9042)
    .WithPoolingOptions (pooling)
    .WithQueryOptions (queryOptions)
    .WithQueryTimeout (15000);

String keyspaceQuery = "CREATE KEYSPACE IF NOT EXISTS metadata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;";

String tableQuery = "CREATE TABLE IF NOT EXISTS metadata.patch_history ( metadata_key text, patch_version int, applied_date timestamp, patch_file text, PRIMARY KEY (metadata_key, patch_version) ) WITH CLUSTERING ORDER BY (patch_version DESC) AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND memtable_flush_period_in_ms = 0 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';";

using (var session = cluster.Connect ())
{
    session.Execute(keyspaceQuery);
    session.Execute(tableQuery);
}
{noformat}

Creating keyspace then table fails with non-prepared query
Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree)
[jira] [Commented] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325049#comment-14325049 ] sankalp kohli commented on CASSANDRA-6434: -- [~krummas] Any update on this? We need this patch so that we don't have to keep patches.

Repair-aware gc grace period
Key: CASSANDRA-6434 URL: https://issues.apache.org/jira/browse/CASSANDRA-6434 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0

Since the reason for gcgs is to ensure that we don't purge tombstones until every replica has been notified, it's redundant in a world where we're tracking repair times per sstable (and repairing frequently), i.e., a world where we default to incremental repair a la CASSANDRA-5351.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
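The idea in the description can be sketched as a predicate: a tombstone becomes safe to purge once every replica has been successfully repaired since the deletion was written, independent of a fixed gc_grace_seconds window. A toy illustration of the two rules (hypothetical function names, not Cassandra's actual implementation):

```python
def purgeable_with_gcgs(deletion_time: int, now: int, gc_grace_seconds: int) -> bool:
    # Classic rule: hold the tombstone for a fixed grace period.
    return now >= deletion_time + gc_grace_seconds

def purgeable_with_repair(deletion_time: int, last_repair_times: list) -> bool:
    # Repair-aware rule: purge once the *oldest* successful repair across
    # replicas is newer than the deletion, i.e. every replica has seen it.
    return len(last_repair_times) > 0 and min(last_repair_times) > deletion_time

# Tombstone written at t=100; all replicas repaired at t=150, 160, 170:
print(purgeable_with_repair(100, [150, 160, 170]))  # True -- safe long before gcgs expires
print(purgeable_with_repair(100, [90, 160, 170]))   # False -- one replica not yet repaired
```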
[jira] [Reopened] (CASSANDRA-8803) Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reopened CASSANDRA-8803: - To clarify, this is about *client* encryption, not server encryption.

Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic
Key: CASSANDRA-8803 URL: https://issues.apache.org/jira/browse/CASSANDRA-8803 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Vishy Kasar

We have some non-secure clusters taking live traffic in production from active clients. We want to enable client-to-node encryption on these clusters. Once we set client_encryption_options enabled to true in the yaml and bounce a Cassandra node in the ring, the existing clients that do not do SSL will fail to connect to that node. There does not seem to be a good way to roll this change without taking an outage. Can we implement a transitional mode in C* that will accept both encrypted and non-encrypted client traffic? We would enable this during the transition and turn it off after both server and client start talking SSL.
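One common way to implement such a transitional mode (a general technique, not necessarily what Cassandra ended up doing) is to peek at the first byte of a new connection: a TLS handshake always begins with the handshake record type 0x16, while a plaintext native-protocol request starts with a version byte. A hedged sketch of that dispatch logic:

```python
TLS_HANDSHAKE = 0x16  # TLS record type for a ClientHello

def looks_like_tls(first_byte: int) -> bool:
    # A TLS connection opens with a handshake record (content type 22);
    # anything else is treated as plaintext client traffic.
    return first_byte == TLS_HANDSHAKE

def dispatch(first_byte: int) -> str:
    # In a transitional server both paths stay enabled; after the
    # migration completes, the plaintext branch would be disabled.
    return "ssl" if looks_like_tls(first_byte) else "plaintext"

print(dispatch(0x16))  # ssl
print(dispatch(0x03))  # plaintext (e.g. a native-protocol v3 request byte)
```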
[jira] [Updated] (CASSANDRA-8814) Formatting of code blocks in CQL doc in github is a little messed up
[ https://issues.apache.org/jira/browse/CASSANDRA-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8814: --- Fix Version/s: 2.1.4 2.0.13 3.0 Formatting of code blocks in CQL doc in github is a little messed up Key: CASSANDRA-8814 URL: https://issues.apache.org/jira/browse/CASSANDRA-8814 Project: Cassandra Issue Type: Task Components: Documentation website Reporter: Jack Krupansky Priority: Minor Fix For: 3.0, 2.0.13, 2.1.4 Although the html version of the CQL doc on the website looks fine, the textile conversion of the source files in github looks a little messed up. In particular, the p. paragraph directives that terminate bc.. block code directives are not properly recognized and then the following text gets subsumed into the code block. The directives look fine, as per my read of the textile doc, but it appears that the textile converter used by github requires that there be a blank line before the p. directive to end the code block. It also requires a space after the dot for p. . If you go to the github pages for the CQL doc for trunk, 2.1, and 2.0, you will see stray p. directives as well as \_\_Sample\_\_ text in the code blocks, but only where the syntax code block was multiple lines. This is not a problem where the bc. directive is used with a single dot for a single line, as opposed to the bc.. directive used with a double dot for a block of lines. Or in the case of the CREATE KEYSPACE section you see all of the notes crammed into what should be the Sample box. See: https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.1.2/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.0.11/doc/cql3/CQL.textile This problem (p. 
not recognized to terminate a code block unless followed by a space and preceded by a blank line) actually occurs for the interactive textile formatter as well: http://txstyle.org/doc/4/block-code
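The fix the description implies can be shown with a tiny Textile fragment; here the `p.` directive is preceded by a blank line and followed by a space, which is what the GitHub renderer reportedly requires to close a `bc..` multi-line code block (illustrative sample, not taken from CQL.textile):

{noformat}
bc.. SELECT * FROM users
WHERE user_id = 123;

p. This paragraph now correctly ends the code block.
{noformat}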
[jira] [Updated] (CASSANDRA-8817) Error handling in Cassandra logs in low memory scenarios could use improvement
[ https://issues.apache.org/jira/browse/CASSANDRA-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8817: --- Issue Type: Improvement (was: Bug)

Error handling in Cassandra logs in low memory scenarios could use improvement
Key: CASSANDRA-8817 URL: https://issues.apache.org/jira/browse/CASSANDRA-8817 Project: Cassandra Issue Type: Improvement Components: Core Environment: Ubuntu 14.04, VM originally created with 1 GB RAM, DSE 4.6.0 installed Reporter: Michael DeHaan Priority: Minor Labels: lhf Fix For: 2.1.4

When running Cassandra with a low amount of RAM, in this case using DataStax Enterprise 4.6.0 in a reasonably default configuration, I find that I get an error after starting and trying to use nodetool, namely that it cannot connect to 127.0.0.1. Originally this sends me up a creek, looking for why Cassandra is not listening on 7199. The truth ends up being a bit more cryptic: Cassandra isn't running. Upon looking at the Cassandra system logs, I see the last thing that it did was print out the (very long) classpath. This confused me, as basically I'm seeing no errors in the log at all. I am proposing that Cassandra should check the amount of available RAM and issue a warning in the log, or possibly an error, because in this scenario Cassandra is going to be OOM-killed and probably could have predicted this in advance. Something like "Found X MB of RAM, expecting at least Y MB of RAM, Z MB recommended, may crash, adjust SETTINGS" or something similar would be a possible solution.
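The proposed startup check is simple to sketch. The thresholds and message below are illustrative placeholders, not official Cassandra or DSE numbers:

```python
MIN_MB = 512           # illustrative floor, not an official value
RECOMMENDED_MB = 2048  # illustrative recommendation, not an official value

def memory_warning(found_mb: int) -> str:
    # Returns an empty string when there is enough RAM, otherwise the
    # kind of warning the ticket proposes logging at startup.
    if found_mb >= RECOMMENDED_MB:
        return ""
    return (f"Found {found_mb} MB of RAM, expecting at least {MIN_MB} MB, "
            f"{RECOMMENDED_MB} MB recommended; the node may be OOM-killed.")

print(memory_warning(1024))  # warns: below the recommended amount
print(memory_warning(4096))  # empty string: no warning
```

Logging this before the classpath banner would make the failure mode discoverable from the last lines of the system log.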
[jira] [Created] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
Jonathan New created CASSANDRA-8818: --- Summary: Creating keyspace then table fails with non-prepared query Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New
[jira] [Commented] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325325#comment-14325325 ] Aleksey Yeschenko commented on CASSANDRA-8818: -- Nope, CL.ALL does not affect schema propagation and synchronisation.

Creating keyspace then table fails with non-prepared query
Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New
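Aleksey's point is that the race here is schema propagation between nodes, which no consistency level governs: until the schema has spread cluster-wide, a DDL statement issued immediately after another can land on a node that has not yet seen the new keyspace. A driver-agnostic sketch of the usual mitigation, polling for schema agreement with a retry loop (`check_agreement` is an assumed hook injected by the caller, not a real driver API):

```python
import time

def wait_for_schema_agreement(check_agreement, attempts=10, delay_s=0.0):
    # Polls an injected predicate until all nodes report the same schema
    # version, sleeping between attempts. Returns True once they agree.
    for _ in range(attempts):
        if check_agreement():
            return True
        time.sleep(delay_s)
    return False

# Simulated cluster that reaches agreement on the third poll:
polls = iter([False, False, True])
print(wait_for_schema_agreement(lambda: next(polls)))  # True
```

Real drivers expose equivalents of `check_agreement` (and prepared statements trigger a similar wait, which may be why the prepared-statement path succeeded).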
[jira] [Comment Edited] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325331#comment-14325331 ] Philip Thompson edited comment on CASSANDRA-8819 at 2/18/15 2:28 AM: - [~weizhu], what exactly is the query you are executing? EDIT: Nevermind, it is in the trace. was (Author: philipthompson): [~weizhu], what exactly is the query you are executing?

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819 URL: https://issues.apache.org/jira/browse/CASSANDRA-8819 Project: Cassandra Issue Type: Bug Components: Core Environment: CentOS 6.6 Reporter: Wei Zhu Assignee: Tyler Hobbs Fix For: 2.0.13
[jira] [Commented] (CASSANDRA-8808) CQLSSTableWriter: close does not work + more than one table throws ex
[ https://issues.apache.org/jira/browse/CASSANDRA-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325072#comment-14325072 ] Yuki Morishita commented on CASSANDRA-8808: --- I think it is better to eliminate the path where we access Keyspace/ColumnFamilyStore from CQLSSTableWriter. It seems we only use that for secondary index updates, which we don't need.

CQLSSTableWriter: close does not work + more than one table throws ex
Key: CASSANDRA-8808 URL: https://issues.apache.org/jira/browse/CASSANDRA-8808 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sebastian YEPES FERNANDEZ Assignee: Benjamin Lerer Labels: cql Fix For: 2.1.0, 2.1.4

I have encountered the following two issues:
- When closing the CQLSSTableWriter, it just hangs the process and does nothing. (https://issues.apache.org/jira/browse/CASSANDRA-8281)
- When writing more than one table, it throws an exception. (https://issues.apache.org/jira/browse/CASSANDRA-8251)

These issues can be reproduced with the following code:

{code:title=test.java|borderStyle=solid}
import org.apache.cassandra.config.Config;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public static void main(String[] args) {
  Config.setClientMode(true);

  CQLSSTableWriter w1 = CQLSSTableWriter.builder()
      .inDirectory("/tmp/kspc/t1")
      .forTable("CREATE TABLE kspc.t1 ( id int, PRIMARY KEY (id));")
      .using("INSERT INTO kspc.t1 (id) VALUES ( ? );")
      .build();

  CQLSSTableWriter w2 = CQLSSTableWriter.builder()
      .inDirectory("/tmp/kspc/t2")
      .forTable("CREATE TABLE kspc.t2 ( id int, PRIMARY KEY (id));")
      .using("INSERT INTO kspc.t2 (id) VALUES ( ? );")
      .build();

  try {
    w1.addRow(1);
    w2.addRow(1);
    w1.close();
    w2.close();
  } catch (Exception e) {
    System.out.println(e);
  }
}
{code}

{code:title=The error|borderStyle=solid}
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:324)
	at org.apache.cassandra.db.Keyspace.init(Keyspace.java:277)
	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119)
	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:96)
	at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:101)
	at org.apache.cassandra.io.sstable.CQLSSTableWriter.rawAddRow(CQLSSTableWriter.java:226)
	at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:145)
	at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:120)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSite.invoke(PojoMetaMethodSite.java:189)
	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:53)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
	at com.allthingsmonitoring.utils.BulkDataLoader.main(BulkDataLoader.groovy:415)
Caused by: java.lang.NullPointerException
	at org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1053)
	at org.apache.cassandra.db.ColumnFamilyStore.<clinit>(ColumnFamilyStore.java:85)
	... 18 more
{code}

I have just tested in the cassandra-2.1 branch and the issue still persists.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8336) Quarantine nodes after receiving the gossip shutdown message
[ https://issues.apache.org/jira/browse/CASSANDRA-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325170#comment-14325170 ] Richard Low commented on CASSANDRA-8336: v3 works well and I was able to do a full cluster bounce with zero timeouts. Here are a few minor points:
* The shutting down node might as well set the version of the shutdown state to Integer.MAX_VALUE, since receiving nodes will blindly use that.
* Why does it increment the generation number? We call Gossiper.instance.start with a new generation number set to the current time, so it would make sense to use that.
* If we hit 'Unable to gossip with any seeds' on replace, it shuts down the gossiper. This throws an AssertionError in addLocalApplicationState since the local epState is null.

Quarantine nodes after receiving the gossip shutdown message
Key: CASSANDRA-8336
URL: https://issues.apache.org/jira/browse/CASSANDRA-8336
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Fix For: 2.0.13
Attachments: 8336-v2.txt, 8336-v3.txt, 8336.txt

In CASSANDRA-3936 we added a gossip shutdown announcement. The problem is that this isn't sufficient; you can still get TOEs and have to wait on the FD to figure things out. This happens due to gossip propagation time and variance: if node X shuts down and sends the message to Y, but Z has a greater gossip version than Y for X and has not yet received the message, Z can initiate gossip with Y and thus mark X alive again. I propose quarantining to solve this; however, I feel it should be a -D parameter you have to specify, so as not to destroy current dev and test practices, since this will mean a node that shuts down will not be able to restart until the quarantine expires.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
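The quarantine proposed above can be pictured as a small expiring deny-list keyed by endpoint: once a shutdown message is received, all gossip state about that endpoint is ignored until the quarantine window expires. This is an illustrative sketch only, not Cassandra's Gossiper code; the class and method names are invented, and a real implementation would hook into the failure detector and gossip digest handling.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: quarantine endpoints that announced shutdown so a
// later gossip round carrying a stale, higher version cannot mark them
// alive again before the quarantine expires.
public class ShutdownQuarantine {
    private final Map<String, Long> quarantined = new ConcurrentHashMap<>();
    private final long quarantineMillis;

    public ShutdownQuarantine(long quarantineMillis) {
        this.quarantineMillis = quarantineMillis;
    }

    // Called when a gossip shutdown message is received from an endpoint.
    public void onShutdownMessage(String endpoint, long nowMillis) {
        quarantined.put(endpoint, nowMillis + quarantineMillis);
    }

    // While quarantined, gossip state about the endpoint is ignored. This
    // is also why a quarantined node cannot rejoin until expiry, which is
    // the reason the ticket suggests gating the feature behind a -D flag.
    public boolean isQuarantined(String endpoint, long nowMillis) {
        Long expiry = quarantined.get(endpoint);
        if (expiry == null) return false;
        if (nowMillis >= expiry) {
            quarantined.remove(endpoint);
            return false;
        }
        return true;
    }
}
```

The injectable `nowMillis` parameter is only there to keep the sketch testable; production code would read the clock directly.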
[jira] [Created] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
Wei Zhu created CASSANDRA-8819: -- Summary: LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819
URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.6
Reporter: Wei Zhu
Fix For: 2.0.8

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from.
Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement | 17:27:50,828 | 10.1.0.1 | 135
Message received from /10.1.0.1 | 17:27:50,829 | 10.1.0.5 | 25
Sending message to /10.1.0.5 | 17:27:50,829 | 10.1.0.1 | 421
Executing single-partition query on users | 17:27:50,829 | 10.1.0.5 | 177
Acquiring sstable references | 17:27:50,829 | 10.1.0.5 | 191
Merging memtable tombstones | 17:27:50,830 | 10.1.0.5 | 208
Message received from /10.1.0.5 | 17:27:50,830 | 10.1.0.1 | 1461
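The confusing "4 replica were required" message can be sanity-checked with the standard quorum formula, floor(RF/2) + 1. The sketch below is not Cassandra's implementation, just that arithmetic: with RF 3 in the local DC, LOCAL_QUORUM should block for 2 acknowledgements, while 4 is exactly what a plain QUORUM over the total RF of 6 would require. That suggests the coordinator is counting replicas (possibly including the joining DC2 node's pending ranges) beyond the local DC.

```java
// Sketch of consistency-level arithmetic (not Cassandra's code).
public class QuorumMath {
    // Quorum over a replication factor: floor(rf / 2) + 1.
    public static int quorum(int rf) {
        return rf / 2 + 1;
    }

    // LOCAL_QUORUM should only consider the local DC's replication factor.
    public static int localQuorumBlockFor(int localRf) {
        return quorum(localRf);
    }

    // A plain QUORUM blocks on a quorum of the summed replication factors.
    public static int quorumBlockFor(int... dcRfs) {
        int total = 0;
        for (int rf : dcRfs) total += rf;
        return quorum(total);
    }
}
```

With {DC1 : 3, DC2 : 3}: localQuorumBlockFor(3) is 2, but quorumBlockFor(3, 3) is 4, matching the "4 required" in the error message.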
[jira] [Commented] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325307#comment-14325307 ] Sotirios Delimanolis commented on CASSANDRA-8818: - Shouldn't the ConsistencyLevel of All account for that? Creating keyspace then table fails with non-prepared query -- Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New Hi, I'm not sure if this is a driver or cassandra issue, so please feel free to move to the appropriate component. I'm using C# on mono (linux), and the 2.5.0 cassandra driver for C#. We have a cluster of 3 nodes, and we noticed that when we created a keyspace, then a table for that keyspace in quick succession it would fail frequently. I put our approximate code below. Additionally, we noticed that if we did a prepared statement instead of just executing the query, it would succeed. It also appeared that running the queries from a .cql file (outside of our C# program) would succeed as well. In this case with tracing on, we saw that it was Preparing statement. Please let me know if you need additional details. Thanks! 
{noformat}
var pooling = new PoolingOptions()
    .SetMaxConnectionsPerHost(HostDistance.Remote, 24)
    .SetHeartBeatInterval(1000);
var queryOptions = new QueryOptions()
    .SetConsistencyLevel(ConsistencyLevel.ALL);
var cluster = Cluster.Builder()
    .AddContactPoints(contactPoints)
    .WithPort(9042)
    .WithPoolingOptions(pooling)
    .WithQueryOptions(queryOptions)
    .WithQueryTimeout(15000)
    .Build();
String keyspaceQuery = "CREATE KEYSPACE IF NOT EXISTS metadata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;";
String tableQuery = "CREATE TABLE IF NOT EXISTS metadata.patch_history ( metadata_key text, patch_version int, applied_date timestamp, patch_file text, PRIMARY KEY (metadata_key, patch_version) ) WITH CLUSTERING ORDER BY (patch_version DESC) AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND memtable_flush_period_in_ms = 0 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';";
using (var session = cluster.Connect())
{
    session.Execute(keyspaceQuery);
    session.Execute(tableQuery);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325307#comment-14325307 ] Sotirios Delimanolis edited comment on CASSANDRA-8818 at 2/18/15 2:06 AM: -- Shouldn't the ConsistencyLevel of ALL account for that? was (Author: s_delima): Shouldn't the ConsistencyLevel of All account for that? Creating keyspace then table fails with non-prepared query -- Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New Hi, I'm not sure if this is a driver or cassandra issue, so please feel free to move to the appropriate component. I'm using C# on mono (linux), and the 2.5.0 cassandra driver for C#. We have a cluster of 3 nodes, and we noticed that when we created a keyspace, then a table for that keyspace in quick succession it would fail frequently. I put our approximate code below. Additionally, we noticed that if we did a prepared statement instead of just executing the query, it would succeed. It also appeared that running the queries from a .cql file (outside of our C# program) would succeed as well. In this case with tracing on, we saw that it was Preparing statement. Please let me know if you need additional details. Thanks! 
{noformat}
var pooling = new PoolingOptions()
    .SetMaxConnectionsPerHost(HostDistance.Remote, 24)
    .SetHeartBeatInterval(1000);
var queryOptions = new QueryOptions()
    .SetConsistencyLevel(ConsistencyLevel.ALL);
var cluster = Cluster.Builder()
    .AddContactPoints(contactPoints)
    .WithPort(9042)
    .WithPoolingOptions(pooling)
    .WithQueryOptions(queryOptions)
    .WithQueryTimeout(15000)
    .Build();
String keyspaceQuery = "CREATE KEYSPACE IF NOT EXISTS metadata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;";
String tableQuery = "CREATE TABLE IF NOT EXISTS metadata.patch_history ( metadata_key text, patch_version int, applied_date timestamp, patch_file text, PRIMARY KEY (metadata_key, patch_version) ) WITH CLUSTERING ORDER BY (patch_version DESC) AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND memtable_flush_period_in_ms = 0 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';";
using (var session = cluster.Connect())
{
    session.Execute(keyspaceQuery);
    session.Execute(tableQuery);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zhu updated CASSANDRA-8819: --- Reviewer: Philip Thompson

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819
URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.6
Reporter: Wei Zhu
Fix For: 2.0.8

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement | 17:27:50,828 | 10.1.0.1 | 135
Message received from /10.1.0.1 | 17:27:50,829 | 10.1.0.5 | 25
Sending message to /10.1.0.5 | 17:27:50,829 | 10.1.0.1 | 421
Executing single-partition query on users | 17:27:50,829 | 10.1.0.5 | 177
Acquiring sstable references | 17:27:50,829 | 10.1.0.5 | 191
Merging memtable tombstones | 17:27:50,830 | 10.1.0.5 |
[jira] [Updated] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zhu updated CASSANDRA-8819: --- Description:

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from. But the rows were inserted into Cassandra correctly. I also traced reads at LOCAL_QUORUM; they behave correctly and the reads don't go to DC2. The problem is only with writes at LOCAL_QUORUM.

Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement | 17:27:50,828 | 10.1.0.1 | 135
Message received from /10.1.0.1 | 17:27:50,829 | 10.1.0.5 | 25
Sending message to /10.1.0.5 | 17:27:50,829 | 10.1.0.1 | 421
Executing single-partition query on users | 17:27:50,829 | 10.1.0.5 | 177
Acquiring sstable references | 17:27:50,829 | 10.1.0.5 | 191
Merging memtable tombstones | 17:27:50,830 | 10.1.0.5 | 208
Message received from /10.1.0.5 | 17:27:50,830 | 10.1.0.1 | 1461
[jira] [Commented] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325342#comment-14325342 ] Philip Thompson commented on CASSANDRA-8819: What is the schema for your table? I'm assuming
{code}
CREATE TABLE test (
    user_id int,
    created timeuuid,
    event_data text,
    event_id text,
    PRIMARY KEY ((user_id), created)
);
{code}

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819
URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.6
Reporter: Wei Zhu
Assignee: Tyler Hobbs
Fix For: 2.0.13

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from. But the rows were inserted into Cassandra correctly. I also traced reads at LOCAL_QUORUM; they behave correctly and the reads don't go to DC2. The problem is only with writes at LOCAL_QUORUM.

{code}
Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement | 17:27:50,828 | 10.1.0.1 | 135
Message received from /10.1.0.1 | 17:27:50,829 | 10.1.0.5 | 25
Sending message to /10.1.0.5 | 17:27:50,829 | 10.1.0.1 | 421
cassandra git commit: rename
Repository: cassandra Updated Branches: refs/heads/trunk f6879b205 - 48563e0be rename Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48563e0b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48563e0b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48563e0b Branch: refs/heads/trunk Commit: 48563e0be186dbeca2933f874d0030d7671d02ab Parents: f6879b2 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Feb 17 16:55:06 2015 -0600 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Feb 17 16:55:06 2015 -0600 -- src/java/org/apache/cassandra/gms/Gossiper.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/48563e0b/src/java/org/apache/cassandra/gms/Gossiper.java -- diff --git a/src/java/org/apache/cassandra/gms/Gossiper.java b/src/java/org/apache/cassandra/gms/Gossiper.java index c34793c..4584044 100644 --- a/src/java/org/apache/cassandra/gms/Gossiper.java +++ b/src/java/org/apache/cassandra/gms/Gossiper.java @@ -152,7 +152,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean boolean gossipedToSeed = doGossipToLiveMember(message); /* Gossip to some unreachable member with some probability to check if he is back up */ -doGossipToUnreachableMember(message); +maybeGossipToUnreachableMember(message); /* Gossip to a seed if we did not do so above, or we have seen less nodes than there are seeds. This prevents partitions where each group of nodes @@ -615,7 +615,7 @@ public class Gossiper implements IFailureDetectionEventListener, GossiperMBean } /* Sends a Gossip message to an unreachable member */ -private void doGossipToUnreachableMember(MessageOutGossipDigestSyn message) +private void maybeGossipToUnreachableMember(MessageOutGossipDigestSyn message) { double liveEndpointCount = liveEndpoints.size(); double unreachableEndpointCount = unreachableEndpoints.size();
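The rename above makes the probabilistic nature of the call explicit: maybeGossipToUnreachableMember only contacts an unreachable member "with some probability". The diff is truncated, but the visible tail computes live and unreachable endpoint counts, so a coin flip of roughly unreachable/(live + 1) is a plausible reading. The sketch below illustrates that style of decision; the formula is an assumption, not the verbatim Cassandra logic.

```java
import java.util.Random;

// Sketch of a "maybe gossip" decision in the style of
// maybeGossipToUnreachableMember: contact an unreachable node with a
// probability that grows with the fraction of unreachable nodes.
// The exact formula is an assumption based on the truncated diff above.
public class MaybeGossip {
    public static boolean shouldGossipToUnreachable(int live, int unreachable, Random rng) {
        if (unreachable == 0) return false;          // nobody to probe
        double prob = (double) unreachable / (live + 1);
        return rng.nextDouble() < prob;              // probabilistic probe
    }
}
```

With no unreachable members the method never fires; when unreachable members outnumber live ones, the probability exceeds 1 and the probe always happens.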
[jira] [Commented] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325287#comment-14325287 ] Philip Thompson commented on CASSANDRA-8818: What Cassandra version are you using? I assume what is happening is that the keyspace creation hasn't finished propagating to all nodes before the table create statement gets there. I believe blocking is the driver's responsibility. Creating keyspace then table fails with non-prepared query -- Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New Hi, I'm not sure if this is a driver or cassandra issue, so please feel free to move to the appropriate component. I'm using C# on mono (linux), and the 2.5.0 cassandra driver for C#. We have a cluster of 3 nodes, and we noticed that when we created a keyspace, then a table for that keyspace in quick succession it would fail frequently. I put our approximate code below. Additionally, we noticed that if we did a prepared statement instead of just executing the query, it would succeed. It also appeared that running the queries from a .cql file (outside of our C# program) would succeed as well. In this case with tracing on, we saw that it was Preparing statement. Please let me know if you need additional details. Thanks! 
{noformat}
var pooling = new PoolingOptions()
    .SetMaxConnectionsPerHost(HostDistance.Remote, 24)
    .SetHeartBeatInterval(1000);
var queryOptions = new QueryOptions()
    .SetConsistencyLevel(ConsistencyLevel.ALL);
var cluster = Cluster.Builder()
    .AddContactPoints(contactPoints)
    .WithPort(9042)
    .WithPoolingOptions(pooling)
    .WithQueryOptions(queryOptions)
    .WithQueryTimeout(15000)
    .Build();
String keyspaceQuery = "CREATE KEYSPACE IF NOT EXISTS metadata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;";
String tableQuery = "CREATE TABLE IF NOT EXISTS metadata.patch_history ( metadata_key text, patch_version int, applied_date timestamp, patch_file text, PRIMARY KEY (metadata_key, patch_version) ) WITH CLUSTERING ORDER BY (patch_version DESC) AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND memtable_flush_period_in_ms = 0 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';";
using (var session = cluster.Connect())
{
    session.Execute(keyspaceQuery);
    session.Execute(tableQuery);
}
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
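A common workaround for the race Philip describes in CASSANDRA-8818, where a CREATE TABLE arrives before the keyspace has propagated to every node, is to poll a schema-agreement check between DDL statements. The sketch below is driver-agnostic and illustrative: the BooleanSupplier stands in for a real call such as a driver's "all hosts report the same schema version" metadata check, and the retry bounds are arbitrary assumptions.

```java
import java.util.function.BooleanSupplier;

// Illustrative sketch: after executing a DDL statement, poll a
// schema-agreement check before issuing the next, dependent statement.
// The supplier is a stand-in for a driver's schema-agreement API.
public class SchemaWait {
    public static boolean waitForSchemaAgreement(BooleanSupplier inAgreement,
                                                 int maxAttempts,
                                                 long sleepMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if (inAgreement.getAsBoolean()) return true;  // all nodes agree
            try {
                Thread.sleep(sleepMillis);                // back off and retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                                     // never converged
    }
}
```

The caller would run the CREATE KEYSPACE, wait for agreement, then run the CREATE TABLE; prepared statements appear to succeed in the report above likely because preparation itself forces a round trip that allows propagation to complete.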
[jira] [Commented] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325345#comment-14325345 ] Philip Thompson commented on CASSANDRA-8819: I can sanity check this and say that a six node, two DC cluster on 2.0.8 is working as expected for me locally. He can reproduce in cqlsh and through java driver, so it isn't some sort of driver bug. If all of the nodes agree on the schema, and on nodetool status info, I'm stumped as to what could be going on.

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819
URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.6
Reporter: Wei Zhu
Assignee: Tyler Hobbs
Fix For: 2.0.13

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from. But the rows were inserted into Cassandra correctly. I also traced reads at LOCAL_QUORUM; they behave correctly and the reads don't go to DC2. The problem is only with writes at LOCAL_QUORUM.

{code}
Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing statement | 17:27:50,828 | 10.1.0.1 | 135
Message received from /10.1.0.1 | 17:27:50,829 | 10.1.0.5 | 25
Sending message to /10.1.0.5 | 17:27:50,829
[jira] [Comment Edited] (CASSANDRA-8819) LOCAL_QUORUM writes returns wrong message
[ https://issues.apache.org/jira/browse/CASSANDRA-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325345#comment-14325345 ] Philip Thompson edited comment on CASSANDRA-8819 at 2/18/15 2:59 AM: - I have sanity checked this and a six node, two DC cluster on 2.0.8 is working as expected for me locally. Wei can reproduce in cqlsh and through java driver, so it isn't some sort of driver bug. If all of the nodes agree on the schema, and on nodetool status info, I'm stumped as to what could be going on.

was (Author: philipthompson): I can sanity check this and say that a six node, two DC cluster on 2.0.8 is working as expected for me locally. He can reproduce in cqlsh and through java driver, so it isn't some sort of driver bug. If all of the nodes agree on the schema, and on nodetool status info, I'm stumped as to what could be going on.

LOCAL_QUORUM writes returns wrong message
Key: CASSANDRA-8819
URL: https://issues.apache.org/jira/browse/CASSANDRA-8819
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: CentOS 6.6
Reporter: Wei Zhu
Assignee: Tyler Hobbs
Fix For: 2.0.13

We have two DCs, each with 7 nodes. Here is the keyspace setup:

create keyspace test with placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC2 : 3, DC1 : 3} and durable_writes = true;

We brought down two nodes in DC2 for maintenance. We only write to DC1 using LOCAL_QUORUM (using the DataStax Java client), but we see these errors in the log:

Cassandra timeout during write query at consistency LOCAL_QUORUM (4 replica were required but only 3 acknowledged the write)

Why does it say 4 replicas were required? And why would it return an error to the client, since LOCAL_QUORUM should succeed? Here is the output from nodetool status:

Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.2.0.1  10.92 GB  256     7.9%           RAC206
UN  10.2.0.2  6.17 GB   256     8.0%           RAC106
UN  10.2.0.3  6.63 GB   256     7.3%           RAC107
DL  10.2.0.4  1.54 GB   256     7.7%           RAC107
UN  10.2.0.5  6.02 GB   256     6.6%           RAC106
UJ  10.2.0.6  3.68 GB   256     ?              RAC205
UN  10.2.0.7  7.22 GB   256     7.7%           RAC205
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load      Tokens  Owns  Host ID  Rack
UN  10.1.0.1  6.04 GB   256     8.6%           RAC10
UN  10.1.0.2  7.55 GB   256     7.4%           RAC8
UN  10.1.0.3  5.83 GB   256     7.0%           RAC9
UN  10.1.0.4  7.34 GB   256     7.9%           RAC6
UN  10.1.0.5  7.57 GB   256     8.0%           RAC7
UN  10.1.0.6  5.31 GB   256     7.3%           RAC10
UN  10.1.0.7  5.47 GB   256     8.6%           RAC9

I did a cql trace on the query; near the end the trace does say "Write timeout; received 3 of 4 required replies | 17:27:52,831 | 10.1.0.1 | 2002873". I guess that is where the client gets the error from. But the rows were inserted into Cassandra correctly. I also traced reads at LOCAL_QUORUM; they behave correctly and the reads don't go to DC2. The problem is only with writes at LOCAL_QUORUM.

{code}
Tracing session: 5a789fb0-b70d-11e4-8fca-99bff9c19890

activity | timestamp | source | source_elapsed
execute_cql3_query | 17:27:50,828 | 10.1.0.1 | 0
Parsing insert into test (user_id, created, event_data, event_id) values ( 123456789 , 9eab8950-b70c-11e4-8fca-99bff9c19891, 'test', '16'); | 17:27:50,828 | 10.1.0.1 | 39
Preparing
[jira] [Updated] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Moschou updated CASSANDRA-8821: - Environment: Ubuntu 14.04 LTS amd64 (was: Ubuntu 14.04 LTS amd64 Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main)

Errors in JVM_OPTS and cassandra_parms environment vars
Key: CASSANDRA-8821
URL: https://issues.apache.org/jira/browse/CASSANDRA-8821
Project: Cassandra
Issue Type: Bug
Environment: Ubuntu 14.04 LTS amd64
Reporter: Terry Moschou
Priority: Minor
Fix For: 2.1.3

The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM opt -XX:CMSWaitDuration=1 is defined twice if jvm >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in this context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in this context -Dcassandra.storagedir=$cassandra_storagedir

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Moschou updated CASSANDRA-8821: - Description: Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. was: The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script.
Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. Errors in JVM_OPTS and cassandra_parms environment vars --- Key: CASSANDRA-8821 URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 Project: Cassandra Issue Type: Bug Environment: Ubuntu 14.04 LTS amd64 Reporter: Terry Moschou Priority: Minor Fix For: 2.1.3 Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Moschou updated CASSANDRA-8821: - Fix Version/s: 2.1.3 Errors in JVM_OPTS and cassandra_parms environment vars --- Key: CASSANDRA-8821 URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 Project: Cassandra Issue Type: Bug Environment: Ubuntu 14.04 LTS amd64 Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main Reporter: Terry Moschou Priority: Minor Fix For: 2.1.3 The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
Terry Moschou created CASSANDRA-8821: Summary: Errors in JVM_OPTS and cassandra_parms environment vars Key: CASSANDRA-8821 URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 Project: Cassandra Issue Type: Bug Environment: Ubuntu 14.04 LTS amd64 Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main Reporter: Terry Moschou Priority: Minor Fix For: 2.1.3 The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8818) Creating keyspace then table fails with non-prepared query
[ https://issues.apache.org/jira/browse/CASSANDRA-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325469#comment-14325469 ] Jonathan New commented on CASSANDRA-8818: - Cassandra 2.0.12. Is this the right place to file this bug? Creating keyspace then table fails with non-prepared query -- Key: CASSANDRA-8818 URL: https://issues.apache.org/jira/browse/CASSANDRA-8818 Project: Cassandra Issue Type: Bug Components: Drivers (now out of tree) Reporter: Jonathan New Hi, I'm not sure if this is a driver or cassandra issue, so please feel free to move to the appropriate component. I'm using C# on mono (linux), and the 2.5.0 cassandra driver for C#. We have a cluster of 3 nodes, and we noticed that when we created a keyspace, then a table for that keyspace in quick succession, it would fail frequently. I put our approximate code below. Additionally, we noticed that if we did a prepared statement instead of just executing the query, it would succeed. It also appeared that running the queries from a .cql file (outside of our C# program) would succeed as well. In this case with tracing on, we saw that it was "Preparing statement". Please let me know if you need additional details. Thanks!
{noformat}
var pooling = new PoolingOptions ()
    .SetMaxConnectionsPerHost (HostDistance.Remote, 24)
    .SetHeartBeatInterval (1000);
var queryOptions = new QueryOptions ()
    .SetConsistencyLevel (ConsistencyLevel.All);
var builder = Cluster.Builder ()
    .AddContactPoints (contactPoints)
    .WithPort (9042)
    .WithPoolingOptions (pooling)
    .WithQueryOptions (queryOptions)
    .WithQueryTimeout (15000);
var cluster = builder.Build (); // restored: the 'using' block below refers to 'cluster'

String keyspaceQuery = "CREATE KEYSPACE IF NOT EXISTS metadata WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;";

String tableQuery = "CREATE TABLE IF NOT EXISTS metadata.patch_history ( metadata_key text, patch_version int, applied_date timestamp, patch_file text, PRIMARY KEY (metadata_key, patch_version) ) WITH CLUSTERING ORDER BY (patch_version DESC) AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND memtable_flush_period_in_ms = 0 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE';";

using (var session = cluster.Connect ()) {
    session.Execute (keyspaceQuery);
    session.Execute (tableQuery);
}
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8821) Errors in JVM_OPTS and cassandra_parms environment vars
[ https://issues.apache.org/jira/browse/CASSANDRA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Moschou updated CASSANDRA-8821: - Fix Version/s: (was: 2.1.3) Errors in JVM_OPTS and cassandra_parms environment vars --- Key: CASSANDRA-8821 URL: https://issues.apache.org/jira/browse/CASSANDRA-8821 Project: Cassandra Issue Type: Bug Environment: Ubuntu 14.04 LTS amd64 Repos: deb http://www.apache.org/dist/cassandra/debian 21x main deb-src http://www.apache.org/dist/cassandra/debian 21x main Reporter: Terry Moschou Priority: Minor The cassandra init script /etc/init.d/cassandra is sourcing the environment file /etc/cassandra/cassandra-env.sh twice: once directly from the init script, and again inside /usr/sbin/cassandra. The result is that arguments in JVM_OPTS are duplicated. Further, the JVM option -XX:CMSWaitDuration=1 is defined twice if the JVM is >= 1.7.60. Also, the environment variable CASSANDRA_CONF, used in the context -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler, is undefined when /etc/cassandra/cassandra-env.sh is sourced from the init script. Lastly, the variable cassandra_storagedir is undefined in /usr/sbin/cassandra when used in the context -Dcassandra.storagedir=$cassandra_storagedir. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8820) Broken package dependency in Debian repository
Terry Moschou created CASSANDRA-8820: Summary: Broken package dependency in Debian repository Key: CASSANDRA-8820 URL: https://issues.apache.org/jira/browse/CASSANDRA-8820 Project: Cassandra Issue Type: Bug Components: Packaging Environment: Ubuntu 14.04 LTS amd64 Reporter: Terry Moschou Fix For: 2.1.3

The Apache Debian package repository currently has unmet dependencies. Configured repos:

deb http://www.apache.org/dist/cassandra/debian 21x main
deb-src http://www.apache.org/dist/cassandra/debian 21x main

Problem file: cassandra/dists/21x/main/binary-amd64/Packages

{noformat}
$ sudo apt-get update
$ sudo apt-get install cassandra-tools
...(omitted)
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming. The following information may help
to resolve the situation:

The following packages have unmet dependencies:
 cassandra-tools : Depends: cassandra (= 2.1.2) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324355#comment-14324355 ] Cristian O edited comment on CASSANDRA-4914 at 2/17/15 4:00 PM: To be clear, I'm not talking about distributed processing. I'm talking about online analytical queries, particularly time series. Incidentally, I don't know how many people realize this, but Cass has the same distributed storage architecture that Vertica has. It's also possible to map a columnar schema on top of sstable. Of course native support for columnar storage would be immensely better. See CASSANDRA-7447 was (Author: onetoinfin...@yahoo.com): To be clear, I'm not talking about distributed processing. I'm talking about online analytical queries, particularly time series. Incidentally, I don't know how many people realize this, but Cass has the same distributed storage architecture that Vertica has. It's also possible to map a columnar schema on top of sstable. Of course native support for columnar storage would be immensely better. See CASSANDRA-7447 There's a lot of opportunity in this space, just need a bit of vision. Aggregation functions in CQL Key: CASSANDRA-4914 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914 Project: Cassandra Issue Type: New Feature Reporter: Vijay Assignee: Benjamin Lerer Labels: cql, docs Fix For: 3.0 Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt The requirement is to do aggregation of data in Cassandra (wide row of column values of int, double, float etc.), with some basic aggregate functions like AVG, SUM, MEAN, MIN, MAX, etc. (for the columns within a row).
Example:

{noformat}
SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | joe        | doe       |   10.1
   130 |      2 | joe        | doe       |    100
   130 |      1 | joe        | doe       |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130);

 sum(salary) | empid
-------------+-------
      1110.1 |   130
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
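The aggregate in the example above can be sanity-checked directly against the three rows shown. A quick sketch in plain Python (illustrative only, not CQL engine code):

```python
# The three salary values for empid 130 from the example: 10.1, 100, 1e+03.
salaries = [10.1, 100.0, 1000.0]

total = sum(salaries)
# round() sidesteps binary floating-point noise in the comparison.
print(round(total, 1))  # 1110.1, matching sum(salary) in the example
```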
[jira] [Updated] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-8789: -- Summary: OutboundTcpConnectionPool should route messages to sockets by size not type (was: Revisit how OutboundTcpConnection pools two connections for different message types) OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads, the request goes out over the command connection and the response comes back over the ack connection. For writes, the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket, along with read requests. Sockets are used unidirectionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better, it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
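The proposal in the summary, paraphrased: pick the outbound socket by a message's serialized size rather than by its verb, so small messages (acks, read requests) never queue behind large mutations. A hedged sketch of that routing rule — the threshold and names are illustrative assumptions, not Cassandra code:

```python
SMALL_MESSAGE_THRESHOLD = 64 * 1024  # bytes; an assumed cutoff

def pick_socket(serialized_size: int) -> str:
    """Route a message to one of the two outbound sockets by size."""
    return "small" if serialized_size <= SMALL_MESSAGE_THRESHOLD else "large"

print(pick_socket(200))        # small: a write ack rides with other small messages
print(pick_socket(5_000_000))  # large: a big mutation cannot block small traffic
```

The design point is that size, not type, is what causes head-of-line blocking on a shared socket, so size is the natural routing key.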
[1/6] cassandra git commit: AssertionError: Memory was freed when running cleanup
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 93769b3b9 -> 7f10cbd8a
  refs/heads/cassandra-2.1 9ad797863 -> cd9da447b
  refs/heads/trunk f9d4044f1 -> f6879b205

AssertionError: Memory was freed when running cleanup

Patch by Robert Stupp; Reviewed by Benedict Elliott Smith for CASSANDRA-8716

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f10cbd8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f10cbd8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f10cbd8
Branch: refs/heads/cassandra-2.0
Commit: 7f10cbd8ad7b4ea1f7471084eeb935fe911845bb
Parents: 93769b3
Author: Robert Stupp sn...@snazy.de
Authored: Tue Feb 17 17:25:53 2015 +0100
Committer: Robert Stupp sn...@snazy.de
Committed: Tue Feb 17 17:25:53 2015 +0100
--
 CHANGES.txt                                               | 1 +
 .../apache/cassandra/db/compaction/CompactionManager.java | 9 +
 2 files changed, 6 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2052f70..24f70a3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.13:
+ * AssertionError: Memory was freed when running cleanup (CASSANDRA-8716)
  * Make it possible to set max_sstable_age to fractional days (CASSANDRA-8406)
  * Fix memory leak in SSTableSimple*Writer and SSTableReader.validate() (CASSANDRA-8748)
  * Fix some multi-column relations with indexes on some clustering

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 62599e3..0978ae6 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -539,9 +539,10 @@ public class CompactionManager implements CompactionManagerMBean
         for (SSTableReader sstable : sstables)
         {
+            Set<SSTableReader> sstableAsSet = Collections.singleton(sstable);
             if (!hasIndexes && !new Bounds<Token>(sstable.first.token, sstable.last.token).intersects(ranges))
             {
-                cfs.replaceCompactedSSTables(Arrays.asList(sstable), Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
+                cfs.replaceCompactedSSTables(sstableAsSet, Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
                 continue;
             }
             if (!needsCleanup(sstable, ranges))
@@ -550,18 +551,18 @@ public class CompactionManager implements CompactionManagerMBean
                 continue;
             }
-            CompactionController controller = new CompactionController(cfs, Collections.singleton(sstable), getDefaultGcBefore(cfs));
+            CompactionController controller = new CompactionController(cfs, sstableAsSet, getDefaultGcBefore(cfs));
             long start = System.nanoTime();
             long totalkeysWritten = 0;
             int expectedBloomFilterSize = Math.max(cfs.metadata.getIndexInterval(),
-                    (int) (SSTableReader.getApproximateKeyCount(Arrays.asList(sstable), cfs.metadata)));
+                    (int) (SSTableReader.getApproximateKeyCount(sstableAsSet, cfs.metadata)));
             if (logger.isDebugEnabled())
                 logger.debug("Expected bloom filter size : " + expectedBloomFilterSize);
             logger.info("Cleaning up " + sstable);
-            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstables, OperationType.CLEANUP));
+            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstableAsSet, OperationType.CLEANUP));
             if (compactionFileLocation == null)
                 throw new IOException("disk full");
[jira] [Comment Edited] (CASSANDRA-8757) IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation
[ https://issues.apache.org/jira/browse/CASSANDRA-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318575#comment-14318575 ] Benedict edited comment on CASSANDRA-8757 at 2/17/15 5:21 PM: -- Patch available [here|https://github.com/belliottsmith/cassandra/tree/8757-offheapsummarybuilder] The approach is pretty straightforward in principle: we split the offheap memory for the summary into two allocations, the summary offsets and the summary entries - the latter composed of the key and its offset in the index file. The offsets index from zero now, instead of from the end of the offsets themselves, and so to maintain compatibility we do not change the serialization format; on read/write we simply subtract/add the necessary offset. This split permits us to have a separate chunk of memory for each that we can append to in the writer, so that a prefix of both can be used to open a summary before we've finished writing. This permits us to share memory between all early instances of a table. was (Author: benedict): Patch available [here|github.com/belliottsmith/cassandra/tree/8757-offheapsummarybuilder] The approach is pretty straightforward in principle: we split the offheap memory for the summary into two allocations, the summary offsets and the summary entries - the latter composed of the key and its offset in the index file. The offsets index from zero now, instead of from the end of the offsets themselves, and so to maintain compatibility we do not change the serialization format; on read/write we simply subtract/add the necessary offset. This split permits us to have a separate chunk of memory for each that we can append to in the writer, so that a prefix of both can be used to open a summary before we've finished writing. This permits us to share memory between all early instances of a table.
IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation --- Key: CASSANDRA-8757 URL: https://issues.apache.org/jira/browse/CASSANDRA-8757 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
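The compatibility trick described in the comment above — offsets index from zero in memory, while the serialized format keeps measuring them from the end of the offsets section — amounts to a pair of inverse shifts on read and write. A sketch of that idea; the function names and the 4-byte offset width are illustrative assumptions, not the patch's actual code:

```python
OFFSET_WIDTH = 4  # assumed bytes per serialized offset

def to_serialized(offsets):
    # On write, add the size of the offsets section so the on-disk
    # format is unchanged.
    shift = len(offsets) * OFFSET_WIDTH
    return [o + shift for o in offsets]

def from_serialized(offsets):
    # On read, subtract the same amount to recover zero-based offsets.
    shift = len(offsets) * OFFSET_WIDTH
    return [o - shift for o in offsets]

in_memory = [0, 12, 40]
on_disk = to_serialized(in_memory)
print(on_disk)  # [12, 24, 52]
assert from_serialized(on_disk) == in_memory  # round-trips exactly
```

Because the shift is purely a read/write-time adjustment, the zero-based in-memory layout can be split, appended to, and shared without touching the serialization format.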
[jira] [Updated] (CASSANDRA-8767) Added column does not sort as the last column when using new python driver
[ https://issues.apache.org/jira/browse/CASSANDRA-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Garrett updated CASSANDRA-8767: Attachment: exception-with-logging.txt OK, I upgraded to 2.0.12 and applied the patch, and the resulting logging is attached. Added column does not sort as the last column when using new python driver Key: CASSANDRA-8767 URL: https://issues.apache.org/jira/browse/CASSANDRA-8767 Project: Cassandra Issue Type: Bug Components: Core, Drivers (now out of tree) Environment: Cassandra 2.0.10, python-driver 2.1.3 Reporter: Russ Garrett Assignee: Tyler Hobbs Fix For: 2.0.13 Attachments: 8767-debug-logging.txt, describe-table.txt, exception-with-logging.txt, exception.txt We've just upgraded one of our python apps from using the old cql library to the new python-driver. When running one particular query, it produces the attached assertion error in Cassandra. The query is: bq. SELECT buffer, id, type, json FROM events WHERE buffer = %(bid)s AND idkey = %(idkey)s ORDER BY id ASC Where buffer and idkey are integer primary keys, and id is the clustering key (ordered asc). This query, with identical parameters, does not cause this error using the old cql python library, or with the cqlsh client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7287) Pig CqlStorage test fails with IAE
[ https://issues.apache.org/jira/browse/CASSANDRA-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324200#comment-14324200 ] Chhavi Gangwal edited comment on CASSANDRA-7287 at 2/17/15 2:08 PM: The same issue persists for MapType as well: MapType objects retrieved are empty even for version 2.1.0-rc4 was (Author: chhavigangwal): Same issue persist for MapType as well, MapType objects retrieved are empty even for versions for version 2.1.0-rc4 Pig CqlStorage test fails with IAE -- Key: CASSANDRA-7287 URL: https://issues.apache.org/jira/browse/CASSANDRA-7287 Project: Cassandra Issue Type: Bug Components: Hadoop, Tests Reporter: Brandon Williams Assignee: Sylvain Lebresne Fix For: 2.1 rc1 Attachments: 7287.txt
{noformat}
[junit] java.lang.IllegalArgumentException
[junit] at java.nio.Buffer.limit(Buffer.java:267)
[junit] at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:542)
[junit] at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:117)
[junit] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:97)
[junit] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28)
[junit] at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48)
[junit] at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66)
[junit] at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:792)
[junit] at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
[junit] at org.apache.cassandra.hadoop.pig.CqlStorage.getNext(CqlStorage.java:118)
{noformat}
I'm guessing this is caused by CqlStorage passing an empty BB to BBU, but I don't know if it's pig that's broken or is a deeper issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8366) Repair grows data on nodes, causes load to become unbalanced
[ https://issues.apache.org/jira/browse/CASSANDRA-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324206#comment-14324206 ] Alan Boudreault commented on CASSANDRA-8366: [~krummas] Thank you for your work and tests. I will stay in touch to test the patch when ready. Repair grows data on nodes, causes load to become unbalanced Key: CASSANDRA-8366 URL: https://issues.apache.org/jira/browse/CASSANDRA-8366 Project: Cassandra Issue Type: Bug Environment: 4 node cluster 2.1.2 Cassandra Inserts and reads are done with CQL driver Reporter: Jan Karlsson Assignee: Marcus Eriksson Attachments: results-1000-inc-repairs.txt, results-1750_inc_repair.txt, results-500_1_inc_repairs.txt, results-500_2_inc_repairs.txt, results-500_full_repair_then_inc_repairs.txt, results-500_inc_repairs_not_parallel.txt, run1_with_compact_before_repair.log, run2_no_compact_before_repair.log, run3_no_compact_before_repair.log, test.sh, testv2.sh There seems to be something weird going on when repairing data. I have a program that runs 2 hours which inserts 250 random numbers and reads 250 times per second. It creates 2 keyspaces with SimpleStrategy and RF of 3. I use size-tiered compaction for my cluster. After those 2 hours I run a repair and the load of all nodes goes up. If I run incremental repair the load goes up a lot more. I saw the load shoot up to 8 times the original size multiple times with incremental repair (from 2G to 16G). With nodes 9, 8, 7 and 6, the repro procedure looked like this: (Note that running full repair first is not a requirement to reproduce.)
{noformat}
After 2 hours of 250 reads + 250 writes per second:
UN  9  583.39 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  584.01 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  583.72 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  583.84 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

Repair -pr -par on all nodes sequentially:
UN  9  746.29 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  751.02 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  748.89 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  758.34 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

repair -inc -par on all nodes sequentially:
UN  9  2.41 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.53 GB    256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.6 GB     256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  2.17 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

after rolling restart:
UN  9  1.47 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  1.5 GB     256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.46 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.19 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

compact all nodes sequentially:
UN  9  989.99 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  994.75 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  1.46 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  758.82 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

repair -inc -par on all nodes sequentially:
UN  9  1.98 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.3 GB     256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  3.71 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.68 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

restart once more:
UN  9  2 GB       256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.05 GB    256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  4.1 GB     256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.68 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1
{noformat}
Is there something I'm missing, or is this strange behavior? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8810) incorrect indexing of list collection
[ https://issues.apache.org/jira/browse/CASSANDRA-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324248#comment-14324248 ] Philip Thompson commented on CASSANDRA-8810: This error is not occurring on 2.1-head, so I suspect it was fixed as one of the several allow filtering or index bugs that made it into 2.1.3. I cannot find a duplicate ticket however. [~blerer] any suggestions of what ticket this might be a dupe of? incorrect indexing of list collection - Key: CASSANDRA-8810 URL: https://issues.apache.org/jira/browse/CASSANDRA-8810 Project: Cassandra Issue Type: Bug Components: Core Environment: windows 8 Reporter: 007reader Fix For: 2.1.4 In a table with one indexed field of type list<text>, data retrieval is not working properly: I have a simple table with an indexed list<text> field, but it shows unexpected behavior when I query the list.
{code}
create table test (whole text PRIMARY KEY, parts list<text>);
create index on test (parts);
insert into test (whole, parts) values ('a', ['a']);
insert into test (whole, parts) values ('b', ['b']);
insert into test (whole, parts) values ('c', ['c']);
insert into test (whole, parts) values ('a.a', ['a','a']);
insert into test (whole, parts) values ('a.b', ['a','b']);
insert into test (whole, parts) values ('a.c', ['a','c']);
insert into test (whole, parts) values ('b.a', ['b','a']);
insert into test (whole, parts) values ('b.b', ['b','b']);
insert into test (whole, parts) values ('b.c', ['b','c']);
insert into test (whole, parts) values ('c.c', ['c','c']);
insert into test (whole, parts) values ('c.b', ['c','b']);
insert into test (whole, parts) values ('c.a', ['c','a']);
{code}
This is expected behavior:
{noformat}
select * from test where parts contains 'a' ALLOW FILTERING;

 whole | parts
-------+------------
 a     | ['a']
 b.a   | ['b', 'a']
 a.c   | ['a', 'c']
 a.b   | ['a', 'b']
 a.a   | ['a', 'a']
 c.a   | ['c', 'a']
{noformat}
From the following query I expect a subset of the previous query's result, but it returns no data:
{noformat}
select * from test where parts contains 'a' and parts contains 'b' ALLOW FILTERING;

 whole | parts
-------+-------
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
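The expected semantics of the two queries can be checked with a plain-Python model of the inserted rows (illustrative only, not CQL engine code): rows whose list contains both 'a' and 'b' should be a non-empty subset of the first result, not an empty set:

```python
# The twelve rows inserted in the bug report, keyed by 'whole'.
rows = {
    'a': ['a'], 'b': ['b'], 'c': ['c'],
    'a.a': ['a', 'a'], 'a.b': ['a', 'b'], 'a.c': ['a', 'c'],
    'b.a': ['b', 'a'], 'b.b': ['b', 'b'], 'b.c': ['b', 'c'],
    'c.c': ['c', 'c'], 'c.b': ['c', 'b'], 'c.a': ['c', 'a'],
}

contains_a = {w for w, parts in rows.items() if 'a' in parts}
both = {w for w, parts in rows.items() if 'a' in parts and 'b' in parts}

print(sorted(contains_a))  # the six rows the first query correctly returns
print(sorted(both))        # ['a.b', 'b.a'] -- what the second query should return
```

So `contains 'a' and contains 'b'` should return rows 'a.b' and 'b.a'; returning no data is the reported bug.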
Git Push Summary
Repository: cassandra Updated Tags: refs/tags/cassandra-2.1.3 [created] ea4013aeb
[jira] [Updated] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-8789: Reviewer: Benedict OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads, the request goes out over the command connection and the response comes back over the ack connection. For writes, the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket, along with read requests. Sockets are used unidirectionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better, it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[5/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cd9da447 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cd9da447 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cd9da447 Branch: refs/heads/cassandra-2.1 Commit: cd9da447b644744ec344b9d763574e49da29b470 Parents: 9ad7978 7f10cbd Author: Robert Stupp sn...@snazy.de Authored: Tue Feb 17 17:27:46 2015 +0100 Committer: Robert Stupp sn...@snazy.de Committed: Tue Feb 17 17:27:46 2015 +0100 -- --
[2/6] cassandra git commit: AssertionError: Memory was freed when running cleanup
AssertionError: Memory was freed when running cleanup Patch by Robert Stupp; Reviewed by Benedict Elliott Smith for CASSANDRA-8716 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f10cbd8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f10cbd8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f10cbd8 Branch: refs/heads/cassandra-2.1 Commit: 7f10cbd8ad7b4ea1f7471084eeb935fe911845bb Parents: 93769b3 Author: Robert Stupp sn...@snazy.de Authored: Tue Feb 17 17:25:53 2015 +0100 Committer: Robert Stupp sn...@snazy.de Committed: Tue Feb 17 17:25:53 2015 +0100
--
 CHANGES.txt                                               | 1 +
 .../apache/cassandra/db/compaction/CompactionManager.java | 9 +
 2 files changed, 6 insertions(+), 4 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2052f70..24f70a3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.13:
+ * AssertionError: Memory was freed when running cleanup (CASSANDRA-8716)
  * Make it possible to set max_sstable_age to fractional days (CASSANDRA-8406)
  * Fix memory leak in SSTableSimple*Writer and SSTableReader.validate() (CASSANDRA-8748)
  * Fix some multi-column relations with indexes on some clustering
http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 62599e3..0978ae6 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -539,9 +539,10 @@ public class CompactionManager implements CompactionManagerMBean
         for (SSTableReader sstable : sstables)
         {
+            Set<SSTableReader> sstableAsSet = Collections.singleton(sstable);
             if (!hasIndexes && !new Bounds<Token>(sstable.first.token, sstable.last.token).intersects(ranges))
             {
-                cfs.replaceCompactedSSTables(Arrays.asList(sstable), Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
+                cfs.replaceCompactedSSTables(sstableAsSet, Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
                 continue;
             }
             if (!needsCleanup(sstable, ranges))
@@ -550,18 +551,18 @@ public class CompactionManager implements CompactionManagerMBean
                 continue;
             }
-            CompactionController controller = new CompactionController(cfs, Collections.singleton(sstable), getDefaultGcBefore(cfs));
+            CompactionController controller = new CompactionController(cfs, sstableAsSet, getDefaultGcBefore(cfs));
             long start = System.nanoTime();
             long totalkeysWritten = 0;
             int expectedBloomFilterSize = Math.max(cfs.metadata.getIndexInterval(),
-                                                   (int) (SSTableReader.getApproximateKeyCount(Arrays.asList(sstable), cfs.metadata)));
+                                                   (int) (SSTableReader.getApproximateKeyCount(sstableAsSet, cfs.metadata)));
             if (logger.isDebugEnabled())
                 logger.debug("Expected bloom filter size : " + expectedBloomFilterSize);
             logger.info("Cleaning up " + sstable);
-            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstables, OperationType.CLEANUP));
+            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstableAsSet, OperationType.CLEANUP));
             if (compactionFileLocation == null)
                 throw new IOException("disk full");
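The patch above swaps per-call Arrays.asList(sstable) allocations (and one accidental use of the whole sstables collection) for a single reusable Collections.singleton(sstable). A minimal standalone sketch of the two JDK calls being exchanged; the class and string "sstable" names here are illustrative only:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class SingletonDemo
{
    // Collections.singleton returns an immutable one-element Set that can be
    // created once and passed to every method needing "just this sstable".
    static Set<String> asSet(String sstable)
    {
        return Collections.singleton(sstable);
    }

    // Arrays.asList allocates a fresh fixed-size List wrapper on every call.
    static List<String> asList(String sstable)
    {
        return Arrays.asList(sstable);
    }

    public static void main(String[] args)
    {
        Set<String> one = asSet("sstable-1");
        System.out.println(one.size() + " " + one.contains("sstable-1")); // 1 true
    }
}
```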
Git Push Summary
Repository: cassandra Updated Tags: refs/tags/2.1.3-tentative [deleted] 7cc1cf000
[3/6] cassandra git commit: AssertionError: Memory was freed when running cleanup
AssertionError: Memory was freed when running cleanup Patch by Robert Stupp; Reviewed by Benedict Elliott Smith for CASSANDRA-8716 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f10cbd8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f10cbd8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f10cbd8 Branch: refs/heads/trunk Commit: 7f10cbd8ad7b4ea1f7471084eeb935fe911845bb Parents: 93769b3 Author: Robert Stupp sn...@snazy.de Authored: Tue Feb 17 17:25:53 2015 +0100 Committer: Robert Stupp sn...@snazy.de Committed: Tue Feb 17 17:25:53 2015 +0100
--
 CHANGES.txt                                               | 1 +
 .../apache/cassandra/db/compaction/CompactionManager.java | 9 +
 2 files changed, 6 insertions(+), 4 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2052f70..24f70a3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.13:
+ * AssertionError: Memory was freed when running cleanup (CASSANDRA-8716)
  * Make it possible to set max_sstable_age to fractional days (CASSANDRA-8406)
  * Fix memory leak in SSTableSimple*Writer and SSTableReader.validate() (CASSANDRA-8748)
  * Fix some multi-column relations with indexes on some clustering
http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f10cbd8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 62599e3..0978ae6 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -539,9 +539,10 @@ public class CompactionManager implements CompactionManagerMBean
         for (SSTableReader sstable : sstables)
         {
+            Set<SSTableReader> sstableAsSet = Collections.singleton(sstable);
             if (!hasIndexes && !new Bounds<Token>(sstable.first.token, sstable.last.token).intersects(ranges))
             {
-                cfs.replaceCompactedSSTables(Arrays.asList(sstable), Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
+                cfs.replaceCompactedSSTables(sstableAsSet, Collections.<SSTableReader>emptyList(), OperationType.CLEANUP);
                 continue;
             }
             if (!needsCleanup(sstable, ranges))
@@ -550,18 +551,18 @@ public class CompactionManager implements CompactionManagerMBean
                 continue;
             }
-            CompactionController controller = new CompactionController(cfs, Collections.singleton(sstable), getDefaultGcBefore(cfs));
+            CompactionController controller = new CompactionController(cfs, sstableAsSet, getDefaultGcBefore(cfs));
             long start = System.nanoTime();
             long totalkeysWritten = 0;
             int expectedBloomFilterSize = Math.max(cfs.metadata.getIndexInterval(),
-                                                   (int) (SSTableReader.getApproximateKeyCount(Arrays.asList(sstable), cfs.metadata)));
+                                                   (int) (SSTableReader.getApproximateKeyCount(sstableAsSet, cfs.metadata)));
             if (logger.isDebugEnabled())
                 logger.debug("Expected bloom filter size : " + expectedBloomFilterSize);
             logger.info("Cleaning up " + sstable);
-            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstables, OperationType.CLEANUP));
+            File compactionFileLocation = cfs.directories.getWriteableLocationAsFile(cfs.getExpectedCompactedFileSize(sstableAsSet, OperationType.CLEANUP));
             if (compactionFileLocation == null)
                 throw new IOException("disk full");
[4/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cd9da447 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cd9da447 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cd9da447 Branch: refs/heads/trunk Commit: cd9da447b644744ec344b9d763574e49da29b470 Parents: 9ad7978 7f10cbd Author: Robert Stupp sn...@snazy.de Authored: Tue Feb 17 17:27:46 2015 +0100 Committer: Robert Stupp sn...@snazy.de Committed: Tue Feb 17 17:27:46 2015 +0100 -- --
[jira] [Updated] (CASSANDRA-8789) Revisit how OutboundTcpConnection pools two connections for different message types
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-8789: -- Attachment: 8789.diff Instead of changing the socket used based on message type, use the message size. The payload size is already being calculated as part of serialization, so when calculating the size to use for picking a socket, memoize the value. Revisit how OutboundTcpConnection pools two connections for different message types --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used unidirectionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better, it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connection as write requests. 
The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
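The attached patch is described as memoizing the serialized payload size and using it, rather than the message type, to pick a socket. A rough standalone sketch of that selection rule follows; the 64 KiB threshold and all class/method names here are illustrative assumptions, not the patch's actual values:

```java
public class ConnectionRouter
{
    // Hypothetical cutoff: messages at or below this many bytes go to the
    // "small" socket, larger ones to the "large" socket.
    static final int SMALL_MESSAGE_THRESHOLD = 1 << 16; // 64 KiB, an assumption

    enum Socket { SMALL, LARGE }

    // Route by serialized size, not by message type, so small acks are never
    // stuck behind a multi-megabyte mutation on the same connection.
    static Socket pick(long serializedSize)
    {
        return serializedSize <= SMALL_MESSAGE_THRESHOLD ? Socket.SMALL : Socket.LARGE;
    }

    public static void main(String[] args)
    {
        System.out.println(pick(512));        // SMALL (e.g. a write ack)
        System.out.println(pick(10_000_000)); // LARGE (e.g. a big mutation)
    }
}
```

Because the payload size is already computed during serialization, memoizing it makes this routing decision essentially free.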
[jira] [Commented] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324399#comment-14324399 ] Robert Stupp commented on CASSANDRA-8716: - Committed as 7f10cbd8ad7b4ea1f7471084eeb935fe911845bb with the less uglier name :) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Assignee: Robert Stupp Priority: Minor Fix For: 2.0.13 Attachments: 8716.txt, system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at 
org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79) at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at
[jira] [Commented] (CASSANDRA-8494) incremental bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324434#comment-14324434 ] Yuki Morishita commented on CASSANDRA-8494: --- Now I'm at the challenging part. Bootstrapping each token as it becomes ready can change replica placement or replica ranges in the middle of the bootstrapping process. We may succeed in bootstrapping a few vnodes, but if we have many, streaming is likely to fail. Or even if we succeed, we may have over-streamed (out of range) data. So, it may be easier to go Jake's route: proxy reads while bootstrapping. It is simpler for the bootstrapping node to decide what to do with a read request based on its bootstrapping progress, rather than coordinating (through gossip) which tokens are up/down among nodes in the cluster. To do that, I guess we need to have: * Bootstrapping node - Announce bootstrapping tokens as usual. Receive data while keeping track of the completed ranges (this can also be used for resume). When a read request comes in and it is in a completed range, just serve the data. Otherwise forward the request to a current replica to answer instead. * Existing node - When receiving a read request that is in the pending range, do the extra work of preparing to receive the proxied response. incremental bootstrap - Key: CASSANDRA-8494 URL: https://issues.apache.org/jira/browse/CASSANDRA-8494 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jon Haddad Assignee: Yuki Morishita Priority: Minor Labels: density Fix For: 3.0 Current bootstrapping involves (to my knowledge) picking tokens and streaming data before the node is available for requests. This can be problematic with fat nodes, since it may require 20TB of data to be streamed over before the machine can be useful. This can result in a massive window of time before the machine can do anything useful. 
As a potential approach to mitigate the huge window of time before a node is available, I suggest modifying the bootstrap process to only acquire a single initial token before being marked UP. This would likely be a configuration parameter incremental_bootstrap or something similar. After the node is bootstrapped with this one token, it could go into UP state, and could then acquire additional tokens (one or a handful at a time), which would be streamed over while the node is active and serving requests. The benefit here is that with the default 256 tokens a node could become an active part of the cluster with less than 1% of its final data streamed over. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
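The proxy-read idea in the comment above, where the bootstrapping node serves a read only if the token falls in an already-streamed range and otherwise forwards it to a current replica, could be decided roughly as follows. This is a hypothetical simplification (tokens as plain longs, inclusive ranges, all names invented), not Cassandra's actual token machinery:

```java
import java.util.ArrayList;
import java.util.List;

public class BootstrapReadRouter
{
    // An inclusive [left, right] token range the node has finished streaming.
    static class Range
    {
        final long left, right;
        Range(long left, long right) { this.left = left; this.right = right; }
        boolean contains(long token) { return token >= left && token <= right; }
    }

    private final List<Range> completed = new ArrayList<>();

    // Called as each range finishes streaming; doubles as resume bookkeeping.
    void markCompleted(long left, long right)
    {
        completed.add(new Range(left, right));
    }

    // Serve locally only for tokens in an already-streamed range; anything
    // else would be forwarded to a current replica to answer instead.
    boolean canServeLocally(long token)
    {
        for (Range r : completed)
            if (r.contains(token))
                return true;
        return false;
    }
}
```

The appeal of this shape is exactly what the comment notes: the decision is purely local to the bootstrapping node's own progress, with no gossip round-trips about which tokens are up or down.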
[jira] [Commented] (CASSANDRA-8757) IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation
[ https://issues.apache.org/jira/browse/CASSANDRA-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324492#comment-14324492 ] Benedict commented on CASSANDRA-8757: - Just to explain why I consider this a priority for 2.1, if you have users with very large STCS compactions, we can have some fairly pathological behaviour. Let's say our target file is 500Gb, and 20% of the data is the partition key. This means the summary will be approximately 800Mb, assuming defaults. If we re-open the result every 50Mb (default behaviour) we will allocate a total of 4Tb of memory for summaries over the duration of the compaction. Not all of this will be used at once; ideally, in fact, we would only ever have maybe 1.6Gb allocated. But there is no guarantee, and longer running operations like compactions could retain copies of multiple different instances indefinitely, so we could see several Gb of summary floating around in this pathological case. If there is a reticence to introduce this into 2.1, another option might be to either disable early reopening entirely for very large files, or to open far less frequently, say at even intervals of sqrt(N) where N is the expected end size, or at logarithmically further apart intervals. But the advantage of reopening vanishes if we do this, so we may as well just not do it for such files without this patch. IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation --- Key: CASSANDRA-8757 URL: https://issues.apache.org/jira/browse/CASSANDRA-8757 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
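The 4Tb figure in the comment above follows from simple arithmetic: a 500Gb compaction reopened every 50Mb yields about 10,000 reopenings, and if the summary grows roughly linearly toward its final 800Mb, the average copy is about 400Mb. A small sketch of that estimate; the linear-growth model and all names here are assumptions for illustration, not Cassandra code:

```java
public class SummaryAllocationMath
{
    // Rough model of the pathological case: a compaction producing targetBytes,
    // reopened every reopenIntervalBytes, with the index summary growing
    // linearly from ~0 up to finalSummaryBytes over the compaction.
    static double totalSummaryAllocationBytes(double targetBytes,
                                              double reopenIntervalBytes,
                                              double finalSummaryBytes)
    {
        double reopenings = targetBytes / reopenIntervalBytes;  // ~10,000 here
        double avgSummary = finalSummaryBytes / 2;              // linear growth
        return reopenings * avgSummary;
    }

    public static void main(String[] args)
    {
        double gb = 1e9;
        double total = totalSummaryAllocationBytes(500 * gb, 50e6, 0.8 * gb);
        System.out.printf("~%.1f TB allocated over the compaction%n", total / 1e12);
        // ~4.0 TB, matching the figure in the comment
    }
}
```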
[jira] [Commented] (CASSANDRA-8792) Improve Memory assertions
[ https://issues.apache.org/jira/browse/CASSANDRA-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324501#comment-14324501 ] Benedict commented on CASSANDRA-8792: - Whichever way that decision goes, it would also be great to get the improved checkBounds() assertions in. Should I split that into a separate patch, or can we reach a quick decision about both of these small changes? Improve Memory assertions - Key: CASSANDRA-8792 URL: https://issues.apache.org/jira/browse/CASSANDRA-8792 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.4 Null pointers are valid return values when an allocation of size zero is requested. We assume a null pointer implies resource mismanagement in a few places. We also don't properly check the bounds of all of our accesses; this patch attempts to tidy up both of these things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
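A toy illustration of the kind of assertion tightening the ticket discusses: a null (zero) pointer is legal only for a zero-size allocation, and every access must fall within bounds. This mirrors the idea only; the class and messages are invented, not Cassandra's actual Memory implementation:

```java
public class MemoryBounds
{
    // 0 stands in for a null native pointer; it is legal only when size == 0,
    // so a null peer on a non-empty region really does mean mismanagement.
    final long peer;
    final long size;

    MemoryBounds(long peer, long size)
    {
        assert peer != 0 || size == 0 : "null pointer with non-zero size";
        this.peer = peer;
        this.size = size;
    }

    // Check an access of [start, end): the region must not have been freed,
    // and the requested bytes must lie entirely inside the allocation.
    void checkBounds(long start, long end)
    {
        assert peer != 0 || size == 0 : "Memory was freed";
        assert start >= 0 && end <= size && start <= end
             : "Illegal bounds [" + start + ".." + end + "); size: " + size;
    }
}
```

Run with -ea, a use-after-free or out-of-range read trips one of the assertions instead of silently reading garbage.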
[jira] [Commented] (CASSANDRA-8767) Added column does not sort as the last column when using new python driver
[ https://issues.apache.org/jira/browse/CASSANDRA-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324503#comment-14324503 ] Tyler Hobbs commented on CASSANDRA-8767: Thanks, that provides some insight. It's adding the cell (int-value, json) after (int-value, type) with the same value for int-value. The comparator for this schema includes a UTF8Type component at the end for the column name (and it's always in normal sorting order), so json should come before type. I was able to reproduce this with a multi-node cluster, so this is happening somewhere that the single-node path doesn't trigger. Added column does not sort as the last column when using new python driver Key: CASSANDRA-8767 URL: https://issues.apache.org/jira/browse/CASSANDRA-8767 Project: Cassandra Issue Type: Bug Components: Core, Drivers (now out of tree) Environment: Cassandra 2.0.10, python-driver 2.1.3 Reporter: Russ Garrett Assignee: Tyler Hobbs Fix For: 2.0.13 Attachments: 8767-debug-logging.txt, describe-table.txt, exception-with-logging.txt, exception.txt We've just upgraded one of our python apps from using the old cql library to the new python-driver. When running one particular query, it produces the attached assertion error in Cassandra. The query is: bq. SELECT buffer, id, type, json FROM events WHERE buffer = %(bid)s AND idkey = %(idkey)s ORDER BY id ASC Where buffer and idkey are integer primary keys, and id is the clustering key (ordered asc). This query, with identical parameters, does not cause this error using the old cql python library, or with the cqlsh client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
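The expected ordering described in the comment is easy to check in isolation: with a UTF8Type column-name component compared lexicographically, a cell named json must sort before one named type for the same clustering value. A one-line demonstration using plain lexicographic string comparison as a stand-in for the UTF8 comparator:

```java
public class CellNameOrder
{
    public static void main(String[] args)
    {
        // Lexicographically, "json" < "type", so the (int-value, json) cell
        // should precede (int-value, type); the bug is that it arrives after.
        System.out.println("json".compareTo("type") < 0); // true
    }
}
```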
[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324355#comment-14324355 ] Cristian O commented on CASSANDRA-4914: --- To be clear I'm not talking about distributed processing. I'm talking about online analytical queries, particularly time series. Incidentally, I don't know how many people realize this but Cass has the same distributed storage architecture that Vertica has. It's also possible to map a columnar schema on top of sstable. Of course native support for columnar storage would be immensely better. See CASSANDRA-7447 There's a lot of opportunity in this space, just need a bit of vision. Aggregation functions in CQL Key: CASSANDRA-4914 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914 Project: Cassandra Issue Type: New Feature Reporter: Vijay Assignee: Benjamin Lerer Labels: cql, docs Fix For: 3.0 Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt The requirement is to do aggregation of data in Cassandra (Wide row of column values of int, double, float etc). With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc (for the columns within a row). Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | joe        | doe       |   10.1
   130 |      2 | joe        | doe       |    100
   130 |      1 | joe        | doe       |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130);

 sum(salary) | empid
-------------+-------
      1110.1 |   130

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[6/6] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f6879b20 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f6879b20 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f6879b20 Branch: refs/heads/trunk Commit: f6879b205880551729169d4cc506fe2ba7f7ab11 Parents: f9d4044 cd9da44 Author: Robert Stupp sn...@snazy.de Authored: Tue Feb 17 17:27:56 2015 +0100 Committer: Robert Stupp sn...@snazy.de Committed: Tue Feb 17 17:27:56 2015 +0100 -- --
[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324408#comment-14324408 ] Tyler Hobbs commented on CASSANDRA-4914: bq. I don't know the internals but it should be doable to push the aggregation function to the partitions without requiring the data interface to understand CQL. The problem with pushing aggregate calculation down to the replicas is that there's no conflict resolution. So the aggregation can be computed over stale or deleted data. That may be acceptable if you're reading at consistency level ONE, but then we're dealing with a limited, special case. bq. Note that all agg functions are eminently parallelizable I don't believe this is true. Off the top of my head, computing the median of a dataset is not really parallelizable (without some sort of internode communication). bq. dealing with consistency is tricky but then Cassandra is by design eventually consistent so why not have eventually consistent aggregations. Just pick a partition and aggregate on that. With large datasets an average differing at the sixth decimal won't really matter. That may be acceptable for aggregates like average, but other aggregates may require precision. With all of that said, I wouldn't necessarily be opposed to supporting selecting a sampling of data from a table (and allowing an aggregate to be run over that), but I suggest opening a new ticket for that discussion. Aggregation functions in CQL Key: CASSANDRA-4914 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914 Project: Cassandra Issue Type: New Feature Reporter: Vijay Assignee: Benjamin Lerer Labels: cql, docs Fix For: 3.0 Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt The requirement is to do aggregation of data in Cassandra (Wide row of column values of int, double, float etc). 
With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc (for the columns within a row). Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | joe        | doe       |   10.1
   130 |      2 | joe        | doe       |    100
   130 |      1 | joe        | doe       |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130);

 sum(salary) | empid
-------------+-------
      1110.1 |   130

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
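Tyler's point about parallelizability can be seen by contrasting sum with median: sum is associative and commutative, so per-replica partial sums combine cheaply at a coordinator, whereas the median of per-replica medians is not, in general, the median of the full dataset. A small sketch using the salary values from the example above; the class and method names are illustrative only:

```java
import java.util.Arrays;

public class PartialAggregates
{
    // Sum is associative and commutative, so partial sums computed
    // independently on each replica can be combined without ever shipping
    // the raw rows to the coordinator.
    static double combineSums(double... partials)
    {
        return Arrays.stream(partials).sum();
    }

    public static void main(String[] args)
    {
        // Suppose each of the three salary rows lived on a different replica:
        double total = combineSums(10.1, 100.0, 1e3);
        System.out.println(total); // approximately 1110.1, as in sum(salary)
        // No analogous trick exists for median: combining per-replica medians
        // generally requires extra internode communication to get right.
    }
}
```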
[jira] [Commented] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324381#comment-14324381 ] Benedict commented on CASSANDRA-8716: - +1, with one nit: sstableColl looks ugly, and we already have one slightly less ugly way of representing this, which is sstableAsSet. I'd prefer not to proliferate too many ways of being ugly :) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup -- Key: CASSANDRA-8716 URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 Project: Cassandra Issue Type: Bug Components: Core Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 Reporter: Imri Zvik Assignee: Robert Stupp Priority: Minor Fix For: 2.0.13 Attachments: 8716.txt, system.log.gz {code}Error occurred during cleanup java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
[jira] [Commented] (CASSANDRA-8366) Repair grows data on nodes, causes load to become unbalanced
[ https://issues.apache.org/jira/browse/CASSANDRA-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324190#comment-14324190 ] Marcus Eriksson commented on CASSANDRA-8366: I tried it once more, with autocompaction disabled to remove a bit of randomness (incremental repairs are sensitive to compactions, as that will make an actually repaired sstable not be anticompacted since it has been compacted away).

after 1 run with incremental repair:
{code}
$ du -sch /home/marcuse/.ccm/8366/node?/data/r1/
1,5G    /home/marcuse/.ccm/8366/node1/data/r1/
1,5G    /home/marcuse/.ccm/8366/node2/data/r1/
1,5G    /home/marcuse/.ccm/8366/node3/data/r1/
4,4G    total
{code}
all sstables were marked as repaired, and after 1 run with standard repair:
{code}
$ du -sch /home/marcuse/.ccm/8366/node?/data/r1/
1,5G    /home/marcuse/.ccm/8366/node1/data/r1/
1,5G    /home/marcuse/.ccm/8366/node2/data/r1/
1,5G    /home/marcuse/.ccm/8366/node3/data/r1/
4,4G    total
{code}
but, after an incremental repair with compactions enabled:
{code}
$ du -sch /home/marcuse/.ccm/8366/node?/data/r1/
2,3G    /home/marcuse/.ccm/8366/node1/data/r1/
2,8G    /home/marcuse/.ccm/8366/node2/data/r1/
2,0G    /home/marcuse/.ccm/8366/node3/data/r1/
6,9G    total
{code}
And the reason is that we validate the wrong sstables:
1. we send out a prepare message to all nodes, the nodes select which sstables to repair
2. time passes, sstables get compacted (basically randomly)
3. we start validating the sstables out of the ones we picked in (1) *that still exist*. This set will differ between nodes.
4. overstream, pain

Bug. 
Stand by for patch.

Repair grows data on nodes, causes load to become unbalanced
Key: CASSANDRA-8366
URL: https://issues.apache.org/jira/browse/CASSANDRA-8366
Project: Cassandra
Issue Type: Bug
Environment: 4 node cluster, Cassandra 2.1.2; inserts and reads are done with the CQL driver
Reporter: Jan Karlsson
Assignee: Marcus Eriksson
Attachments: results-1000-inc-repairs.txt, results-1750_inc_repair.txt, results-500_1_inc_repairs.txt, results-500_2_inc_repairs.txt, results-500_full_repair_then_inc_repairs.txt, results-500_inc_repairs_not_parallel.txt, run1_with_compact_before_repair.log, run2_no_compact_before_repair.log, run3_no_compact_before_repair.log, test.sh, testv2.sh

There seems to be something weird going on when repairing data. I have a program that runs for 2 hours, inserting 250 random numbers and reading 250 times per second. It creates 2 keyspaces with SimpleStrategy and an RF of 3. I use size-tiered compaction for my cluster. After those 2 hours I run a repair and the load of all nodes goes up. If I run incremental repair the load goes up a lot more. I saw the load shoot up to 8 times the original size multiple times with incremental repair (from 2G to 16G). With nodes 9, 8, 7 and 6 the repro procedure looked like this (note that running full repair first is not a requirement to reproduce):
{noformat}
After 2 hours of 250 reads + 250 writes per second:
UN  9  583.39 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  584.01 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  583.72 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  583.84 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

Repair -pr -par on all nodes sequentially:
UN  9  746.29 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  751.02 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  748.89 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  758.34 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

repair -inc -par on all nodes sequentially:
UN  9  2.41 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.53 GB    256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.6 GB     256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  2.17 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

after rolling restart:
UN  9  1.47 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  1.5 GB     256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.46 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.19 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

compact all nodes sequentially:
UN  9  989.99 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  994.75 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  1.46 GB    256  ?
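The four-step failure sequence Marcus describes can be illustrated with a toy simulation (hypothetical class and sstable names, not Cassandra code): each node snapshots the same candidate set at prepare time, compaction then removes different members on different nodes, and validating only the survivors yields sets that no longer match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RepairSelectionSim {
    // Step 1: prepare -- every replica snapshots the same candidate set.
    public static Set<String> prepare() {
        return new HashSet<>(Arrays.asList("sst-1", "sst-2", "sst-3", "sst-4"));
    }

    // Step 3: validate only those prepared sstables that still exist locally.
    public static Set<String> validated(Set<String> prepared, Set<String> live) {
        Set<String> v = new HashSet<>(prepared);
        v.retainAll(live);
        return v;
    }

    // Step 2 happens in between: compaction independently removes different
    // sstables on each node, so the validated sets end up differing.
    public static boolean validationSetsDiffer() {
        Set<String> prepared = prepare();
        // node A compacted sst-2 away, node B compacted sst-4 away
        Set<String> liveA = new HashSet<>(Arrays.asList("sst-1", "sst-3", "sst-4"));
        Set<String> liveB = new HashSet<>(Arrays.asList("sst-1", "sst-2", "sst-3"));
        return !validated(prepared, liveA).equals(validated(prepared, liveB));
    }

    public static void main(String[] args) {
        System.out.println("validation sets differ: " + validationSetsDiffer());
    }
}
```

Because each node validates a different subset, the merkle trees disagree even for data that is actually in sync, which is the overstreaming of step 4.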
[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324266#comment-14324266 ]
Anton Slutsky commented on CASSANDRA-4914:
------------------------------------------
Robert Stupp, you are absolutely right! Somehow, I forgot we are hashing! This is cool. I'll look into the code more carefully, but if the hash buckets are narrow enough, maybe true random sampling can be assumed. If that's the case, then this is not that difficult to do. Also, please don't add distributed calculation facilities to Cassandra! :) It will turn it into a poor man's Hadoop, and either you guys will have to support a real distributed processing system or everyone will be pissed off that Map/Reduce on Cassandra doesn't work like it does on Hadoop :-)

Aggregation functions in CQL
----------------------------
Key: CASSANDRA-4914
URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
Project: Cassandra
Issue Type: New Feature
Reporter: Vijay
Assignee: Benjamin Lerer
Labels: cql, docs
Fix For: 3.0
Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt

The requirement is to do aggregation of data in Cassandra (wide rows of column values of int, double, float, etc.) with some basic aggregate functions like AVG, SUM, MEAN, MIN, MAX, etc. (for the columns within a row). Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | joe        | doe       |   10.1
   130 |      2 | joe        | doe       |    100
   130 |      1 | joe        | doe       |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130);

 sum(salary) | empid
-------------+-------
      1110.1 |   130

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
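For concreteness, the SUM example above amounts to folding an aggregate state over the partition's rows. A plain-Java sketch over the three salary values for empid 130 (illustrative only; this is not Cassandra's aggregate machinery):

```java
// Sketch of what SELECT sum(salary) computes over the example partition:
// an accumulator folded row by row. Hypothetical helper, not Cassandra code.
public class SalarySum {
    public static double sum(double[] salaries) {
        double acc = 0.0;                  // aggregate state
        for (double s : salaries) {
            acc += s;                      // fold in one row's salary
        }
        return acc;
    }

    public static void main(String[] args) {
        // rows for empid 130 (deptid 3, 2, 1): 10.1, 100, 1e+03
        System.out.println(sum(new double[] {10.1, 100.0, 1000.0}));
    }
}
```

The same fold shape generalizes to AVG (sum plus count), MIN, and MAX by changing the accumulator and the combining step.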
[jira] [Commented] (CASSANDRA-7287) Pig CqlStorage test fails with IAE
[ https://issues.apache.org/jira/browse/CASSANDRA-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324200#comment-14324200 ]
Chhavi Gangwal commented on CASSANDRA-7287:
-------------------------------------------
The same issue persists for MapType as well; MapType objects retrieved are empty even for version 2.1.0-rc4.

Pig CqlStorage test fails with IAE
----------------------------------
Key: CASSANDRA-7287
URL: https://issues.apache.org/jira/browse/CASSANDRA-7287
Project: Cassandra
Issue Type: Bug
Components: Hadoop, Tests
Reporter: Brandon Williams
Assignee: Sylvain Lebresne
Fix For: 2.1 rc1
Attachments: 7287.txt

{noformat}
[junit] java.lang.IllegalArgumentException
[junit]     at java.nio.Buffer.limit(Buffer.java:267)
[junit]     at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:542)
[junit]     at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:117)
[junit]     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:97)
[junit]     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28)
[junit]     at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48)
[junit]     at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66)
[junit]     at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:792)
[junit]     at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
[junit]     at org.apache.cassandra.hadoop.pig.CqlStorage.getNext(CqlStorage.java:118)
{noformat}

I'm guessing this is caused by CqlStorage passing an empty BB to BBU, but I don't know if it's pig that's broken or if it's a deeper issue.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
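The IllegalArgumentException at the top of that trace is reproducible in isolation: Buffer.limit(n) throws whenever n exceeds the buffer's capacity, which is consistent with the guess that an empty ByteBuffer is being handed to a routine that tries to set a positive limit. A minimal sketch (hypothetical demo class, not project code):

```java
import java.nio.ByteBuffer;

// Minimal reproduction of the failure mode at the top of the stack trace:
// setting a limit larger than a buffer's capacity throws
// IllegalArgumentException, so an empty buffer fails for any positive limit.
public class EmptyBufferDemo {
    public static boolean limitThrows(ByteBuffer bb, int newLimit) {
        try {
            bb.limit(newLimit);           // IAE if newLimit > capacity or < 0
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // An empty buffer cannot accept any positive limit.
        System.out.println(limitThrows(ByteBuffer.allocate(0), 267));
    }
}
```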
[jira] [Comment Edited] (CASSANDRA-7287) Pig CqlStorage test fails with IAE
[ https://issues.apache.org/jira/browse/CASSANDRA-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324200#comment-14324200 ]
Chhavi Gangwal edited comment on CASSANDRA-7287 at 2/17/15 2:25 PM:
--------------------------------------------------------------------
The same issue persists for MapType as well.

was (Author: chhavigangwal): Same issue persist for MapType as well, MapType objects retrieved are empty even for version 2.1.0-rc4

Pig CqlStorage test fails with IAE
----------------------------------
Key: CASSANDRA-7287
URL: https://issues.apache.org/jira/browse/CASSANDRA-7287
Project: Cassandra
Issue Type: Bug
Components: Hadoop, Tests
Reporter: Brandon Williams
Assignee: Sylvain Lebresne
Fix For: 2.1 rc1
Attachments: 7287.txt

{noformat}
[junit] java.lang.IllegalArgumentException
[junit]     at java.nio.Buffer.limit(Buffer.java:267)
[junit]     at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:542)
[junit]     at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:117)
[junit]     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:97)
[junit]     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28)
[junit]     at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48)
[junit]     at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66)
[junit]     at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:792)
[junit]     at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
[junit]     at org.apache.cassandra.hadoop.pig.CqlStorage.getNext(CqlStorage.java:118)
{noformat}

I'm guessing this is caused by CqlStorage passing an empty BB to BBU, but I don't know if it's pig that's broken or if it's a deeper issue.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8366) Repair grows data on nodes, causes load to become unbalanced
[ https://issues.apache.org/jira/browse/CASSANDRA-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcus Eriksson updated CASSANDRA-8366:
---------------------------------------
Attachment: 0001-8366.patch

Attaching a patch that picks the sstables to compact as late as possible; this is actually a semi-backport of CASSANDRA-7586. We will still have a slightly bigger live size on the nodes after one of these repairs, as some sstables will not get anticompacted due to being compacted away (we could probably improve this as well, but in another ticket), but it is much better:
{code}
$ du -sch /home/marcuse/.ccm/8366/node?/data/r1/
1,8G    /home/marcuse/.ccm/8366/node1/data/r1/
1,8G    /home/marcuse/.ccm/8366/node2/data/r1/
1,8G    /home/marcuse/.ccm/8366/node3/data/r1/
5,2G    total
{code}

Repair grows data on nodes, causes load to become unbalanced
Key: CASSANDRA-8366
URL: https://issues.apache.org/jira/browse/CASSANDRA-8366
Project: Cassandra
Issue Type: Bug
Environment: 4 node cluster, Cassandra 2.1.2; inserts and reads are done with the CQL driver
Reporter: Jan Karlsson
Assignee: Marcus Eriksson
Attachments: 0001-8366.patch, results-1000-inc-repairs.txt, results-1750_inc_repair.txt, results-500_1_inc_repairs.txt, results-500_2_inc_repairs.txt, results-500_full_repair_then_inc_repairs.txt, results-500_inc_repairs_not_parallel.txt, run1_with_compact_before_repair.log, run2_no_compact_before_repair.log, run3_no_compact_before_repair.log, test.sh, testv2.sh

There seems to be something weird going on when repairing data. I have a program that runs for 2 hours, inserting 250 random numbers and reading 250 times per second. It creates 2 keyspaces with SimpleStrategy and an RF of 3. I use size-tiered compaction for my cluster. After those 2 hours I run a repair and the load of all nodes goes up. If I run incremental repair the load goes up a lot more. I saw the load shoot up to 8 times the original size multiple times with incremental repair.
(from 2G to 16G). With nodes 9, 8, 7 and 6 the repro procedure looked like this (note that running full repair first is not a requirement to reproduce):
{noformat}
After 2 hours of 250 reads + 250 writes per second:
UN  9  583.39 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  584.01 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  583.72 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  583.84 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

Repair -pr -par on all nodes sequentially:
UN  9  746.29 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  751.02 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  748.89 MB  256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  758.34 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

repair -inc -par on all nodes sequentially:
UN  9  2.41 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.53 GB    256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.6 GB     256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  2.17 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

after rolling restart:
UN  9  1.47 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  1.5 GB     256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  2.46 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.19 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

compact all nodes sequentially:
UN  9  989.99 MB  256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  994.75 MB  256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  1.46 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  758.82 MB  256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

repair -inc -par on all nodes sequentially:
UN  9  1.98 GB    256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.3 GB     256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  3.71 GB    256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.68 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1

restart once more:
UN  9  2 GB       256  ?  28220962-26ae-4eeb-8027-99f96e377406  rack1
UN  8  2.05 GB    256  ?  f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
UN  7  4.1 GB     256  ?  2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
UN  6  1.68 GB    256  ?  b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1
{noformat}
Is there something I'm missing, or is this strange behavior?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324207#comment-14324207 ]
Philip Thompson commented on CASSANDRA-8716:
--------------------------------------------
[~snazy], the attached 8716.txt patch corrects the issue on 2.0 head for me.

java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
--
Key: CASSANDRA-8716
URL: https://issues.apache.org/jira/browse/CASSANDRA-8716
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67
Reporter: Imri Zvik
Assignee: Robert Stupp
Priority: Minor
Fix For: 2.0.13
Attachments: 8716.txt, system.log.gz

{code}
Error occurred during cleanup
java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234)
    at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272)
    at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115)
    at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177)
    at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at sun.rmi.transport.Transport$1.run(Transport.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AssertionError: Memory was freed
    at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
    at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
    at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
    at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at
[jira] [Commented] (CASSANDRA-8716) java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324219#comment-14324219 ]
Robert Stupp commented on CASSANDRA-8716:
-----------------------------------------
Alright, just waiting for a formal review until I commit the patch.

java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed when running cleanup
--
Key: CASSANDRA-8716
URL: https://issues.apache.org/jira/browse/CASSANDRA-8716
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67
Reporter: Imri Zvik
Assignee: Robert Stupp
Priority: Minor
Fix For: 2.0.13
Attachments: 8716.txt, system.log.gz

{code}
Error occurred during cleanup
java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234)
    at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272)
    at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115)
    at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177)
    at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at sun.rmi.transport.Transport$1.run(Transport.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AssertionError: Memory was freed
    at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259)
    at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211)
    at org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
    at org.apache.cassandra.io.sstable.IndexSummary.getKey(IndexSummary.java:84) at
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324775#comment-14324775 ]
Amichai Rothman commented on CASSANDRA-8812:
--------------------------------------------
Sure, I just ran it a few more times (with cassandra-all 2.1.2, in case any of the line numbers changed) and got several of these:
{noformat}
2015-02-17 11:35:13,470 | PERIODIC-COMMIT-LOG-SYNCER | o.a.c.d.c.CommitLog | ERROR | CommitLog.java:367 | Failed to persist commits to disk. Commit disk failure policy is stop; terminating thread
org.apache.cassandra.io.FSWriteError: java.io.IOException: The handle is invalid
    at org.apache.cassandra.db.commitlog.CommitLogSegment.sync(CommitLogSegment.java:329) ~[cassandra-all-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.commitlog.CommitLog.sync(CommitLog.java:195) ~[cassandra-all-2.1.2.jar:2.1.2]
    at org.apache.cassandra.db.commitlog.AbstractCommitLogService$1.run(AbstractCommitLogService.java:81) ~[cassandra-all-2.1.2.jar:2.1.2]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
Caused by: java.io.IOException: The handle is invalid
    at java.nio.MappedByteBuffer.force0(Native Method) ~[na:1.8.0_31]
    at java.nio.MappedByteBuffer.force(MappedByteBuffer.java:203) ~[na:1.8.0_31]
    at org.apache.cassandra.db.commitlog.CommitLogSegment.sync(CommitLogSegment.java:315) ~[cassandra-all-2.1.2.jar:2.1.2]
    ... 3 common frames omitted
{noformat}
However, in the past I also got exceptions with an exactly identical stack trace, except with a different IOException message: "Attempt to access invalid address" instead of "The handle is invalid".

JVM Crashes on Windows x86
--------------------------
Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86 (32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Joshua McKenzie
Attachments: crashtest.tgz

Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION.
This happens inconsistently. The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. It also contains a sample hotspot error log. Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur when the PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid. I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference. At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
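The race described above boils down to sync() touching a MappedByteBuffer after the segment has been invalidated. One way to guard against it, sketched with a hypothetical class (not Cassandra's CommitLogSegment or its actual fix), is to serialize sync against close and check a closed flag before touching the mapping:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a guard for the use-after-invalidation race: sync() must never
// reach the mapped buffer once close() has run, so both operations take the
// same lock and sync() checks the closed flag first. Hypothetical class.
public class GuardedSegment {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed = false;

    // Returns true if the (simulated) flush ran, false if the segment was
    // already closed and the flush was safely skipped.
    public boolean sync() {
        lock.lock();
        try {
            if (closed) return false;  // mapping gone; skip buffer.force()
            // buffer.force() would run here, with the mapping guaranteed live
            return true;
        } finally {
            lock.unlock();
        }
    }

    public void close() {
        lock.lock();
        try {
            closed = true;             // after this, sync() becomes a no-op
            // the buffer would be unmapped here
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        GuardedSegment seg = new GuardedSegment();
        System.out.println(seg.sync());  // segment live: flush runs
        seg.close();
        System.out.println(seg.sync());  // segment closed: flush skipped
    }
}
```

The lock makes the closed check and the buffer access atomic with respect to close(), which is precisely the ordering the crashing code appears to lack.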
[jira] [Commented] (CASSANDRA-8792) Improve Memory assertions
[ https://issues.apache.org/jira/browse/CASSANDRA-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324820#comment-14324820 ] Benedict commented on CASSANDRA-8792: - pushed update Improve Memory assertions - Key: CASSANDRA-8792 URL: https://issues.apache.org/jira/browse/CASSANDRA-8792 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.4 Null pointers are valid returns if a size of zero is returned. We assume a null pointer implies resource mismanagement in a few places. We also don't properly check the bounds of all of our accesses; this patch attempts to tidy up both of these things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324725#comment-14324725 ] Jeremy Hanna edited comment on CASSANDRA-8119 at 2/17/15 7:21 PM: -- I think there's a case for LOCAL_QUORUM + ONE in another datacenter or LOCAL_QUORUM + ONE in every other datacenter. Essentially, it would be nice to guarantee that the data is persisted outside the local datacenter in a couple of forms. was (Author: jeromatron): I think there's a case for LOCAL_QUORUM + ONE in another datacenter or LOCAL_QUORUM + ONE in every other datacenter. More Expressive Consistency Levels -- Key: CASSANDRA-8119 URL: https://issues.apache.org/jira/browse/CASSANDRA-8119 Project: Cassandra Issue Type: New Feature Components: API Reporter: Tyler Hobbs For some multi-datacenter environments, the current set of consistency levels are too restrictive. For example, the following consistency requirements cannot be expressed: * LOCAL_QUORUM in two specific DCs * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC * LOCAL_QUORUM in the local DC plus N remote replicas in any DC I propose that we add a new consistency level: CUSTOM. In the v4 (or v5) protocol, this would be accompanied by an additional map argument. A map of {DC: CL} or a map of {DC: int} is sufficient to cover the first example. If we accept a special keys to represent any datacenter, the second case can be handled. A similar technique could be used for any other nodes. I'm not in love with the special keys, so if anybody has ideas for something more elegant, feel free to propose them. The main idea is that we want to be flexible enough to cover any reasonable consistency or durability requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
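The proposed CUSTOM level with a {DC: int} map reduces to a simple satisfaction check over per-datacenter ack counts. A sketch (hypothetical helper, not an actual driver or server API):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed CUSTOM consistency level: a {DC -> required acks}
// map is satisfied once every listed DC has acknowledged at least its
// required count. Hypothetical helper, not a Cassandra API.
public class CustomConsistency {
    public static boolean satisfied(Map<String, Integer> required,
                                    Map<String, Integer> acked) {
        for (Map.Entry<String, Integer> e : required.entrySet()) {
            if (acked.getOrDefault(e.getKey(), 0) < e.getValue()) {
                return false;  // this DC has not met its quota yet
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // e.g. LOCAL_QUORUM (2 of RF=3) in DC1 plus ONE in DC2
        Map<String, Integer> required = new HashMap<>();
        required.put("DC1", 2);
        required.put("DC2", 1);

        Map<String, Integer> acked = new HashMap<>();
        acked.put("DC1", 3);  // all local replicas answered
        System.out.println(satisfied(required, acked));  // DC2 still missing
    }
}
```

A special key standing for "any other DC" (as discussed in the ticket) would need only an extra pass that lets surplus acks from unlisted datacenters satisfy that entry.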
svn commit: r1660477 - in /cassandra/site: publish/download/index.html publish/index.html src/content/download/index.html src/layout/skeleton/_download.html src/settings.py
Author: jake Date: Tue Feb 17 19:48:31 2015 New Revision: 1660477 URL: http://svn.apache.org/r1660477 Log: 2.1.3 release plus stable vs latest flag Modified: cassandra/site/publish/download/index.html cassandra/site/publish/index.html cassandra/site/src/content/download/index.html cassandra/site/src/layout/skeleton/_download.html cassandra/site/src/settings.py Modified: cassandra/site/publish/download/index.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1660477r1=1660476r2=1660477view=diff == --- cassandra/site/publish/download/index.html (original) +++ cassandra/site/publish/download/index.html Tue Feb 17 19:48:31 2015 @@ -45,32 +45,59 @@ div class=span-24 h2 class=hdrCassandra Server/h2 - - Cassandra releases include the core server, the a href=http://wiki.apache.org/cassandra/NodeTool;nodetool/a administration command-line interface, and a development shell (a href=http://cassandra.apache.org/doc/cql/CQL.html;ttcqlsh/tt/a and the old ttcassandra-cli/tt). - p - The latest stable release of Apache Cassandra is 2.1.2 - (released on 2014-11-10). iIf you're just - starting out, download this one./i + Cassandra releases include the core server, the a href=http://wiki.apache.org/cassandra/NodeTool;nodetool/a administration command-line interface, and a development shell (a href=http://cassandra.apache.org/doc/cql/CQL.html;ttcqlsh/tt/a and the old ttcassandra-cli/tt). 
/p - Apache provides binary tarballs and Debian packages: + p + - ul -li -a class=filename - href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz; - onclick=javascript: pageTracker._trackPageview('/clicks/binary_download'); - apache-cassandra-2.1.2-bin.tar.gz -/a -[a href=http://www.apache.org/dist/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz.asc;PGP/a] -[a href=http://www.apache.org/dist/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz.md5;MD5/a] -[a href=http://www.apache.org/dist/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz.sha1;SHA1/a] -/li + bThere are currently two active releases available:/b + br/ + p + The latest release of Apache Cassandra is 2.1.3 + (released on 2014-02-17). iIf you're just + starting out and not yet in production, download this one./i +/p + + ul + li + a class=filename + href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz; + onclick=javascript: pageTracker._trackPageview('/clicks/binary_download'); +apache-cassandra-2.1.3-bin.tar.gz + /a + [a href=http://www.apache.org/dist/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc;PGP/a] + [a href=http://www.apache.org/dist/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.md5;MD5/a] + [a href=http://www.apache.org/dist/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.sha1;SHA1/a] + /li + li + a href=http://wiki.apache.org/cassandra/DebianPackaging;Debian installation instructions/a + /li + /ul + + p + The bmost stable/b release of Apache Cassandra is 2.0.12 + (released on 2015-01-20). 
iIf you are in production or planning to be soon, download this one./i + /p + + ul + li + a class=filename href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz;apache-cassandra-2.0.12-bin.tar.gz/a + [a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.asc;PGP/a] + [a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.md5;MD5/a] + [a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.sha1;SHA1/a] + /li li a href=http://wiki.apache.org/cassandra/DebianPackaging;Debian installation instructions/a /li - /ul + /ul + + + + /p + + h2 class=hdrThird Party Distributions (not endorsed by Apache)/h2 @@ -99,22 +126,6 @@ h2 class=hdrPrevious and Archived Cassandra Server Releases/h2 - p - Previous stable branches of Cassandra continue to see periodic maintenance - for some time after a new major release is made. The lastest release on the - 2.0 branch is 2.0.12 (released on - 2015-01-20). - /p - - ul -li -a class=filename href=http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz;apache-cassandra-2.0.12-bin.tar.gz/a -[a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.asc;PGP/a] -[a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.md5;MD5/a] -[a href=http://www.apache.org/dist/cassandra/2.0.12/apache-cassandra-2.0.12-bin.tar.gz.sha1;SHA1/a] -/li - /ul - p @@ -144,13 +155,13 @@
[jira] [Commented] (CASSANDRA-8792) Improve Memory assertions
[ https://issues.apache.org/jira/browse/CASSANDRA-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324790#comment-14324790 ]

Jonathan Ellis commented on CASSANDRA-8792:
-------------------------------------------

My preference would be to assert that callers are using Memory the way we expect, rather than make sure we can handle cases that shouldn't happen. Let's go with that if it works for you.

Improve Memory assertions
-------------------------

Key: CASSANDRA-8792
URL: https://issues.apache.org/jira/browse/CASSANDRA-8792
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Trivial
Fix For: 2.1.4

Null pointers are valid returns if a size of zero is requested. We assume a null pointer implies resource mismanagement in a few places. We also don't properly check the bounds of all of our accesses; this patch attempts to tidy up both of these things.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
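The invariant under discussion, that a null pointer is legitimate only for a zero-size allocation and that every access must be bounds-checked, can be sketched as follows. This is an illustrative model only, not Cassandra's actual `Memory` class; the names and the use of a `bytearray` as a stand-in for native memory are assumptions.

```python
# Illustrative sketch: a zero-size allocation may legitimately hold a
# "null" pointer, and all accesses are bounds-checked instead of
# assuming the pointer is non-null.
class Memory:
    def __init__(self, size):
        assert size >= 0
        self.size = size
        # Zero-byte allocations may return a null pointer; model as None.
        self.buf = bytearray(size) if size > 0 else None

    def check_bounds(self, offset, length):
        # Covers the zero-size case too: any access with length > 0
        # against a zero-size Memory fails here, not with a null deref.
        assert offset >= 0 and length >= 0
        assert offset + length <= self.size, "access out of bounds"

    def set_byte(self, offset, value):
        self.check_bounds(offset, 1)
        self.buf[offset] = value

    def get_byte(self, offset):
        self.check_bounds(offset, 1)
        return self.buf[offset]

m = Memory(4)
m.set_byte(3, 0xFF)
empty = Memory(0)          # valid: zero size, null "pointer"
```

The point mirrored from the ticket: `empty.buf is None` is not resource mismanagement, and the bounds assertion, not a null check, is what rejects a bad access.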
svn commit: r1660488 - in /cassandra/site: publish/download/index.html src/settings.py
Author: jake
Date: Tue Feb 17 20:16:57 2015
New Revision: 1660488

URL: http://svn.apache.org/r1660488
Log: fix date

Modified:
    cassandra/site/publish/download/index.html
    cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1660488&r1=1660487&r2=1660488&view=diff
==============================================================================
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Tue Feb 17 20:16:57 2015
@@ -56,7 +56,7 @@
     <br/>
     <p>
       The latest release of Apache Cassandra is 2.1.3
-      (released on 2014-02-17). <i>If you're just
+      (released on 2015-02-17). <i>If you're just
       starting out and not yet in production, download this one.</i>
     </p>

Modified: cassandra/site/src/settings.py
URL: http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1660488&r1=1660487&r2=1660488&view=diff
==============================================================================
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Tue Feb 17 20:16:57 2015
@@ -93,7 +93,7 @@ SITE_POST_PROCESSORS = {

 class CassandraDef(object):
     stable_version = '2.1.3'
-    stable_release_date = '2014-02-17'
+    stable_release_date = '2015-02-17'
     is_stable_prod_ready = False
     oldstable_version = '2.0.12'
     oldstable_release_date = '2015-01-20'
svn commit: r1660485 - in /cassandra/site: publish/index.html src/content/index.html
Author: jake
Date: Tue Feb 17 20:14:56 2015
New Revision: 1660485

URL: http://svn.apache.org/r1660485
Log: books

Modified:
    cassandra/site/publish/index.html
    cassandra/site/src/content/index.html

Modified: cassandra/site/publish/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1660485&r1=1660484&r2=1660485&view=diff
==============================================================================
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Tue Feb 17 20:14:56 2015
@@ -211,8 +211,7 @@
   <h3 class="hdr">Dead Trees</h3>
   <ul class="nobullets">
-    <li><a href="http://www.amazon.com/Practical-Cassandra-Developers-Approach/dp/032193394X">Practical Cassandra: A Developer's Approach</a>, by Russell Bradberry and Eric Lubow. Covers Cassandra 1.2.
-    <li><a href="http://www.packtpub.com/learning-cassandra-for-administrators/book">Learning Cassandra for Administrators</a>, by Vijay Parthasarathy. Covers Cassandra 1.2 and 2.0; <a href="http://www.amazon.com/Learning-Cassandra-Administrators-Vijay-Parthasarathy/dp/1782168176">also on Amazon</a>.
+    <li><a href="http://www.amazon.com/Cassandra-High-Availability-Robbie-Strickland/dp/1783989122">Cassandra High Availability</a>, by Robbie Strickland. Covers Cassandra 2.0</li>
   </ul>
 </div>
 </div>

Modified: cassandra/site/src/content/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/src/content/index.html?rev=1660485&r1=1660484&r2=1660485&view=diff
==============================================================================
--- cassandra/site/src/content/index.html (original)
+++ cassandra/site/src/content/index.html Tue Feb 17 20:14:56 2015
@@ -151,8 +151,7 @@
   <h3 class="hdr">Dead Trees</h3>
   <ul class="nobullets">
-    <li><a href="http://www.amazon.com/Practical-Cassandra-Developers-Approach/dp/032193394X">Practical Cassandra: A Developer's Approach</a>, by Russell Bradberry and Eric Lubow. Covers Cassandra 1.2.
-    <li><a href="http://www.packtpub.com/learning-cassandra-for-administrators/book">Learning Cassandra for Administrators</a>, by Vijay Parthasarathy. Covers Cassandra 1.2 and 2.0; <a href="http://www.amazon.com/Learning-Cassandra-Administrators-Vijay-Parthasarathy/dp/1782168176">also on Amazon</a>.
+    <li><a href="http://www.amazon.com/Cassandra-High-Availability-Robbie-Strickland/dp/1783989122">Cassandra High Availability</a>, by Robbie Strickland. Covers Cassandra 2.0</li>
   </ul>
 </div>
 </div>
[jira] [Commented] (CASSANDRA-8792) Improve Memory assertions
[ https://issues.apache.org/jira/browse/CASSANDRA-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324798#comment-14324798 ]

Benedict commented on CASSANDRA-8792:
-------------------------------------

Well, I said probably - I honestly don't know if it is reasonable for us to ever (now or in future) allocate a zero length Memory object. But sure, let's go with that.
[jira] [Commented] (CASSANDRA-8398) Expose time spent waiting in thread pool queue
[ https://issues.apache.org/jira/browse/CASSANDRA-8398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324850#comment-14324850 ]

sankalp kohli commented on CASSANDRA-8398:
------------------------------------------

This will be a very good metric to have and monitor.

Expose time spent waiting in thread pool queue
----------------------------------------------

Key: CASSANDRA-8398
URL: https://issues.apache.org/jira/browse/CASSANDRA-8398
Project: Cassandra
Issue Type: Improvement
Reporter: T Jake Luciani
Priority: Minor
Fix For: 2.1.4

We are missing an important source of latency in our system: the time waiting to be processed by thread pools. We should add a metric for this so someone can easily see how much time is spent just waiting to be processed.
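The metric proposed above amounts to stamping each task at enqueue time and measuring the gap when a worker dequeues it. A hedged sketch, not Cassandra's implementation; the names `submit`, `worker`, and `queue_wait_seconds` are illustrative:

```python
# Record how long each task sits in a thread pool's queue before a
# worker picks it up - the latency source the ticket says is missing.
import queue
import threading
import time

task_queue = queue.Queue()
queue_wait_seconds = []  # stand-in for a latency histogram/metric

def submit(fn):
    # Stamp the task with its enqueue time.
    task_queue.put((time.monotonic(), fn))

def worker():
    while True:
        enqueued_at, fn = task_queue.get()
        # Queue-wait time: dequeue time minus enqueue time.
        queue_wait_seconds.append(time.monotonic() - enqueued_at)
        if fn is None:   # sentinel: stop the worker
            return
        fn()

submit(lambda: None)
submit(None)  # sentinel
t = threading.Thread(target=worker)
t.start()
t.join()
```

In a real pool the recorded values would feed a reservoir/histogram per stage rather than a list, so operators can see queue wait separately from execution time.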
[jira] [Updated] (CASSANDRA-8734) Expose commit log archive status
[ https://issues.apache.org/jira/browse/CASSANDRA-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua McKenzie updated CASSANDRA-8734:
---------------------------------------
Fix Version/s: 2.1.4

Expose commit log archive status
--------------------------------

Key: CASSANDRA-8734
URL: https://issues.apache.org/jira/browse/CASSANDRA-8734
Project: Cassandra
Issue Type: New Feature
Components: Config
Reporter: Philip S Doctor
Assignee: Chris Lohfink
Fix For: 2.1.4
Attachments: 8734-cassandra-2.1.txt

The operational procedure to modify commit log archiving is to edit commitlog_archiving.properties and then perform a restart. However, this has troublesome edge cases:
1) It is possible for people to modify commitlog_archiving.properties but then not perform a restart.
2) It is possible for people to modify commitlog_archiving.properties only on some nodes.
3) It is possible for people to have modified the file and restarted, but then later add more nodes without the correct modifications.
For these reasons, it is operationally useful to be able to audit the commit log archive state of a node. Simply parsing commitlog_archiving.properties is insufficient due to #1. I would suggest that exposing this via either a system table or JMX would be useful.
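The audit the ticket asks for boils down to exposing the archive configuration the node is actually running alongside what is currently on disk, so an operator can spot the "edited but never restarted" case (#1 above). A hedged sketch under assumed names (`ArchiveStatus`, `stale`); the property keys match `commitlog_archiving.properties`, but the parsing and comparison logic here is illustrative, not the attached patch:

```python
# Compare the archive config loaded at startup (in-memory) with the
# current contents of commitlog_archiving.properties (on disk).
def parse_properties(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and comments
        key, _, value = line.partition('=')
        props[key.strip()] = value.strip()
    return props

class ArchiveStatus:
    def __init__(self, running_props, on_disk_text):
        self.running = running_props                   # loaded at startup
        self.on_disk = parse_properties(on_disk_text)  # file right now

    def stale(self):
        # True when the file was edited after the node last restarted.
        return self.running != self.on_disk

running = {'archive_command': '', 'restore_command': ''}
edited = "archive_command=/bin/cp %path /backup/%name\nrestore_command="
status = ArchiveStatus(running, edited)
```

Publishing `running`, `on_disk`, and `stale()` over JMX (or a system table) would cover all three edge cases, since new nodes with the wrong file would also surface a mismatched `running` config.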
[jira] [Commented] (CASSANDRA-8767) Added column does not sort as the last column when using new python driver
[ https://issues.apache.org/jira/browse/CASSANDRA-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324681#comment-14324681 ]

Tyler Hobbs commented on CASSANDRA-8767:
----------------------------------------

It looks like the root of the problem is that ColumnFamily serialization doesn't include a {{reversed}} flag. When the query is executed locally, the {{reversed}} flag is set on the CF holding the results. When it's executed remotely and then deserialized, it's not. The {{reversed}} flag is then used to alter the comparator for adding cells to the CF. This causes the comparator incorrect for the remote-execution case, leading to the assertion error.

Added column does not sort as the last column when using new python driver
--------------------------------------------------------------------------

Key: CASSANDRA-8767
URL: https://issues.apache.org/jira/browse/CASSANDRA-8767
Project: Cassandra
Issue Type: Bug
Components: Core, Drivers (now out of tree)
Environment: Cassandra 2.0.10, python-driver 2.1.3
Reporter: Russ Garrett
Assignee: Tyler Hobbs
Fix For: 2.0.13
Attachments: 8767-debug-logging.txt, describe-table.txt, exception-with-logging.txt, exception.txt

We've just upgraded one of our python apps from using the old cql library to the new python-driver. When running one particular query, it produces the attached assertion error in Cassandra. The query is:

bq. SELECT buffer, id, type, json FROM events WHERE buffer = %(bid)s AND idkey = %(idkey)s ORDER BY id ASC

Where buffer and idkey are integer primary keys, and id is the clustering key (ordered asc). This query, with identical parameters, does not cause this error using the old cql python library, or with the cqlsh client.
[jira] [Comment Edited] (CASSANDRA-8767) Added column does not sort as the last column when using new python driver
[ https://issues.apache.org/jira/browse/CASSANDRA-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324681#comment-14324681 ]

Tyler Hobbs edited comment on CASSANDRA-8767 at 2/17/15 7:03 PM:
-----------------------------------------------------------------

It looks like the root of the problem is that ColumnFamily serialization doesn't include a {{reversed}} flag. When the query is executed locally, the {{reversed}} flag is set on the CF holding the results. When it's executed remotely and then deserialized, it's not. The {{reversed}} flag is then used to alter the comparator for adding cells to the CF. This causes the comparator to be incorrect for the remote-execution case, leading to the assertion error.

was (Author: thobbs):
It looks like the root of the problem is that ColumnFamily serialization doesn't include a {{reversed}} flag. When the query is executed locally, the {{reversed}} flag is set on the CF holding the results. When it's executed remotely and then deserialized, it's not. The {{reversed}} flag is then used to alter the comparator for adding cells to the CF. This causes the comparator incorrect for the remote-execution case, leading to the assertion error.
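The bug Tyler describes, a comparator flag that is set on locally-built results but silently dropped by serialization, can be modeled in a few lines. This is a minimal illustrative model, not Cassandra's ColumnFamily code; `SortedCells` and its methods are assumed names:

```python
# Minimal model of the bug: the container's sort order depends on a
# `reversed` flag, but serialize() omits it, so a remotely-executed
# (serialize/deserialize round-trip) result sorts differently from a
# locally-executed one.
class SortedCells:
    def __init__(self, reversed_=False):
        self.reversed = reversed_
        self.cells = []

    def add(self, cell):
        # Comparator behavior is driven by the flag.
        self.cells.append(cell)
        self.cells.sort(reverse=self.reversed)

    def serialize(self):
        # BUG (mirrors the ticket): `reversed` is not included.
        return list(self.cells)

    @classmethod
    def deserialize(cls, payload):
        cf = cls()  # flag silently defaults to False
        for cell in payload:
            cf.add(cell)
        return cf

local = SortedCells(reversed_=True)
for c in (1, 3, 2):
    local.add(c)
remote = SortedCells.deserialize(local.serialize())
```

Here `local.cells` ends up descending while `remote.cells` ends up ascending from the same data, which is the local-vs-remote divergence that trips the assertion in the real code.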
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324683#comment-14324683 ]

Benedict commented on CASSANDRA-8789:
-------------------------------------

It's a nice neat patch. It might be worth commenting, on the payloadSize memoization, that we piggyback on the visibility guarantees of the queue we use to pass the message to another thread, since we do always pass it, and that once handed over we should never call payloadSize() again on the thread that has handed off ownership. When I commit I'll also clean up some legacy cruft, like some generic parameters, and normalise the operation over both connections (in one place we just list them both; in the other two we construct an array and iterate; I'd prefer to do just one). But these are unrelated to this patch.

OutboundTcpConnectionPool should route messages to sockets by size not type
---------------------------------------------------------------------------

Key: CASSANDRA-8789
URL: https://issues.apache.org/jira/browse/CASSANDRA-8789
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg
Fix For: 3.0
Attachments: 8789.diff

I was looking at this trying to understand what messages flow over which connection. For reads, the request goes out over the command connection and the response comes back over the ack connection. For writes, the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket, along with read requests. Sockets are used uni-directionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better, it would be good to know; I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking: read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements.
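The change the ticket title proposes, choosing an outbound connection by message size rather than by message type, reduces to a threshold test at enqueue time. A hedged sketch; the 64 KiB threshold and the class/field names are illustrative assumptions, not the values in the 8789.diff patch:

```python
# Route each outbound message to a "small" or "large" connection based
# on payload size, so acks and read requests are not queued behind
# large mutations (the head-of-line blocking described above).
LARGE_MESSAGE_THRESHOLD = 64 * 1024  # assumed cutoff, in bytes

class ConnectionPool:
    def __init__(self):
        # Stand-ins for the two outbound sockets to a peer.
        self.small = []
        self.large = []

    def enqueue(self, message):
        # Routing key is the serialized size, not the message type.
        if len(message) <= LARGE_MESSAGE_THRESHOLD:
            self.small.append(message)
        else:
            self.large.append(message)

pool = ConnectionPool()
pool.enqueue(b'ack')               # ack-sized: small connection
pool.enqueue(b'x' * (128 * 1024))  # bulk mutation: large connection
```

The design point is that head-of-line blocking only matters per connection, so segregating by size keeps every small message's wait bounded by other small messages, regardless of whether it is a read, an ack, or a small write.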
[jira] [Updated] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-8815:
--------------------------------
Attachment: 8815.txt

Yep, nice spot. Attached a patch that calls files.clear() at the end, as well as ensuring it reaches that spot by catching any possible exceptions during cleanup.

Race in sstable ref counting during streaming failures
------------------------------------------------------

Key: CASSANDRA-8815
URL: https://issues.apache.org/jira/browse/CASSANDRA-8815
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: sankalp kohli
Assignee: Benedict
Fix For: 2.0.13
Attachments: 8815.txt

We have seen a machine in prod whose read threads are all blocked (spinning) trying to acquire the reference lock on sstables. There are also some stream sessions doing the same. Looking at the heap dump, we could see that a live sstable which is part of the View has a ref count of 0. This sstable is also not compacting, nor part of any failed compaction. Looking through the code, we could see that if the ref goes to zero while the sstable is part of the View, all reader threads will spin forever. Looking further through the streaming code, we could see that if StreamTransferTask.complete is called after closeSession has been called due to an error in OutgoingMessageHandler, it will double-decrement the ref count of an sstable. This race can happen, and we can see from exceptions in the logs that closeSession was triggered by OutgoingMessageHandler. The fix is simple, I think: in StreamTransferTask.abort, we can remove a file from "files" before decrementing the ref count. This will avoid the race.
[jira] [Commented] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324637#comment-14324637 ]

sankalp kohli commented on CASSANDRA-8815:
------------------------------------------

+1 Looks good.
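The remove-before-release pattern proposed for StreamTransferTask can be sketched generically: taking the file out of the pending map first makes a racing or late `complete()` a no-op, so the reference is released exactly once. A hedged model only; `Ref` and `TransferTask` are illustrative stand-ins, not Cassandra's classes:

```python
# Model of the fix: remove the entry from `files` *before* releasing
# its reference, so abort() and a late complete() cannot both
# decrement the same ref (the double-decrement race described above).
class Ref:
    def __init__(self):
        self.count = 1

    def release(self):
        assert self.count > 0, "double release"
        self.count -= 1

class TransferTask:
    def __init__(self, refs):
        self.files = dict(enumerate(refs))  # seq -> sstable ref

    def complete(self, seq):
        ref = self.files.pop(seq, None)  # remove first...
        if ref is not None:
            ref.release()                # ...then decrement exactly once

    def abort(self):
        while self.files:
            _, ref = self.files.popitem()  # remove-before-release
            ref.release()

ref = Ref()
task = TransferTask([ref])
task.abort()
task.complete(0)  # late completion after abort: no-op, no double decrement
```

Because the map entry is the single token of ownership, whichever of `abort()` or `complete()` pops it wins, and the other path finds nothing to release.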
[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324689#comment-14324689 ]

Benedict commented on CASSANDRA-8119:
-------------------------------------

I wonder if, as another (or related) patch, we might want to introduce DC groups, so that we could specify, say, QUORUM in any primary DC and QUORUM in any backup DC, or QUORUM in any European DC and QUORUM in any other DC.

More Expressive Consistency Levels
----------------------------------

Key: CASSANDRA-8119
URL: https://issues.apache.org/jira/browse/CASSANDRA-8119
Project: Cassandra
Issue Type: New Feature
Components: API
Reporter: Tyler Hobbs

For some multi-datacenter environments, the current set of consistency levels is too restrictive. For example, the following consistency requirements cannot be expressed:
* LOCAL_QUORUM in two specific DCs
* LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC
* LOCAL_QUORUM in the local DC plus N remote replicas in any DC

I propose that we add a new consistency level: CUSTOM. In the v4 (or v5) protocol, this would be accompanied by an additional map argument. A map of {DC: CL} or a map of {DC: int} is sufficient to cover the first example. If we accept special keys to represent "any datacenter", the second case can be handled. A similar technique could be used for "any other nodes". I'm not in love with the special keys, so if anybody has ideas for something more elegant, feel free to propose them. The main idea is that we want to be flexible enough to cover any reasonable consistency or durability requirements.
svn commit: r8040 - in /release/cassandra: 2.1.3/ debian/dists/21x/ debian/dists/21x/main/binary-amd64/ debian/dists/21x/main/binary-i386/ debian/dists/21x/main/source/ debian/pool/main/c/cassandra/
Author: jake
Date: Tue Feb 17 19:21:53 2015
New Revision: 8040

Log: 2.1.3 release

Added:
    release/cassandra/2.1.3/
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz (with props)
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.md5
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.sha1
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.md5
    release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.sha1
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz (with props)
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz.asc
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz.asc.md5
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz.asc.sha1
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz.md5
    release/cassandra/2.1.3/apache-cassandra-2.1.3-src.tar.gz.sha1
    release/cassandra/debian/pool/main/c/cassandra/cassandra-tools_2.1.3_all.deb (with props)
    release/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.3.diff.gz (with props)
    release/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.3.dsc
    release/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.3.orig.tar.gz (with props)
    release/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.3.orig.tar.gz.asc
    release/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.3_all.deb (with props)

Modified:
    release/cassandra/debian/dists/21x/InRelease
    release/cassandra/debian/dists/21x/Release
    release/cassandra/debian/dists/21x/Release.gpg
    release/cassandra/debian/dists/21x/main/binary-amd64/Packages
    release/cassandra/debian/dists/21x/main/binary-amd64/Packages.gz
    release/cassandra/debian/dists/21x/main/binary-i386/Packages
    release/cassandra/debian/dists/21x/main/binary-i386/Packages.gz
    release/cassandra/debian/dists/21x/main/source/Sources.gz

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz
==============================================================================
Binary file - no diff available.

Propchange: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc
==============================================================================
--- release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc (added)
+++ release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc Tue Feb 17 19:21:53 2015
@@ -0,0 +1,17 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJU22RHAAoJEHSdbuwDU7EsidUP/jviLj1igq3FxThvJjYVzSIj
+3pj1NmwhUQa7quGSp5XjxLD7vlCBdAyAL7xvxnBGhGjGpArCMsRfZB5BMD5X8zBd
+gywyUB+hgeQ+Is8UWfsnFdq7mYuE0zOH2BpuSdakn99B8RrMKdJSvvwelmGlFZjm
+dIirGdeZ00OGxP5k3kSpzdkxEoLrjIlo1r/+2cK1E4MzLU8pWSjetNGMV2yisWpT
+LtudXdKidj1DqE5dl3MAQ3gwdXbY4+bt99FWWMAfAdKrPIyhaH2xEt8gb7cHQoF9
+JbI4DXuQkTEC2AIx7O9bJNnYBFjfJSne/pDivMbzYuvTW865CjhOKVAvocZ05ALv
+EnPN0y3I2J0vqJoQw/kqHICgXe8WWt6SKn0yUrx0aLL+Zp1VZrU5EobZjLLnmx/R
++bYDmgpURtp0wG/j5FhakkJJUCazEVhvF/uIpRi6YH9uQuXrwBT6DTJ9AawzmT1T
+mdCws5gS/uJmtPvW/3ywLAbe6wOzh+zX2yhXtgYFAw02ioW1OZdW7oHdKMLwCaEp
+Ve4scWF3/pXNxAgp5pWewa7DYC07EWNfQZxngqTK8nKOXOaoAJZ1TEpgaHom8qB8
+1q4uWfwwN4wfW6oP5JFgTf6dy0IzXn69ai6g5R0zDGUpZ67RTLEq8tLpspG8OeH+
+pd7kZmfxejOdCDUD4EjV
+=A+mf
+-----END PGP SIGNATURE-----

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.md5
==============================================================================
--- release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.md5 (added)
+++ release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.md5 Tue Feb 17 19:21:53 2015
@@ -0,0 +1 @@
+300ef93ec08c7dda25d5cde5edab60e0
\ No newline at end of file

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.sha1
==============================================================================
--- release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.sha1 (added)
+++ release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.asc.sha1 Tue Feb 17 19:21:53 2015
@@ -0,0 +1 @@
+33fb8cf4949eb1d2547bf0e3965c01fc6d108c1a
\ No newline at end of file

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.md5
==============================================================================
--- release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.md5 (added)
+++ release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.md5 Tue Feb 17 19:21:53 2015
@@ -0,0 +1 @@
+c01e85165d896447da5a1b3a038ff1ca
\ No newline at end of file

Added: release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.sha1
==============================================================================
--- release/cassandra/2.1.3/apache-cassandra-2.1.3-bin.tar.gz.sha1 (added)
+++
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324597#comment-14324597 ]

Ariel Weisberg commented on CASSANDRA-8789:
-------------------------------------------

I should also add that I originally did this off of C-8692, so that the performance measurements would be meaningful, since coalescing is a pre-req for this to a degree.
[jira] [Updated] (CASSANDRA-8757) IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation
[ https://issues.apache.org/jira/browse/CASSANDRA-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-8757:
--------------------------------------
Reviewer: Ariel Weisberg

IndexSummaryBuilder should construct itself offheap, and share memory between the result of each build() invocation
-------------------------------------------------------------------------------------------------------------------

Key: CASSANDRA-8757
URL: https://issues.apache.org/jira/browse/CASSANDRA-8757
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Benedict
Assignee: Benedict
Fix For: 2.1.4
[jira] [Commented] (CASSANDRA-8807) SSTableSimpleUnsortedWriter may block indefinitely on close() if disk writer has crashed
[ https://issues.apache.org/jira/browse/CASSANDRA-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324539#comment-14324539 ]

Joshua McKenzie commented on CASSANDRA-8807:
--------------------------------------------

+1

SSTableSimpleUnsortedWriter may block indefinitely on close() if disk writer has crashed
----------------------------------------------------------------------------------------

Key: CASSANDRA-8807
URL: https://issues.apache.org/jira/browse/CASSANDRA-8807
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
Fix For: 2.1.4
Attachments: 8807.txt

As was recently fixed with the sync() method, the final put() of the SENTINEL in close() may never return if the disk writer encounters an exception.
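The failure mode is a blocking `put()` of a sentinel into a bounded queue whose consumer thread has already died. One common fix pattern, polling with a timeout and rechecking the consumer's failure flag, can be sketched as follows. This is a hedged illustration of the pattern, not the actual SSTableSimpleUnsortedWriter patch; `Writer`, `put_checked`, and the 50 ms timeout are assumptions:

```python
# close() must not block forever on a full queue when the disk-writer
# thread has already crashed: poll with a timeout and recheck the
# writer's failure flag between attempts.
import queue

SENTINEL = object()

class Writer:
    def __init__(self):
        self.buffer = queue.Queue(maxsize=1)
        self.failure = None  # set by the disk-writer thread on crash

    def put_checked(self, item):
        while True:
            if self.failure is not None:
                # Writer is dead; surface its error instead of hanging.
                raise RuntimeError("disk writer crashed") from self.failure
            try:
                self.buffer.put(item, timeout=0.05)
                return
            except queue.Full:
                continue  # writer alive but busy; retry

    def close(self):
        self.put_checked(SENTINEL)

w = Writer()
w.buffer.put(object())          # queue full; nobody is draining it
w.failure = ValueError("boom")  # simulate a crashed disk writer
```

With the flag set, `w.close()` raises promptly instead of blocking indefinitely on the full queue, which is the behavior the ticket's fix restores.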
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324577#comment-14324577 ]

Ariel Weisberg commented on CASSANDRA-8789:
-------------------------------------------

Well... I avoid rebasing trunk frequently, because a good chunk of the time I do that I get something that is not working, meaning I can't run a benchmark to evaluate performance. It also means my baseline is slightly more suspect, as various things change and I have to take earlier performance numbers with a grain of salt. I rebased off of trunk: https://github.com/aweisberg/cassandra/compare/C-8789-2?expand=1
[jira] [Updated] (CASSANDRA-7555) Support copy and link for commitlog archiving without forking the jvm
[ https://issues.apache.org/jira/browse/CASSANDRA-7555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua McKenzie updated CASSANDRA-7555:
---------------------------------------
Fix Version/s: (was: 2.1.3)
               2.1.4

Support copy and link for commitlog archiving without forking the jvm
---------------------------------------------------------------------

Key: CASSANDRA-7555
URL: https://issues.apache.org/jira/browse/CASSANDRA-7555
Project: Cassandra
Issue Type: Improvement
Reporter: Nick Bailey
Assignee: Joshua McKenzie
Priority: Minor
Fix For: 2.1.4

Right now, for commitlog archiving, the user specifies a command to run and C* forks the jvm to run that command. The most common operations will be either copy or link (hard or soft). Since we can do all of these operations without forking the jvm, which is very expensive, we should have special cases for those.
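The special-casing described above amounts to recognising plain copy and link commands and performing them in-process, forking only for anything else. A hedged sketch under stated assumptions: the `%path`/`%name` substitutions match the commitlog_archiving.properties convention, but the command parsing and the `archive` helper are illustrative, not the ticket's actual implementation:

```python
# Recognise plain `cp` and `ln` archive commands and run them
# in-process instead of forking a shell for every segment.
import os
import shutil

def archive(command_template, path, name):
    cmd = command_template.replace('%path', path).replace('%name', name)
    parts = cmd.split()
    if parts[0] in ('cp', '/bin/cp') and len(parts) == 3:
        shutil.copy(parts[1], parts[2])   # in-process copy
    elif parts[0] in ('ln', '/bin/ln') and len(parts) == 3:
        os.link(parts[1], parts[2])       # in-process hard link
    else:
        # Fall back to forking only for commands we cannot special-case.
        os.system(cmd)
```

The parsing here is deliberately naive (no flags, no spaces in paths); the point is the dispatch: the common cases avoid the per-segment process fork entirely.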
[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324725#comment-14324725 ]

Jeremy Hanna commented on CASSANDRA-8119:
-----------------------------------------

I think there's a case for LOCAL_QUORUM + ONE in another datacenter, or LOCAL_QUORUM + ONE in every other datacenter.
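The proposed CUSTOM level, a per-DC requirement map with special keys, reduces to a check of acknowledgments against that map on the coordinator. A hedged sketch: the `'*'` key ("any other DC") is one possible encoding of the special keys discussed in this thread, and `satisfied` is an illustrative name, not a proposed API:

```python
# Check a {DC: required_acks} requirement map against the per-DC
# acknowledgment counts a coordinator has collected. The '*' key means
# "at least one DC not named explicitly must meet this count".
def satisfied(required, acks_by_dc):
    leftover = dict(acks_by_dc)
    for dc, needed in required.items():
        if dc == '*':
            continue  # wildcard handled after named DCs are consumed
        if leftover.pop(dc, 0) < needed:
            return False
    wildcard = required.get('*')
    if wildcard is not None:
        # "any other DC": some remaining DC meets the count.
        return any(acks >= wildcard for acks in leftover.values())
    return True

# Example from the ticket: LOCAL_QUORUM (2 of 3) in the local DC plus
# quorum in at least one other DC.
req = {'DC1': 2, '*': 2}
```

A `{DC: CL}` map would work the same way, with each CL resolved to a required-ack count per DC before the check.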
[jira] [Commented] (CASSANDRA-8308) Windows: Commitlog access violations on unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324704#comment-14324704 ] Joshua McKenzie commented on CASSANDRA-8308: [~philipthompson]: Could you do another run on cassci against [this branch|https://github.com/josh-mckenzie/cassandra/compare/8308_fix3]? It has the try/spin loop on segment deletion removed. Windows: Commitlog access violations on unit tests -- Key: CASSANDRA-8308 URL: https://issues.apache.org/jira/browse/CASSANDRA-8308 Project: Cassandra Issue Type: Bug Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Minor Labels: Windows Fix For: 3.0 Attachments: 8308-post-fix-3.txt, 8308-post-fix.txt, 8308_v1.txt, 8308_v2.txt, 8308_v3.txt We have four unit tests failing on trunk on Windows, all with FileSystemExceptions related to the SchemaLoader: {noformat} [junit] Test org.apache.cassandra.db.compaction.DateTieredCompactionStrategyTest FAILED [junit] Test org.apache.cassandra.cql3.ThriftCompatibilityTest FAILED [junit] Test org.apache.cassandra.io.sstable.SSTableRewriterTest FAILED [junit] Test org.apache.cassandra.repair.LocalSyncTaskTest FAILED {noformat} Example error: {noformat} [junit] Caused by: java.nio.file.FileSystemException: build\test\cassandra\commitlog;0\CommitLog-5-1415908745965.log: The process cannot access the file because it is being used by another process. 
[junit] [junit] at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) [junit] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) [junit] at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) [junit] at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) [junit] at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) [junit] at java.nio.file.Files.delete(Files.java:1079) [junit] at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:125) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
[ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324537#comment-14324537 ] Benedict commented on CASSANDRA-8789: - Is this based on latest trunk? I got a failed apply. Much prefer github links so this isn't a problem :) OutboundTcpConnectionPool should route messages to sockets by size not type --- Key: CASSANDRA-8789 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 8789.diff I was looking at this trying to understand what messages flow over which connection. For reads the request goes out over the command connection and the response comes back over the ack connection. For writes the request goes out over the command connection and the response comes back over the command connection. Reads get a dedicated socket for responses. Mutation commands and responses both travel over the same socket along with read requests. Sockets are used unidirectionally, so there are actually four sockets in play and four threads at each node (2 inbound, 2 outbound). CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone remembers what situations were made better it would be good to know. I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so the only head-of-line blocking issue is the time it takes to transfer data over the wire. If message size is the cause of blocking issues, then the current design mixes small messages and large messages on the same connection, retaining the head-of-line blocking. Read requests share the same connection as write requests (which are large), and write acknowledgments (which are small) share the same connections as write requests. The only winner is read acknowledgements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
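The size-based routing in the ticket title could look roughly like this. A minimal sketch with invented names and an arbitrary threshold — not the actual OutboundTcpConnectionPool API: messages above the threshold go to a "large" connection so small messages (acks, read responses) are not stuck behind multi-megabyte mutations.

```java
// Hypothetical sketch of routing outbound messages by serialized size
// rather than by verb type, to reduce head-of-line blocking.
public class SizeBasedRouter {
    static final int LARGE_THRESHOLD_BYTES = 64 * 1024; // illustrative cutoff

    enum Conn { SMALL, LARGE }

    // Pick the connection based on payload size, not message type.
    static Conn route(int serializedSizeBytes) {
        return serializedSizeBytes > LARGE_THRESHOLD_BYTES ? Conn.LARGE : Conn.SMALL;
    }

    public static void main(String[] args) {
        System.out.println(route(512));             // a write ack -> SMALL
        System.out.println(route(10 * 1024 * 1024)); // a big mutation -> LARGE
    }
}
```

Under this scheme a small read request never queues behind a large mutation on the same socket, which is exactly the mixing the ticket description complains about.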
[jira] [Resolved] (CASSANDRA-6390) Add LOCAL_QUORUM+N consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson resolved CASSANDRA-6390. Resolution: Duplicate Add LOCAL_QUORUM+N consistency -- Key: CASSANDRA-6390 URL: https://issues.apache.org/jira/browse/CASSANDRA-6390 Project: Cassandra Issue Type: Improvement Reporter: Tupshin Harper Priority: Minor There are a few occasionally requested consistency-level patterns that are not currently supported by Cassandra, and given the sentiment that the most general solution (CASSANDRA-2338) is unlikely to ever make sense, I'd like to propose the implementation of at least one additional consistency level. The specific request that I hear about most often is to achieve local consistency while still ensuring at least one off-site copy. Ideally this could be specified as LOCAL_QUORUM + N, where N could range from 1 to the non-local-DC RF. The challenge in implementing this simply today is that the CQL protocol and the drivers assume that CL is an enum, and you can't make the general case of this feature an enum. However, it could be quite reasonable to special-case LQ+ONE, LQ+TWO, and LQ+THREE if it would be prohibitive to change the protocol and drivers. For completeness, I'll mention that if it were easy, I would love to also see per-DC CL (so you could do stupid things like (DC1=LQ,DC2=1)), but this ticket is not about that, and there is no obvious way to fit that into the current approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
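The acknowledgement count a coordinator would block for under a hypothetical LOCAL_QUORUM+N could be computed as below. A sketch only — the method name and signature are invented, and the RF map stands in for real replication-strategy metadata:

```java
import java.util.Map;

// Hypothetical sketch: how many acks a coordinator would block for under
// LOCAL_QUORUM+N (quorum in the local DC plus N replicas anywhere else).
public class LocalQuorumPlusN {
    static int blockFor(Map<String, Integer> rfByDc, String localDc, int n) {
        int remoteRf = rfByDc.entrySet().stream()
                .filter(e -> !e.getKey().equals(localDc))
                .mapToInt(Map.Entry::getValue)
                .sum();
        // As the ticket says, N ranges from 1 to the non-local-DC RF.
        if (n < 1 || n > remoteRf)
            throw new IllegalArgumentException("N must be in [1, " + remoteRf + "]");
        int localQuorum = rfByDc.get(localDc) / 2 + 1;
        return localQuorum + n;
    }

    public static void main(String[] args) {
        Map<String, Integer> rf = Map.of("DC1", 3, "DC2", 3);
        // LQ in DC1 (2 acks) plus 1 off-site copy.
        System.out.println(blockFor(rf, "DC1", 1)); // prints 3
    }
}
```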
[jira] [Updated] (CASSANDRA-8734) Expose commit log archive status
[ https://issues.apache.org/jira/browse/CASSANDRA-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-8734: --- Priority: Minor (was: Major) Expose commit log archive status Key: CASSANDRA-8734 URL: https://issues.apache.org/jira/browse/CASSANDRA-8734 Project: Cassandra Issue Type: New Feature Components: Config Reporter: Philip S Doctor Assignee: Chris Lohfink Priority: Minor Fix For: 2.1.4 Attachments: 8734-cassandra-2.1.txt The operational procedure to modify commit log archiving is to edit commitlog_archiving.properties and then perform a restart. However this has troublesome edge cases: 1) It is possible for people to modify commitlog_archiving.properties but then not perform a restart 2) It is possible for people to modify commitlog_archiving.properties only on some nodes 3) It is possible for people to have modified the file + restarted but then later add more nodes without the correct modifications. For these reasons, it is operationally useful to be able to audit the commit log archive state of a node. Simply parsing commitlog_archiving.properties is insufficient due to #1. I would suggest that exposing this via a system table or JMX would be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
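The JMX option could take roughly this shape: a read-only standard MBean exposing the commands the node actually loaded at startup, rather than whatever the properties file currently says on disk. The bean name, attributes, and example command are all hypothetical:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch of exposing the effective (in-memory) commitlog
// archive configuration over JMX for auditing.
public class ArchiveStatus {
    public interface CommitLogArchiverMBean {
        String getArchiveCommand();
        String getRestoreCommand();
    }

    public static class CommitLogArchiver implements CommitLogArchiverMBean {
        private final String archive, restore;
        public CommitLogArchiver(String archive, String restore) {
            this.archive = archive;
            this.restore = restore;
        }
        public String getArchiveCommand() { return archive; }
        public String getRestoreCommand() { return restore; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.db:type=CommitLogArchiver");
        server.registerMBean(new CommitLogArchiver("/bin/cp %path /backup/%name", ""), name);
        // An operator's JMX client would read the attribute like this:
        System.out.println(server.getAttribute(name, "ArchiveCommand"));
    }
}
```

Because the bean is populated from the loaded configuration, it catches all three edge cases above: a node that skipped its restart, or never got the edit, simply reports stale or empty commands.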
[jira] [Commented] (CASSANDRA-8734) Expose commit log archive status
[ https://issues.apache.org/jira/browse/CASSANDRA-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324555#comment-14324555 ] Joshua McKenzie commented on CASSANDRA-8734: It's a bean modification w/read-only and a touch of scope change - I'm comfortable adding this to 2.0. Code LGTM - could you rebase to 2.0 [~cnlwsu]? I'll get it committed as soon as we have a 2.0 version and merge it up. Expose commit log archive status Key: CASSANDRA-8734 URL: https://issues.apache.org/jira/browse/CASSANDRA-8734 Project: Cassandra Issue Type: New Feature Components: Config Reporter: Philip S Doctor Assignee: Chris Lohfink Priority: Minor Fix For: 2.1.4 Attachments: 8734-cassandra-2.1.txt The operational procedure to modify commit log archiving is to edit commitlog_archiving.properties and then perform a restart. However this has troublesome edge cases: 1) It is possible for people to modify commitlog_archiving.properties but then not perform a restart 2) It is possible for people to modify commitlog_archiving.properties only on some nodes 3) It is possible for people to have modified the file + restarted but then later add more nodes without the correct modifications. For these reasons, it is operationally useful to be able to audit the commit log archive state of a node. Simply parsing commitlog_archiving.properties is insufficient due to #1. I would suggest that exposing this via a system table or JMX would be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8375) Cleanup of generics in bounds serialization
[ https://issues.apache.org/jira/browse/CASSANDRA-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-8375: --- Fix Version/s: 3.0 Cleanup of generics in bounds serialization --- Key: CASSANDRA-8375 URL: https://issues.apache.org/jira/browse/CASSANDRA-8375 Project: Cassandra Issue Type: Improvement Reporter: Branimir Lambov Assignee: Branimir Lambov Priority: Trivial Fix For: 3.0 Attachments: 8375.patch There is currently a single serializer for {{AbstractBounds}} applied to both {{Token}} and {{RowPosition}} ranges and bounds. This serializer does not know which kind of bounds it needs to work with, which causes some necessarily unsafe conversions and needs extra code in all bounds types ({{toRowBounds}}/{{toTokenBounds}}) to make the conversions safe, the application of which can be easily forgotten. As all users of this serialization know in advance the kind of range they want to serialize, this can be replaced by simpler type-specific serialization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
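The direction of that cleanup can be illustrated with simplified stand-ins for the real AbstractBounds hierarchy — these types and names are invented for the sketch: instead of one serializer that must cast between Token and RowPosition bounds, each bound kind gets its own serializer, so the unsafe {{toRowBounds}}/{{toTokenBounds}} conversions disappear and the compiler enforces the choice.

```java
// Hypothetical sketch of type-specific bounds serializers replacing a
// single generic serializer that required unsafe conversions.
public class BoundsSerializers {
    interface Serializer<T> {
        String serialize(T value);
    }

    // Simplified stand-ins for Token- and RowPosition-based bounds.
    record TokenBound(long token) {}
    record RowBound(String key) {}

    // One serializer per concrete bound kind: callers pick the right one
    // at compile time, so no toRowBounds/toTokenBounds casts are needed.
    static final Serializer<TokenBound> TOKEN = b -> "t:" + b.token();
    static final Serializer<RowBound> ROW = b -> "r:" + b.key();

    public static void main(String[] args) {
        System.out.println(TOKEN.serialize(new TokenBound(42))); // prints t:42
        System.out.println(ROW.serialize(new RowBound("k1")));   // prints r:k1
    }
}
```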
[jira] [Created] (CASSANDRA-8816) Dropping Keyspace too slow on Windows
Philip Thompson created CASSANDRA-8816: -- Summary: Dropping Keyspace too slow on Windows Key: CASSANDRA-8816 URL: https://issues.apache.org/jira/browse/CASSANDRA-8816 Project: Cassandra Issue Type: Improvement Reporter: Philip Thompson Assignee: Joshua McKenzie Priority: Trivial Fix For: 3.0 The dtests {{secondary_indexes_test.py:TestSecondaryIndexes.test_6924_dropping_ks}} and {{secondary_indexes_test.py:TestSecondaryIndexes.test_6924_dropping_cf}} are failing on 2.1-head and trunk on Windows. They are running into the client-side 10s timeout when dropping a keyspace. The tests create and drop the same keyspace 10 times, and only the fourth or fifth iterations of this loop run into the timeout. Raising the timeout to 20s solves the issue. It is not occurring on Linux. The keyspace is still being dropped correctly, so this is labeled 'Improvement'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8793) Avoid memory allocation when searching index summary
[ https://issues.apache.org/jira/browse/CASSANDRA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8793: -- Reviewer: Ariel Weisberg Avoid memory allocation when searching index summary Key: CASSANDRA-8793 URL: https://issues.apache.org/jira/browse/CASSANDRA-8793 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 3.0 Currently we build a byte[] for each comparison, when we could just fill the details into a DirectByteBuffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
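The allocation-avoidance idea can be sketched as follows. This is illustrative only — the class and method names are invented, and the real change targets the off-heap index summary: rather than building a fresh byte[] for every comparison, reuse one scratch DirectByteBuffer and refill it before each compare.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: compare a stored key against a search key through a
// reusable direct scratch buffer instead of allocating a byte[] per call.
public class ScratchCompare {
    private final ByteBuffer scratch = ByteBuffer.allocateDirect(8);

    // Fill the scratch buffer with the stored key's bytes, then compare.
    int compareTo(long storedKey, ByteBuffer searchKey) {
        scratch.clear();
        scratch.putLong(storedKey).flip();
        return scratch.compareTo(searchKey);
    }

    public static void main(String[] args) {
        ByteBuffer key = ByteBuffer.allocate(8).putLong(7L);
        key.flip();
        System.out.println(new ScratchCompare().compareTo(7L, key)); // prints 0
    }
}
```

On a hot search path this removes one short-lived heap allocation per comparison, which is the kind of GC pressure the ticket is after.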
[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324622#comment-14324622 ] Benedict commented on CASSANDRA-8119: - It might be nice to permit either CL _or_ explicit integer requirement, so that users can specify QUORUM without knowing the RF, or can specify an integer if they care about a specific degree of durability. Generally +1 this improvement, and I don't have any qualms about special keys. We just need to make sure they are unambiguously encoded distinctly from the datacentre names. It might be nice to permit multiple occurrences of the any-datacentre special key, so that you could specify QUORUM x2 DC, without specifying which DC either occurs in. It might also be nice to have a special local-datacentre key which doesn't require the user (or client) to know its name. More Expressive Consistency Levels -- Key: CASSANDRA-8119 URL: https://issues.apache.org/jira/browse/CASSANDRA-8119 Project: Cassandra Issue Type: New Feature Components: API Reporter: Tyler Hobbs For some multi-datacenter environments, the current set of consistency levels is too restrictive. For example, the following consistency requirements cannot be expressed: * LOCAL_QUORUM in two specific DCs * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC * LOCAL_QUORUM in the local DC plus N remote replicas in any DC I propose that we add a new consistency level: CUSTOM. In the v4 (or v5) protocol, this would be accompanied by an additional map argument. A map of {DC: CL} or a map of {DC: int} is sufficient to cover the first example. If we accept special keys to represent any datacenter, the second case can be handled. A similar technique could be used for the third case. I'm not in love with the special keys, so if anybody has ideas for something more elegant, feel free to propose them. 
The main idea is that we want to be flexible enough to cover any reasonable consistency or durability requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli updated CASSANDRA-8815: - Reviewer: sankalp kohli Race in sstable ref counting during streaming failures Key: CASSANDRA-8815 URL: https://issues.apache.org/jira/browse/CASSANDRA-8815 Project: Cassandra Issue Type: Bug Components: Core Reporter: sankalp kohli Assignee: Benedict Fix For: 2.0.13 Attachments: 8815.txt We have seen a machine in prod all of whose read threads are blocked (spinning) trying to acquire the reference lock on sstables. There are also some stream sessions which are doing the same. Looking at the heap dump, we could see that a live sstable which is part of the View has a ref count = 0. This sstable is also not compacting and is not part of any failed compaction. Looking through the code, we could see that if the ref goes to zero while the sstable is part of the View, all reader threads will spin forever. Looking further through the streaming code, we could see that if StreamTransferTask.complete is called after closeSession has been called due to an error in OutgoingMessageHandler, it will double-decrement the ref count of an sstable. This race can happen, and we can see from exceptions in the logs that closeSession was triggered by OutgoingMessageHandler. The fix for this is very simple, I think: in StreamTransferTask.abort, we can remove a file from "files" before decrementing the ref count. This will avoid the race. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
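The remove-before-decrement fix described above can be sketched in miniature. The names below are illustrative, not the actual StreamTransferTask API: because ConcurrentHashMap.remove returns null on the second call, racing complete/abort paths can only release the reference once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: make complete/abort idempotent by removing the file
// entry from the task's map *before* releasing its reference, so a racing
// closeSession + complete cannot double-decrement the sstable ref.
public class TransferTask {
    final ConcurrentHashMap<Integer, AtomicInteger> files = new ConcurrentHashMap<>();

    void add(int seq, AtomicInteger sstableRef) {
        sstableRef.incrementAndGet();
        files.put(seq, sstableRef);
    }

    // remove() yields null on any second call, so the ref drops exactly once.
    void complete(int seq) {
        AtomicInteger ref = files.remove(seq);
        if (ref != null)
            ref.decrementAndGet();
    }

    public static void main(String[] args) {
        AtomicInteger ref = new AtomicInteger(1); // one ref held by the View
        TransferTask task = new TransferTask();
        task.add(1, ref);
        task.complete(1);
        task.complete(1); // racing abort path: no double decrement
        System.out.println(ref.get()); // prints 1, not 0
    }
}
```

Without the remove-first ordering, the second caller would find the entry still present and decrement again, producing exactly the ref count = 0 on a live sstable seen in the heap dump.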
[jira] [Updated] (CASSANDRA-8669) simple_repair test failing on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8669: --- Fix Version/s: (was: 2.1.3) 2.1.4 simple_repair test failing on 2.1 - Key: CASSANDRA-8669 URL: https://issues.apache.org/jira/browse/CASSANDRA-8669 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Yuki Morishita Fix For: 2.1.4 The dtest simple_repair_test began failing on 12/22 on 2.1 and trunk. The test fails intermittently both locally and on cassci. The test is here: https://github.com/riptano/cassandra-dtest/blob/master/repair_test.py#L32 The output is here: http://cassci.datastax.com/job/cassandra-2.1_dtest/661/testReport/repair_test/TestRepair/simple_repair_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324727#comment-14324727 ] Benedict commented on CASSANDRA-8119: - That's true; perhaps there should also be a special key for ALL other data centres, and a QUORUM of data centres. More Expressive Consistency Levels -- Key: CASSANDRA-8119 URL: https://issues.apache.org/jira/browse/CASSANDRA-8119 Project: Cassandra Issue Type: New Feature Components: API Reporter: Tyler Hobbs For some multi-datacenter environments, the current set of consistency levels is too restrictive. For example, the following consistency requirements cannot be expressed: * LOCAL_QUORUM in two specific DCs * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC * LOCAL_QUORUM in the local DC plus N remote replicas in any DC I propose that we add a new consistency level: CUSTOM. In the v4 (or v5) protocol, this would be accompanied by an additional map argument. A map of {DC: CL} or a map of {DC: int} is sufficient to cover the first example. If we accept special keys to represent any datacenter, the second case can be handled. A similar technique could be used for the third case. I'm not in love with the special keys, so if anybody has ideas for something more elegant, feel free to propose them. The main idea is that we want to be flexible enough to cover any reasonable consistency or durability requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6538) Provide a read-time CQL function to display the data size of columns and rows
[ https://issues.apache.org/jira/browse/CASSANDRA-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324018#comment-14324018 ] Sylvain Lebresne commented on CASSANDRA-6538: - I have 2 problems with this ticket so far: # I'm on the fence on whether allowing such a function on non-frozen collections really makes sense. In that case, the size returned will *not* be the size the collection has in the database, and I'm a bit afraid people will misunderstand/misuse it. # We actually don't currently have a good way to declare a function that applies to _any_ type. The attached patch works around this by basically breaking the type system, making the blob type an accept-everything type, but that's not ok (I'm *strongly* against doing that). Overall, what I'd suggest would be to add a method that only works on blobs. If you need the size of another type, we actually have the {{xAsBlob}} methods for that, so that's easily done (and it makes it more clear what is returned imo). It's worth noting that this won't work for frozen types (or collections in general) since we don't have {{xAsBlob}} functions for those, but imo it's a somewhat separate problem, one that should be tackled in another ticket. Provide a read-time CQL function to display the data size of columns and rows - Key: CASSANDRA-6538 URL: https://issues.apache.org/jira/browse/CASSANDRA-6538 Project: Cassandra Issue Type: Improvement Reporter: Johnny Miller Priority: Minor Labels: cql Attachments: 6538.patch, sizeFzt.PNG It would be extremely useful to be able to work out the size of rows and columns via CQL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
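The blobs-only shape Sylvain suggests can be illustrated in miniature. These are stand-in functions, not the actual CQL native-function machinery: a sizeOf-style function defined only on blobs, with typed values first converted via their {{xAsBlob}} equivalent (here, textAsBlob is simply the UTF-8 encoding).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a blobs-only size function: convert typed values
// to blobs first, then measure the serialized bytes.
public class BlobSize {
    // Stand-in for CQL's textAsBlob: the blob is the string's UTF-8 bytes.
    static ByteBuffer textAsBlob(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // sizeOf(blob): the number of bytes in the serialized value.
    static int sizeOf(ByteBuffer blob) {
        return blob.remaining();
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(textAsBlob("hello"))); // prints 5
    }
}
```

Restricting the function to blobs sidesteps the any-type declaration problem: the conversion step makes it explicit which serialized representation is being measured.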
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324051#comment-14324051 ] Benedict commented on CASSANDRA-8812: - Do you have the Java exception thrown by buffer.force()? JVM Crashes on Windows x86 -- Key: CASSANDRA-8812 URL: https://issues.apache.org/jira/browse/CASSANDRA-8812 Project: Cassandra Issue Type: Bug Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31 Reporter: Amichai Rothman Assignee: Joshua McKenzie Attachments: crashtest.tgz Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. It also contains a sample hotspot error log. Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur when the PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid. I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference. 
At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8792) Improve Memory assertions
[ https://issues.apache.org/jira/browse/CASSANDRA-8792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324824#comment-14324824 ] Jonathan Ellis commented on CASSANDRA-8792: --- +1 Improve Memory assertions - Key: CASSANDRA-8792 URL: https://issues.apache.org/jira/browse/CASSANDRA-8792 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1.4 Null pointers are valid returns if a size of zero is returned. We assume a null pointer implies resource mismanagement in a few places. We also don't properly check the bounds of all of our accesses; this patch attempts to tidy up both of these things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)