[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991984#comment-15991984 ]

Yasuharu Goto commented on CASSANDRA-13348:
---

[~dikanggu] Thank you! Our cluster seems to have avoided this issue. I'm going to upgrade to 3.0.13!

> Duplicate tokens after bootstrap
> --------------------------------
>
>          Key: CASSANDRA-13348
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
>      Project: Cassandra
>   Issue Type: Bug
>     Reporter: Tom van der Woerdt
>     Assignee: Dikang Gu
>     Priority: Blocker
>      Fix For: 3.0.x
>
> This one is a bit scary, and probably results in data loss. After a bootstrap
> of a few new nodes into an existing cluster, two new nodes have chosen some
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other
> node.
>
> Node 1 log:
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 3727103702384420083, 7183119311535804926, 6013900799616279548, -1222135324851761575, 1645259890258332163, -1213352346686661387, 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 TokenAllocation.java:70 - Unexpected growth in standard deviation after allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
>
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, -2416006722819773829, -5820248611267569511, -5990139574852472056, 1645259890258332163, 9135021011763659240, -5451286144622276797, 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 TokenAllocation.java:65 - Replicated node load in datacentre before allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 TokenAllocation.java:66 - Replicated node load in datacentre after allocation max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
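The overlap is visible directly in the two "Selected tokens" log lines: 1645259890258332163 and 7604192574911909354 appear in both nodes' lists. A minimal sketch of how an operator could cross-check token lists for this kind of collision (the token values below are the ones visible in the logs above; collecting the per-node lists, e.g. from the logs or from nodetool output, is left as an assumption):

```python
# Sketch: find tokens claimed by more than one node. Assumes the
# per-node token lists have already been gathered; here they are the
# (partial) lists printed in the two bootstrap logs above.
from collections import defaultdict

def duplicate_tokens(tokens_by_node):
    """Return {token: [owning nodes...]} for every token owned by >1 node."""
    owners = defaultdict(list)
    for node, tokens in tokens_by_node.items():
        for t in tokens:
            owners[t].append(node)
    return {t: nodes for t, nodes in owners.items() if len(nodes) > 1}

tokens_by_node = {
    "node1": [2959334889475814712, 3727103702384420083, 7183119311535804926,
              6013900799616279548, -1222135324851761575, 1645259890258332163,
              -1213352346686661387, 7604192574911909354],
    "node2": [2890709530010722764, -2416006722819773829, -5820248611267569511,
              -5990139574852472056, 1645259890258332163, 9135021011763659240,
              -5451286144622276797, 7604192574911909354],
}

dups = duplicate_tokens(tokens_by_node)
print(sorted(dups))  # [1645259890258332163, 7604192574911909354]
```

In a healthy ring this intersection is empty; any non-empty result means two nodes will fight over ownership of the same ranges.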
[jira] [Comment Edited] (CASSANDRA-13348) Duplicate tokens after bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990685#comment-15990685 ]

Yasuharu Goto edited comment on CASSANDRA-13348 at 5/1/17 9:00 AM:
---

Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster from 2.1 to 3.0.13. Our 3.0 clusters do not have allocate_tokens_for_keyspace enabled for now. In your theory, would this issue affect C* clusters with allocate_tokens_for_keyspace = null?

was (Author: yasuharu):
Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster from 2.1 to 3.0.14. Our 3.0 clusters do not have allocate_tokens_for_keyspace enabled for now. In your theory, would this issue affect C* clusters with allocate_tokens_for_keyspace = null? Thank you.
[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990685#comment-15990685 ]

Yasuharu Goto commented on CASSANDRA-13348:
---

Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster from 2.1 to 3.0.14. Our 3.0 clusters do not have allocate_tokens_for_keyspace enabled for now. In your theory, would this issue affect C* clusters with allocate_tokens_for_keyspace = null? Thank you.
[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877379#comment-15877379 ]

Yasuharu Goto commented on CASSANDRA-13125:
---

[~mnantern] In our case, {{nodetool scrub}} (C* 3.0.9 or later) fixed our broken sstables.

> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
> --------------------------------------------------------
>
>          Key: CASSANDRA-13125
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
>      Project: Cassandra
>   Issue Type: Bug
>     Reporter: Zhongxiang Zheng
>     Assignee: Sylvain Lebresne
>     Priority: Critical
>      Fix For: 3.0.11, 3.11.0
>  Attachments: diff-a.patch, diff-b.patch
>
> I found that rows are split and duplicated after upgrading the cluster
> from 2.1.x to 3.0.x. I found a way to reproduce the problem, shown below.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s
> Current cluster is now: test
> $ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 set<text>, value2 set<text>);"
> # Upgrade node1
> $ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # Insert a row through node1 (3.0.10)
> $ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # Insert a row through node2 (2.1.16)
> $ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # The row inserted from node1 is split
> $ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
>  id  | value1         | value2
> -----+----------------+----------------
>  aaa |           null |           null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> $ for i in 1 2; do ccm node${i} nodetool flush; done
> # Results of sstable2json on node2. The row inserted from node1 (3.0.10) is
> # different from the row inserted from node2 (2.1.16).
> $ ccm node2 json -k test -c test
> running ['/home/zzheng/.ccm/test/node2/data0/test/test-5406ee80dbdb11e6a175f57c4c7c85f3/test-test-ka-1-Data.db']
> -- test-test-ka-1-Data.db -----
> [
> {"key": "aaa",
>  "cells": [["","",1484564624769577],
>            ["value1","value2:!",1484564624769576,"t",1484564624],
>            ["value1:616161","",1484564624769577],
>            ["value1:626262","",1484564624769577],
>            ["value2:636363","",1484564624769577],
>            ["value2:646464","",1484564624769577]]},
> {"key": "bbb",
>  "cells": [["","",1484564634508029],
>            ["value1:_","value1:!",1484564634508028,"t",1484564634],
>            ["value1:616161","",1484564634508029],
>            ["value1:626262","",1484564634508029],
>            ["value2:_","value2:!",1484564634508028,"t",1484564634],
>            ["value2:636363","",1484564634508029],
>            ["value2:646464","",1484564634508029]]}
> ]
> # Upgrade node2,3
> $ for i in `seq 2 3`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # After upgrading node2,3, the row inserted from node1 is split on node2,3
> $ ccm node2 cqlsh -e "SELECT * FROM test.test ;"
>  id  | value1         | value2
> -----+----------------+----------------
>  aaa |           null |           null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> (3 rows)
> # Results of sstabledump
> # node1
> [
>   {
>     "partition" : {
>       "key" : [ "aaa" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 17,
>         "liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
>         "cells" : [
>           { "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
>           { "name" : "value1", "path" : [ "aaa" ], "value" : "" },
>           { "name" : "value1", "path" : [ "bbb" ], "value" : "" },
>           { "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
>           { "name" : "value2", "path" : [ "ccc" ], "value" : "" },
>           { "name" : "value2", "path" : [ "ddd" ], "value" : "" }
>         ]
>       }
>     ]
>   },
>   {
>     "partition" : {
>       "key" : [ "bbb" ],
>       "position" : 48
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 65,
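In the sstable2json output above, set elements appear as cell-name path components hex-encoded after the column name, e.g. "value1:616161" is element 'aaa' of column value1. A small sketch of that decoding (the helper name is illustrative, not a Cassandra API):

```python
# Sketch: decode the hex-encoded set element carried in a 2.1-era
# sstable2json cell name such as "value1:616161" ('aaa' in hex).
import binascii

def decode_cell_path(cell_name):
    """Split a 'column:hexvalue' cell name and decode the hex element."""
    column, _, hexval = cell_name.partition(":")
    return column, binascii.unhexlify(hexval).decode("utf-8")

print(decode_cell_path("value1:616161"))  # ('value1', 'aaa')
print(decode_cell_path("value2:646464"))  # ('value2', 'ddd')
```

This makes the dumps easier to compare: both sstables hold the same element bytes, and the difference between the rows lies in the tombstone cells ("value2:!" vs. "value1:_"/"value2:_"), not in the data.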
[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873020#comment-15873020 ]

Yasuharu Goto commented on CASSANDRA-8844:
---

[~jbellis] Sorry, it seems that I unexpectedly changed the assignee with a keyboard shortcut. Thank you for your fix.

> Change Data Capture (CDC)
> -------------------------
>
>          Key: CASSANDRA-8844
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
>      Project: Cassandra
>   Issue Type: New Feature
>   Components: Coordination, Local Write-Read Paths
>     Reporter: Tupshin Harper
>     Assignee: Joshua McKenzie
>     Priority: Critical
>      Fix For: 3.8
>
> "In databases, change data capture (CDC) is a set of software design patterns
> used to determine (and track) the data that has changed so that action can be
> taken using the changed data. Also, change data capture (CDC) is an approach
> to data integration that is based on the identification, capture and delivery
> of the changes made to enterprise data sources." -- Wikipedia
>
> As Cassandra is increasingly being used as the Source of Record (SoR) for
> mission-critical data in large enterprises, it is increasingly being called
> upon to act as the central hub of traffic and data flow to other systems. In
> order to address the general need, we (cc [~brianmhess]) propose
> implementing a simple data logging mechanism to enable per-table CDC patterns.
>
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its
> Consistency Level semantics, and in order to treat it as the single
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable
> (deliver-at-least-once, with possible mechanisms for deliver-exactly-once)
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden on users so that they
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system,
> give them the opportunity to make that realtime with a minimum of coding.
>
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog,
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF copies
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an
> easy enhancement, but most likely use cases would prefer to only implement
> CDC logging in one (or a subset) of the DCs that are being replicated to.
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the
> commitlog, failure to write to the CDC log should fail that node's write. If
> that means the requested consistency level was not met, then clients *should*
> experience UnavailableExceptions.
> - Be written in a row-centric manner such that it is easy for consumers to
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons
> written in non-JVM languages.
>
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also
> believe that they can be deferred for a subsequent release, and to gauge
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple
> "subscribers" to a single table's changes. A workaround would be to create a
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters,
> would make Cassandra a much more versatile feeder into other systems, and
> again, reduce complexity that would otherwise need to be built into the
> daemons.
>
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it.
> - Cleaning up consumed logfiles would be the client daemon's responsibility.
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they
> left off. This means they would have to leave some file artifact in the CDC
> log's directory.
> - A sophisticated daemon should be able to be written that could:
> -- Catch up, in written order, even when it is multiple logfiles behind in
> processing.
> -- Continuously "tail" the most recent logfile and get
> low-latency (ms?) access to the data as it is written.
>
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low
> latency, the following could supplement the approach outlined above:
> -
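The "Format and Consumption" section above sketches a daemon that processes predictably named logfiles in order and checkpoints its progress via a file artifact in the CDC directory. A minimal sketch of that consumption loop; the checkpoint filename and segment naming are assumptions for illustration, not the actual Cassandra CDC on-disk format:

```python
# Sketch of the checkpointing consumer idea from the proposal above:
# process CDC segment files in name order and record the last fully
# consumed file in a checkpoint artifact inside the CDC directory.
import os

CHECKPOINT = "consumer.checkpoint"  # hypothetical artifact name

def unprocessed_segments(cdc_dir):
    """Yield segment filenames newer than the checkpointed one, in name order."""
    ckpt_path = os.path.join(cdc_dir, CHECKPOINT)
    last = ""
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            last = f.read().strip()
    for name in sorted(os.listdir(cdc_dir)):
        if name != CHECKPOINT and name > last:
            yield name

def checkpoint(cdc_dir, name):
    """Record that `name` has been fully consumed, so a restart resumes after it."""
    with open(os.path.join(cdc_dir, CHECKPOINT), "w") as f:
        f.write(name)
```

Because the checkpoint lives in the log directory and segment names sort in written order, a restarted daemon naturally "catches up" from multiple segments behind, which is exactly the catch-up behaviour the proposal asks for.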
[jira] [Assigned] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yasuharu Goto reassigned CASSANDRA-8844:
---

    Assignee: Yasuharu Goto  (was: Joshua McKenzie)
[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857829#comment-15857829 ]

Yasuharu Goto commented on CASSANDRA-13125:
---

Thank you guys! :)
[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832169#comment-15832169 ] Yasuharu Goto commented on CASSANDRA-13125: --- Thank you, [~slebresne]! I verified that your patches work properly with my reproduction procedure; the results are below. C* 3.0 now generates {{[c-c!][d-d!]}}-style range tombstones and the rows are no longer broken!
h4. On 13125-3.0
{noformat}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b | c      | d      | e
----+---+--------+--------+---
 14 | 1 | {2, 3} | {4, 5} | 6

RangeTombstone(0), start:org.apache.cassandra.db.composites.CompoundSparseCellName@78e3b54a, end:org.apache.cassandra.db.composites.BoundedComposite@6b517b57, markedAt:1484931972134334, delTime:1484931972
RangeTombstone(1), start:org.apache.cassandra.db.composites.CompoundSparseCellName@78e3b54b, end:org.apache.cassandra.db.composites.BoundedComposite@6b517b58, markedAt:1484931972134334, delTime:1484931972
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484931972134334, localDeletion=1484931972][d-d:!, deletedAt=1484931972134334, localDeletion=1484931972]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484931972134334, localDeletion=1484931972][d-d:!, deletedAt=1484931972134334, localDeletion=1484931972]}- [:false:0@1484931972134335,b:false:4@1484931972134335,c:0002:false:0@1484931972134335,c:0003:false:0@1484931972134335,d:0004:false:0@1484931972134335,d:0005:false:0@1484931972134335,e:false:4@1484931972134335,])]), verb:MUTATION, version:8
{noformat}
h4.
On 13125-3.11
{noformat}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b | c      | d      | e
----+---+--------+--------+---
 14 | 1 | {2, 3} | {4, 5} | 6

Mutation.deserialize() size==1
RangeTombstone(0), start:org.apache.cassandra.db.composites.CompoundSparseCellName@4316af5d, end:org.apache.cassandra.db.composites.BoundedComposite@256e93b0, markedAt:1484933162431359, delTime:1484933162
RangeTombstone(1), start:org.apache.cassandra.db.composites.CompoundSparseCellName@4316af5e, end:org.apache.cassandra.db.composites.BoundedComposite@256e93b1, markedAt:1484933162431359, delTime:1484933162
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484933162431359, localDeletion=1484933162][d-d:!, deletedAt=1484933162431359, localDeletion=1484933162]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484933162431359, localDeletion=1484933162][d-d:!, deletedAt=1484933162431359, localDeletion=1484933162]}- [:false:0@1484933162431360,b:false:4@1484933162431360,c:0002:false:0@1484933162431360,c:0003:false:0@1484933162431360,d:0004:false:0@1484933162431360,d:0005:false:0@1484933162431360,e:false:4@1484933162431360,])]), verb:MUTATION, version:8
{noformat}

> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
>
> Key: CASSANDRA-13125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
> Project: Cassandra
> Issue Type: Bug
> Reporter: Zhongxiang Zheng
> Assignee: Yasuharu Goto
> Priority: Critical
> Attachments: diff-a.patch, diff-b.patch
>
> I found that rows are split and duplicated after upgrading the cluster from 2.1.x to 3.0.x.
> A way to reproduce the problem is below.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s
> Current cluster is now: test
> $ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 set<text>, value2 set<text>);"
> # Upgrade node1
> $ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # Insert a row through node1 (3.0.10)
> $ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # Insert a row through node2 (2.1.16)
> $ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # The row inserted from node1 is split
> $ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
> id | value1 | value2
>
[jira] [Comment Edited] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829145#comment-15829145 ] Yasuharu Goto edited comment on CASSANDRA-13125 at 1/19/17 1:58 AM:
h2. Investigations

After some debugging, I found an interesting difference in the serialized RangeTombstoneLists between 2.1.16 and 3.0.10.
- I ran 3 Cassandra nodes with some debug prints.
-- 127.0.0.1 (C* 3.0.10)
-- 127.0.0.2 (C* 2.1.16)
-- 127.0.0.3 (C* 2.1.16)
- They already have a keyspace and a table created.
-- CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}
-- CREATE TABLE test.test ( a int PRIMARY KEY, b int, c set<int>, d set<int>, e int )
- I then issue the same INSERT (whose mutation is sent to 127.0.0.2) from 127.0.0.1 (C* 3.0) and from 127.0.0.3 (C* 2.1) and observe the difference.

Insert a row from 127.0.0.1 and scan (the inserted row, a=14, is broken):
{code:sql}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b    | c      | d      | e
----+------+--------+--------+------
 14 |    1 |   null |   null | null
 14 | null | {2, 3} | {4, 5} |    6

(2 rows)
{code}
Then insert from 127.0.0.3 and scan (neither a=5 nor a=14 is broken):
{code:sql}
cqlsh> insert into test.test(a,b,c,d,e) values(5,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b | c      | d      | e
----+---+--------+--------+---
  5 | 1 | {2, 3} | {4, 5} | 6
 14 | 1 | {2, 3} | {4, 5} | 6
{code}
Back on 127.0.0.1, scan the table again; a=14 is broken but a=5 is not:
{code:sql}
cqlsh> select * from test.test;

 a  | b    | c      | d      | e
----+------+--------+--------+------
  5 |    1 | {2, 3} | {4, 5} |    6
 14 |    1 |   null |   null | null
 14 | null | {2, 3} | {4, 5} |    6
{code}
Therefore, it looks like C* 3.0 cannot properly scan rows that are stored on a C* 2.1 node but were inserted through C* 3.0.

Next, I observed some incoming MUTATIONs on 127.0.0.2, as below. C* 3.0 sent RangeTombstones like {{[c-c],[c-d]}}, whereas C* 2.1 sent {{[c:_-c],[d:_-d]}}.
{noformat}
> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6); from 127.0.0.1
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484710273390930, localDeletion=1484710273][c-d:!, deletedAt=1484710273390930, localDeletion=1484710273]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484710273390930, localDeletion=1484710273][c-d:!, deletedAt=1484710273390930, localDeletion=1484710273]}- [:false:0@1484710273390931,b:false:4@1484710273390931,c:0002:false:0@1484710273390931,c:0003:false:0@1484710273390931,d:0004:false:0@1484710273390931,d:0005:false:0@1484710273390931,e:false:4@1484710273390931,])]), verb:MUTATION, version:8

> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6); from 127.0.0.3
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c:_-c:!, deletedAt=1484710277987556, localDeletion=1484710277][d:_-d:!, deletedAt=1484710277987556, localDeletion=1484710277]}
from:/127.0.0.3, payload:Mutation(keyspace='test', key='000e', modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, localDeletion=2147483647, ranges=[c:_-c:!, deletedAt=1484710277987556, localDeletion=1484710277][d:_-d:!, deletedAt=1484710277987556, localDeletion=1484710277]}- [:false:0@1484710277987557,b:false:4@1484710277987557,c:0002:false:0@1484710277987557,c:0003:false:0@1484710277987557,d:0004:false:0@1484710277987557,d:0005:false:0@1484710277987557,e:false:4@1484710277987557,])]), verb:MUTATION, version:8
{noformat}
h2.
Workaround Plan-A

LegacyRangeTombstone removes {{collectionName}} from a RangeTombstone whose start.bound != end.bound, such as {{[c-d]}}:
https://github.com/apache/cassandra/blob/cassandra-3.0.10/src/java/org/apache/cassandra/db/LegacyLayout.java#L1592-L1599

It seems that this removal of collectionName corrupts the deserialization of the legacy tombstone. After I commented out this else-if block, I could scan the table correctly.
{code:java}
if ((start.collectionName == null) != (stop.collectionName == null))
{
    if (start.collectionName == null)
        stop = new LegacyBound(stop.bound, stop.isStatic, null);
    else
        start = new LegacyBound(start.bound, start.isStatic, null);
}
/*else if (!Objects.equals(start.collectionName, stop.collectionName))
{
    // We're in the similar but slightly more complex case where on top of the big tombstone
    // A, we have 2 (or more) collection tombstones B and C within A. So we also end up with
    // a tombstone that goes between the end of B and the
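To make the two serializations concrete, here is a minimal, self-contained sketch. The names ({{Bound}}, {{RangeTombstone}}, {{spansCollections}}) are invented for illustration and are not the real classes in org.apache.cassandra.db.LegacyLayout; the sketch only models the observation that 2.1 emits one self-contained range per collection, while 3.0 can emit a range whose start and stop name different collections, and that stripping the collection scope from such a range widens what it shadows:

```java
import java.util.Objects;

// Toy model of legacy range-tombstone bounds -- invented names, not the real
// Cassandra classes. Each bound carries a column-name prefix plus an optional
// collection scope.
public class LegacyTombstoneDemo {
    static final class Bound {
        final String column;          // clustering/column-name part, e.g. "c"
        final String collectionName;  // null once the collection scope is stripped
        Bound(String column, String collectionName) {
            this.column = column;
            this.collectionName = collectionName;
        }
    }

    static final class RangeTombstone {
        final Bound start, stop;
        RangeTombstone(Bound start, Bound stop) { this.start = start; this.stop = stop; }

        // A 2.1-style tombstone stays inside one collection; the 3.0 writer can
        // emit a range whose start and stop name different collections.
        boolean spansCollections() {
            return !Objects.equals(start.collectionName, stop.collectionName);
        }

        // In this toy model, a collection-scoped tombstone shadows only that
        // collection's cells, while one with the scope stripped shadows the
        // whole column-name range.
        boolean shadows(String column) {
            if (start.collectionName != null && start.collectionName.equals(stop.collectionName))
                return start.collectionName.equals(column);
            return start.column.compareTo(column) <= 0 && column.compareTo(stop.column) <= 0;
        }
    }

    public static void main(String[] args) {
        // 2.1 sends [c:_-c:!] and [d:_-d:!]: disjoint, one per collection.
        RangeTombstone legacyC = new RangeTombstone(new Bound("c", "c"), new Bound("c", "c"));
        // 3.0 sends [c-c:!] and [c-d:!]: the second range crosses from c to d.
        RangeTombstone crossing = new RangeTombstone(new Bound("c", "c"), new Bound("d", "d"));
        if (legacyC.spansCollections()) throw new AssertionError();
        if (!crossing.spansCollections()) throw new AssertionError();

        // Stripping the collection names from the crossing tombstone widens what
        // it shadows: it now covers every column in [c, d], not just the
        // collections' own cells.
        RangeTombstone stripped = new RangeTombstone(new Bound("c", null), new Bound("d", null));
        if (!stripped.shadows("c") || !stripped.shadows("d")) throw new AssertionError();
        System.out.println("cross-collection range widens once collection scope is stripped");
    }
}
```

This is only a sketch of the observed behavior under the assumptions above, not a claim about how LegacyLayout actually resolves shadowing.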
[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-13125: -- Reproduced In: 3.9, 3.0.10 (was: 3.0.10, 3.9) Status: Patch Available (was: Open) I've submitted a brief patch for reference.

> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
>
> Key: CASSANDRA-13125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
> Project: Cassandra
> Issue Type: Bug
> Reporter: Zhongxiang Zheng
> Attachments: diff-a.patch, diff-b.patch
>
> I found that rows are split and duplicated after upgrading the cluster from 2.1.x to 3.0.x.
> A way to reproduce the problem is below.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s
> Current cluster is now: test
> $ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 set<text>, value2 set<text>);"
> # Upgrade node1
> $ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # Insert a row through node1 (3.0.10)
> $ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # Insert a row through node2 (2.1.16)
> $ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # The row inserted from node1 is split
> $ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
> id  | value1         | value2
> aaa | null           | null
> aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> $ for i in 1 2; do ccm node${i} nodetool flush; done
> # Results of sstable2json on node2. The row inserted from node1 (3.0.10) differs from the row inserted from node2 (2.1.16).
[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-13125: -- Attachment: diff-b.patch A patch for Plan-B on Cassandra 3.0.10.
[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-13125: -- Attachment: diff-a.patch A patch for Plan-A on Cassandra 3.0.10.
[jira] [Commented] (CASSANDRA-12861) example/triggers build fail.
[ https://issues.apache.org/jira/browse/CASSANDRA-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658815#comment-15658815 ] Yasuharu Goto commented on CASSANDRA-12861: --- Thank you [~slebresne] for your correction. Your version helped me understand the Cassandra code :)

> example/triggers build fail.
>
> Key: CASSANDRA-12861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
> Project: Cassandra
> Issue Type: Bug
> Reporter: Yasuharu Goto
> Assignee: Sylvain Lebresne
> Priority: Trivial
>
> When I tried to build examples/triggers on the trunk branch, I found that "ant jar" fails with an error like the one below: a "cannot find symbol" error for RowUpdateBuilder.
> {code}
> Buildfile: /Users/yasuharu/git/cassandra/examples/triggers/build.xml
> init:
> [mkdir] Created dir: /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> build:
> [javac] Compiling 1 source file to /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> [javac] warning: Supported source version 'RELEASE_6' from annotation processor 'org.openjdk.jmh.generators.BenchmarkProcessor' less than -source '1.8'
> [javac] /Users/yasuharu/git/cassandra/examples/triggers/src/org/apache/cassandra/triggers/AuditTrigger.java:27: error: cannot find symbol
> [javac] import org.apache.cassandra.db.RowUpdateBuilder;
> [javac] ^
> [javac] symbol: class RowUpdateBuilder
> [javac] location: package org.apache.cassandra.db
> [javac] 1 error
> [javac] 1 warning
> BUILD FAILED
> /Users/yasuharu/git/cassandra/examples/triggers/build.xml:45: Compile failed; see the compiler error output for details.
> Total time: 1 second
> {code}
> I think the move of RowUpdateBuilder into test has broken this build.
> https://github.com/apache/cassandra/commit/26838063de6246e3a1e18062114ca92fb81c00cf
> To fix this, I moved RowUpdateBuilder.java back into src in my patch.
> https://github.com/apache/cassandra/commit/d133eefe9c5fbebd8d389a9397c3948b8c36bd06 > Could you please review my patch? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12861) example/triggers build fail.
[ https://issues.apache.org/jira/browse/CASSANDRA-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12861: -- Status: Patch Available (was: Open)
[jira] [Created] (CASSANDRA-12861) example/triggers build fail.
Yasuharu Goto created CASSANDRA-12861:
-
Summary: example/triggers build fail.
Key: CASSANDRA-12861
URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
Project: Cassandra
Issue Type: Bug
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial
[jira] [Commented] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542399#comment-15542399 ] Yasuharu Goto commented on CASSANDRA-12731:
---
Thank you very much for your review and cleanup, [~snazy]!

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Yasuharu Goto
> Assignee: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (sized up to the number of IndexInfo entries the RowIndexEntry has) as a cache on every single read path.
> After some experiments using LargePartitionsTest on my MacBook, I got results showing that removing the cache from FileIndexInfoRetriever improves performance for large partitions, as below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with array cache
> INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms
> INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms
> INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms
> {noformat}
> Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk).
> Heap memory usage while running LargePartitionsTest (except for the 8G test) with the array cache (original):
> !screenshot-1.png!
> Heap memory usage while running LargePartitionsTest (except for the 8G test) without the cache:
> !screenshot-2.png!
> Of course, I also tried some collection containers instead of a plain array, but I could not see enough improvement to justify caching with them (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) ||
> | Original (array) | 62736 | 48562 | 41540 |
> | ConcurrentHashMap | 47597 | 30854 | 18271 |
> | ConcurrentHashMap 2nd trial | 44036 | 26895 | 17443 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 1st | 42668 | 32165 | 17323 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 2nd | 48863 | 28066 | 18053 |
> | LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> | LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> | No Cache | 47579 | 32480 | 18337 |
> | No Cache 2nd trial | 46534 | 27670 | 18700 |
> The code I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
> LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap. Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from FileIndexInfoRetriever to improve performance on large partitions.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
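The LinkedHashCache rows in the table above are described only as "a simple fifo/lru cache that extends LinkedHashMap". A minimal sketch of such a bounded cache, using LinkedHashMap's removeEldestEntry hook, might look like the following (the class name and the capacity/limit/lru parameters mirror the table rows but are assumptions, not the actual benchmark code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "LinkedHashCache" benchmarked above: a bounded
// FIFO or LRU cache built on LinkedHashMap's eviction hook.
public class LinkedHashCache<K, V> extends LinkedHashMap<K, V>
{
    private final int limit;

    public LinkedHashCache(int capacity, int limit, boolean lru)
    {
        // accessOrder=true iterates in LRU order; false keeps insertion (FIFO) order
        super(capacity, 0.75f, lru);
        this.limit = limit;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest)
    {
        // Evict the eldest entry once the map grows past the configured limit
        return size() > limit;
    }
}
```

With lru=false and limit=10 this would correspond to the "capacity=16, limit=10, fifo" rows; lru=true gives the "lru" variant.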
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731:
--
Assignee: Yasuharu Goto
Status: Patch Available (was: Open)

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Yasuharu Goto
> Assignee: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (sized up to the number of IndexInfo entries the RowIndexEntry has) as a cache on every single read path.
> After some experiments using LargePartitionsTest on my MacBook, I got results showing that removing the cache from FileIndexInfoRetriever improves performance for large partitions, as below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with array cache
> INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms
> INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms
> INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms
> {noformat}
> Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk).
> Heap memory usage while running LargePartitionsTest (except for the 8G test) with the array cache (original):
> !screenshot-1.png!
> Heap memory usage while running LargePartitionsTest (except for the 8G test) without the cache:
> !screenshot-2.png!
> Of course, I also tried some collection containers instead of a plain array, but I could not see enough improvement to justify caching with them (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) ||
> | Original (array) | 62736 | 48562 | 41540 |
> | ConcurrentHashMap | 47597 | 30854 | 18271 |
> | ConcurrentHashMap 2nd trial | 44036 | 26895 | 17443 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 1st | 42668 | 32165 | 17323 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 2nd | 48863 | 28066 | 18053 |
> | LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> | LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> | No Cache | 47579 | 32480 | 18337 |
> | No Cache 2nd trial | 46534 | 27670 | 18700 |
> The code I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
> LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap. Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from FileIndexInfoRetriever to improve performance on large partitions.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
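As a quick sanity check, the headline reductions quoted above can be recomputed from the timings in the {noformat} block; they come out at roughly 41.5% and 45.9%, consistent with the "41% and 45%" stated (the timing constants below are copied from the log):

```java
// Recompute the latency reductions for LargePartitionsTest.test_13_4G
// from the logged SELECT timings (ms), with vs without the array cache.
public class ReductionCheck
{
    public static void main(String[] args)
    {
        // SELECTs 1: 94197 ms with the array cache vs 55112 ms without
        double selects1 = 100.0 * (94197 - 55112) / 94197;
        // SELECTs 2: 85151 ms with the array cache vs 46082 ms without
        double selects2 = 100.0 * (85151 - 46082) / 85151;
        System.out.printf("SELECTs 1 reduced by %.1f%%, SELECTs 2 by %.1f%%%n",
                          selects1, selects2);
    }
}
```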
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Description: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. 
(Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap | 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd trial |44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache | 47579 | 32480 | 18337 | |No Cache 2nd trial | 46534 | 27670 | 18700 | Code that I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. Scan is a execution time to iterate through the large partition. So, in this issue, I'd like to propose to remove IndexInfo cache from FileIndexInfoRetriever to improve the performance on large partitions. was: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? 
- SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. (Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap | 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd trial |44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache | 47579 | 32480 | 18337 | |No Cache 2nd trial | 46534 | 27670 | 18700 | Code that
[jira] [Commented] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533273#comment-15533273 ] Yasuharu Goto commented on CASSANDRA-12731:
---
Patch is here.
https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (sized up to the number of IndexInfo entries the RowIndexEntry has) as a cache on every single read path.
> After some experiments using LargePartitionsTest on my MacBook, I got results showing that removing the cache from FileIndexInfoRetriever improves performance for large partitions, as below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with array cache
> INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms
> INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms
> INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms
> {noformat}
> Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk).
> Heap memory usage while running LargePartitionsTest (except for the 8G test) with the array cache (original):
> !screenshot-1.png!
> Heap memory usage while running LargePartitionsTest (except for the 8G test) without the cache:
> !screenshot-2.png!
> Of course, I also tried some collection containers instead of a plain array, but I could not see enough improvement to justify caching with them (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) ||
> | Original (array) | 62736 | 48562 | 41540 |
> | ConcurrentHashMap | 47597 | 30854 | 18271 |
> | ConcurrentHashMap 2nd trial | 44036 | 26895 | 17443 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 1st | 42668 | 32165 | 17323 |
> | LinkedHashCache (capacity=16, limit=10, fifo) 2nd | 48863 | 28066 | 18053 |
> | LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> | LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> | No Cache | 47579 | 32480 | 18337 |
> | No Cache 2nd trial | 46534 | 27670 | 18700 |
> The code I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
> LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap. Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from FileIndexInfoRetriever to improve performance on large partitions.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Description: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a (potentially very large) IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. 
(Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap | 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd trial |44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache | 47579 | 32480 | 18337 | |No Cache 2nd trial | 46534 | 27670 | 18700 | Code that I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. Scan is a execution time to iterate through the large partition. So, In this issue, I'd like to propose to remove IndexInfo cache from FileIndexInfoRetriever to improve the performance on large partition. was: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a very large IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? 
- SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. (Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap | 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd trial |44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache | 47579 | 32480 | 18337 | |No Cache 2nd trial | 46534 | 27670 | 18700 | Code that I used for
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Description: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a very large IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. 
(Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap | 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd trial |44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache | 47579 | 32480 | 18337 | |No Cache 2nd trial | 46534 | 27670 | 18700 | Code that I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. Scan is a execution time to iterate through the large partition. So, In this issue, I'd like to propose to remove IndexInfo cache from FileIndexInfoRetriever to improve the performance on large partition. was: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a very large IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? 
- SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. (Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap 1st| 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd|44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache 1st | 47579 | 32480 | 18337 | |No Cache 2nd | 46534 | 27670 | 18700 | Code that I used for this comparison is
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Description: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a very large IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Heap memory usage during running LargePartitionsTest (except for 8G test) with array cache(original) !screenshot-1.png! Heap memory usage during running LargePartitionsTest (except for 8G test) without cache !screenshot-2.png! Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. 
(Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap 1st| 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd|44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache 1st | 47579 | 32480 | 18337 | |No Cache 2nd | 46534 | 27670 | 18700 | Code that I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. Scan is a execution time to iterate through the large partition. So, In this issue, I'd like to propose to remove IndexInfo cache from FileIndexInfoRetriever to improve the performance on large partition. was: Hi guys. In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates a very large IndexInfo array (up to the number of IndexInfo in the RowIndexEntry has) as a cache in every single read path. After some experiments using LargePartitionTest on my MacBook, I got results that show that removing FileIndexInfoRetriever improves the performance for large partitions like below (latencies reduced by 41% and by 45%). {noformat} // LargePartitionsTest.test_13_4G with cache by array INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k total=16384M took 94197 ms INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k total=16384M took 85151 ms // LargePartitionsTest.test_13_4G without cache INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k total=16384M took 55112 ms INFO [main] 2016-09-30 00:14:12,132 ?:? 
- SELECTs 2 for part=4194304k total=16384M took 46082 ms {noformat} Code is [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] (based on trunk) Of course, I have attempted to use some collection containers instead of a plain array. But I could not recognize great improvement enough to justify using these cache mechanism by them. (Unless I did some mistake or overlook about this test) || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan (ms) || |Original (array) | 62736 | 48562 | 41540 | |ConcurrentHashMap 1st| 47597 | 30854 | 18271 | |ConcurrentHashMap 2nd|44036|26895|17443| |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | |No Cache 1st | 47579 | 32480 | 18337 | |No Cache 2nd | 46534 | 27670 | 18700 | Code that I used for this comparison is [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. Scan is a execution time to iterate through the large partition. So,
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Attachment: screenshot-2.png > Remove IndexInfo cache from FileIndexInfoRetriever. > --- > > Key: CASSANDRA-12731 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12731 > Project: Cassandra > Issue Type: Improvement >Reporter: Yasuharu Goto > Attachments: screenshot-1.png, screenshot-2.png > > > Hi guys. > In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever > allocates a very large IndexInfo array (up to the number of IndexInfo in the > RowIndexEntry has) as a cache in every single read path. > After some experiments using LargePartitionTest on my MacBook, I got results > that show that removing FileIndexInfoRetriever improves the performance for > large partitions like below (latencies reduced by 41% and by 45%). > {noformat} > // LargePartitionsTest.test_13_4G with cache by array > INFO [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k > total=16384M took 94197 ms > INFO [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k > total=16384M took 85151 ms > // LargePartitionsTest.test_13_4G without cache > INFO [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k > total=16384M took 55112 ms > INFO [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k > total=16384M took 46082 ms > {noformat} > Code is > [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce] > (based on trunk) > Of course, I have attempted to use some collection containers instead of a > plain array. But I could not recognize great improvement enough to justify > using these cache mechanism by them. 
(Unless I did some mistake or overlook > about this test) > || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan > (ms) || > |Original (array) | 62736 | 48562 | 41540 | > |ConcurrentHashMap 1st| 47597 | 30854 | 18271 | > |ConcurrentHashMap 2nd|44036|26895|17443| > |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323| > |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053| > |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 | > |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 | > |No Cache 1st | 47579 | 32480 | 18337 | > |No Cache 2nd | 46534 | 27670 | 18700 | > Code that I used for this comparison is > [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0]. > LinkedHashCache is a simple fifo/lru cache that is extended by LinkedHashMap. > Scan is a execution time to iterate through the large partition. > So, In this issue, I'd like to propose to remove IndexInfo cache from > FileIndexInfoRetriever to improve the performance on large partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Attachment: screenshot-1.png
[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
[ https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12731: -- Description: (same issue description as quoted above)
[jira] [Created] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.
Yasuharu Goto created CASSANDRA-12731: - Summary: Remove IndexInfo cache from FileIndexInfoRetriever. Key: CASSANDRA-12731 URL: https://issues.apache.org/jira/browse/CASSANDRA-12731 Project: Cassandra Issue Type: Improvement Reporter: Yasuharu Goto (same issue description as quoted above)
[jira] [Updated] (CASSANDRA-12717) IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12717: -- Summary: IllegalArgumentException in CompactionTask (was: Fix IllegalArgumentException in CompactionTask) > IllegalArgumentException in CompactionTask > -- > > Key: CASSANDRA-12717 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12717 > Project: Cassandra > Issue Type: Bug >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto > > When I ran LargePartitionsTest.test_11_1G on trunk, I found that the test fails due to a > java.lang.IllegalArgumentException during compaction. > This exception apparently happens when the compaction merges a large (>2GB) > partition. > {noformat} > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in > reserve; creating a fresh one > WARN [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large > partition cql_test_keyspace/table_4:10 (1.004GiB) > ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? 
- Fatal exception > in thread Thread[CompactionExecutor:14,1,main] > java.lang.IllegalArgumentException: Out of range: 2234434614 > at com.google.common.primitives.Ints.checkedCast(Ints.java:91) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206) > ~[main/:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[main/:na] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) > ~[main/:na] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) > ~[main/:na] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_77] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? 
- No segments in > reserve; creating a fresh one > {noformat} > {noformat} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.IllegalArgumentException: Out of range: 2540348821 > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) > at > org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695) > at > org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066) > at > org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061) > at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) > at >
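For context on the "Out of range" failures above: Guava's Ints.checkedCast throws IllegalArgumentException whenever a long does not fit in an int, which is what happens once a compacted partition's byte count exceeds Integer.MAX_VALUE (about 2 GiB). A minimal sketch of that check, re-implemented in plain Java for illustration (an assumed equivalent, not Cassandra's or Guava's actual source):

```java
public class CheckedCastSketch {
    // Mirrors the behavior of Guava's Ints.checkedCast: narrow a long to an
    // int, failing loudly instead of silently truncating on overflow.
    static int checkedCast(long value) {
        int result = (int) value;
        if (result != value) {
            throw new IllegalArgumentException("Out of range: " + value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(checkedCast(1024L)); // small values pass through
        try {
            // 2234434614 > Integer.MAX_VALUE (2147483647), as in the log above
            checkedCast(2234434614L);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Out of range: 2234434614
        }
    }
}
```

This is why the stack trace points at Ints.checkedCast inside CompactionTask.runMayThrow: some size there was being narrowed to an int.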
[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12717: -- Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12717: -- Description: (same issue description and logs as quoted above)
[jira] [Comment Edited] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526579#comment-15526579 ] Yasuharu Goto edited comment on CASSANDRA-12717 at 9/27/16 4:15 PM: Patch is here. Could you please review this? Fix IllegalArgumentException in CompactionTask https://github.com/matope/cassandra/commit/d6c40dd3d4d95dba8b9c3f88de1015315e45990d was (Author: yasuharu): Patch is here. Could you please review this? Fix IllegalArgumentException in CompactionTask https://github.com/matope/cassandra/commit/a9ccd9731e83fdd4148325c9a727b64e4982e2ba
[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12717: -- Description: When I was ran LargePartitionsTest.test_11_1G at trunk, I found that this test fails due to a java.lang.IllegalArgumentException during compaction and, eventually fails. This exception apparently happens when a compaction generates large sstable. {noformat} DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in reserve; creating a fresh one WARN [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large partition cql_test_keyspace/table_4:10 (1.004GiB) ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in thread Thread[CompactionExecutor:14,1,main] java.lang.IllegalArgumentException: Out of range: 2234434614 at com.google.common.primitives.Ints.checkedCast(Ints.java:91) ~[guava-18.0.jar:na] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_77] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] DEBUG 
[COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in reserve; creating a fresh one {noformat} {noformat} java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Out of range: 2540348821 at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061) at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426) at org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92) at org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50) at org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90) at org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44) at
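The "Out of range" failure above is easy to reproduce in isolation: Guava's Ints.checkedCast throws IllegalArgumentException whenever a long value does not fit in an int, and 2234434614 exceeds Integer.MAX_VALUE (2147483647). A minimal sketch of the same check in plain JDK code (the exact expression cast at CompactionTask.java:206 is not shown in the trace, so this only mirrors the checkedCast contract, not the actual compaction arithmetic):

```java
public class CheckedCastDemo {
    // Mirrors Guava's Ints.checkedCast(long): narrow to int, and throw if
    // the round-trip changes the value (i.e. it did not fit in 32 bits).
    static int checkedCast(long value) {
        int result = (int) value;
        if (result != value) {
            throw new IllegalArgumentException("Out of range: " + value);
        }
        return result;
    }

    // True iff the value survives an int round-trip.
    static boolean fitsInInt(long value) {
        return (int) value == value;
    }

    public static void main(String[] args) {
        // 2234434614 is the value from the compaction log: just past
        // Integer.MAX_VALUE, so the narrowing check must fail.
        System.out.println(fitsInInt(2147483647L)); // true
        System.out.println(fitsInInt(2234434614L)); // false
        try {
            checkedCast(2234434614L);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Out of range: 2234434614
        }
    }
}
```

The usual fix for this class of overflow is to keep the quantity as a long (or saturate it) rather than narrowing it to an int, once a single partition can exceed 2 GiB.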
[jira] [Commented] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526579#comment-15526579 ] Yasuharu Goto commented on CASSANDRA-12717: --- The patch is here. Could you please review it? Fix IllegalArgumentException in CompactionTask https://github.com/matope/cassandra/commit/a9ccd9731e83fdd4148325c9a727b64e4982e2ba > Fix IllegalArgumentException in CompactionTask > -- > > Key: CASSANDRA-12717 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12717 > Project: Cassandra > Issue Type: Bug >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto > > When I ran LargePartitionsTest.test_11_1G on trunk, I found that the > test fails due to a java.lang.IllegalArgumentException during compaction > {noformat} > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in > reserve; creating a fresh one > WARN [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large > partition cql_test_keyspace/table_4:10 (1.004GiB) > ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? 
- Fatal exception > in thread Thread[CompactionExecutor:14,1,main] > java.lang.IllegalArgumentException: Out of range: 2234434614 > at com.google.common.primitives.Ints.checkedCast(Ints.java:91) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206) > ~[main/:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[main/:na] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) > ~[main/:na] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) > ~[main/:na] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_77] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in > reserve; creating a fresh one > DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in > reserve; creating a fresh one > {noformat} > and, eventually fails. 
> {noformat} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.IllegalArgumentException: Out of range: 2540348821 > at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) > at > org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695) > at > org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066) > at > org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061) > at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90) > at > org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) >
[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
[ https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-12717: -- Description: When I ran LargePartitionsTest.test_11_1G on trunk, I found that the test fails due to a java.lang.IllegalArgumentException during compaction {noformat} DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in reserve; creating a fresh one WARN [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large partition cql_test_keyspace/table_4:10 (1.004GiB) ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in thread Thread[CompactionExecutor:14,1,main] java.lang.IllegalArgumentException: Out of range: 2234434614 at com.google.common.primitives.Ints.checkedCast(Ints.java:91) ~[guava-18.0.jar:na] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_77] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? 
- No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in reserve; creating a fresh one {noformat} and, eventually fails. {noformat} java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Out of range: 2540348821 at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061) at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426) at org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92) at org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50) at org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90) at org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180) at
[jira] [Created] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask
Yasuharu Goto created CASSANDRA-12717: - Summary: Fix IllegalArgumentException in CompactionTask Key: CASSANDRA-12717 URL: https://issues.apache.org/jira/browse/CASSANDRA-12717 Project: Cassandra Issue Type: Bug Reporter: Yasuharu Goto Assignee: Yasuharu Goto When I ran LargePartitionsTest.test_11_1G, I found that the test fails due to a java.lang.IllegalArgumentException during compaction {noformat} DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in reserve; creating a fresh one WARN [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large partition cql_test_keyspace/table_4:10 (1.004GiB) ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in thread Thread[CompactionExecutor:14,1,main] java.lang.IllegalArgumentException: Out of range: 2234434614 at com.google.common.primitives.Ints.checkedCast(Ints.java:91) ~[guava-18.0.jar:na] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[main/:na] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_77] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 
00:32:49,909 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in reserve; creating a fresh one DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in reserve; creating a fresh one {noformat} and, eventually fails. {noformat} java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Out of range: 2540348821 at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061) at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426) at org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92) at org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50) at org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90) at org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358) at
[jira] [Commented] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1523#comment-1523 ] Yasuharu Goto commented on CASSANDRA-11425: --- [~snazy] Thank you very much for the modifications and the merge! I'll be careful with unit tests next time. I'm interested in CASSANDRA-11719, but there is already a challenger, so I'm going to watch the ticket. Thanks. > Add prepared query parameter to trace for "Execute CQL3 prepared query" > session > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 3.8 > > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query the session is executing. > I think this makes performance tuning difficult on Cassandra. > So, in this ticket, I'd like to add the prepared query parameter to the Execute > session trace, like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? 
LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216098#comment-15216098 ] Yasuharu Goto commented on CASSANDRA-11425: --- [~thobbs] Thanks for your response! I'm concerned about the memory consumption too. I'm not sure whether we should trim queries to save memory. For good memory management, I think we might have to add the query string to the EntryWeigher.weightOf() calculation (and increase MAX_CACHE_PREPARED_MEMORY?). > Add prepared query parameter to trace for "Execute CQL3 prepared query" > session > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query the session is executing. > I think this makes performance tuning difficult on Cassandra. > So, in this ticket, I'd like to add the prepared query parameter to the Execute > session trace, like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? 
LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
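The accounting change discussed in the comment above can be sketched in plain Java. Everything in this sketch is illustrative: Prepared is a hypothetical stand-in for the cached prepared-statement entry, and the 40-byte object overhead and 2-bytes-per-char figures are rough assumptions, not Cassandra's actual EntryWeigher arithmetic:

```java
public class PreparedWeigher {
    // Hypothetical stand-in for a cached prepared-statement entry: only the
    // fields that matter for weighing are modeled here.
    static final class Prepared {
        final String queryString;          // retained for the session trace
        final int measuredStatementSize;   // what weightOf() counted before
        Prepared(String queryString, int measuredStatementSize) {
            this.queryString = queryString;
            this.measuredStatementSize = measuredStatementSize;
        }
    }

    // Sketch of the proposed weightOf(): also charge the retained query
    // string (rough per-object header plus 2 bytes per char), so the cache
    // bound reflects the extra memory the trace feature keeps around.
    static int weightOf(Prepared p) {
        int stringBytes = 40 + 2 * p.queryString.length();
        return p.measuredStatementSize + stringBytes;
    }

    public static void main(String[] args) {
        Prepared p = new Prepared("SELECT 1", 100);
        System.out.println(weightOf(p)); // 100 + 40 + 2*8 = 156
    }
}
```

With a weigher like this, longer query strings consume more of the cache budget, which is why the comment also suggests raising MAX_CACHE_PREPARED_MEMORY to compensate.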
[jira] [Updated] (CASSANDRA-11425) Add prepared query parameter to session trace for "Execute prepared CQL3 Query"
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-11425: -- Summary: Add prepared query parameter to session trace for "Execute prepared CQL3 Query" (was: Add prepared statement on Execute prepared query session trace.) > Add prepared query parameter to session trace for "Execute prepared CQL3 > Query" > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query is the session executing. > I think this makes performance tuning difficult on Cassandra. > So, In this ticket, I'd like to add the prepared query parameter on Execute > session trace like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? 
LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-11425: -- Summary: Add prepared query parameter to trace for "Execute CQL3 prepared query" session (was: Add prepared query parameter to session trace for "Execute prepared CQL3 Query") > Add prepared query parameter to trace for "Execute CQL3 prepared query" > session > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query is the session executing. > I think this makes performance tuning difficult on Cassandra. > So, In this ticket, I'd like to add the prepared query parameter on Execute > session trace like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? 
LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-11425: -- Comment: was deleted (was: My patch is here. https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk) > Add prepared statement on Execute prepared query session trace. > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query is the session executing. > I think this makes performance tuning difficult on Cassandra. > So, In this ticket, I'd like to add the prepared query parameter on Execute > session trace like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? 
LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-11425: -- Status: Patch Available (was: Open) My patch is here. https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk > Add prepared statement on Execute prepared query session trace. > --- > > Key: CASSANDRA-11425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do > not show us any information about the prepared query which is executed on the > session. So we can't see what query is the session executing. > I think this makes performance tuning difficult on Cassandra. > So, In this ticket, I'd like to add the prepared query parameter on Execute > session trace like this. > {noformat} > cqlsh:system_traces> select * from sessions ; > session_id | client| command | coordinator | > duration | parameters > > | request | started_at > --+---+-+-+--+--+-+- > a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+ > a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >109 | >{'query': 'SELECT * FROM test.test2 WHERE id=? 
LIMIT > 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+ > a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >126 | > {'query': 'INSERT INTO test.test2(id,value) VALUES > (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+ > a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': > 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+ > a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | >857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': > 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': > 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+ > {noformat} > Now, "Execute CQL3 prepared query" session displays its query. > I believe that this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.
[ https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210265#comment-15210265 ] Yasuharu Goto commented on CASSANDRA-11425: --- My patch is here. https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk
> Add prepared statement on Execute prepared query session trace.
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: Yasuharu Goto
> Assignee: Yasuharu Goto
> Priority: Minor
>
> For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do
> not show any information about the prepared query that is executed in the
> session, so we can't see what query the session is executing.
> I think this makes performance tuning difficult on Cassandra.
> So, in this ticket, I'd like to add the prepared query parameter to the
> Execute session trace like this:
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id | client | command | coordinator | duration | parameters | request | started_at
> ------------+--------+---------+-------------+----------+------------+---------+------------
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 109 | {'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1'} | Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 126 | {'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)'} | Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.
Yasuharu Goto created CASSANDRA-11425: - Summary: Add prepared statement on Execute prepared query session trace. Key: CASSANDRA-11425 URL: https://issues.apache.org/jira/browse/CASSANDRA-11425 Project: Cassandra Issue Type: Improvement Components: CQL Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Minor
For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do not show any information about the prepared query that is executed in the session, so we can't see what query the session is executing. I think this makes performance tuning difficult on Cassandra. So, in this ticket, I'd like to add the prepared query parameter to the Execute session trace like this:
{noformat}
cqlsh:system_traces> select * from sessions ;
 session_id | client | command | coordinator | duration | parameters | request | started_at
------------+--------+---------+-------------+----------+------------+---------+------------
 a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 666 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
 a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 109 | {'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1'} | Preparing CQL3 query | 2016-03-24 13:37:59.998000+
 a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 126 | {'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)'} | Preparing CQL3 query | 2016-03-24 13:37:59.996000+
 a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 764 | {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
 a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 | QUERY | 127.0.0.1 | 857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
{noformat}
Now, the "Execute CQL3 prepared query" session displays its query. I believe this additional information would help operators a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
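[Editor's note] The point of the 'query' entry in the parameters map is that it lets an operator attribute trace durations to concrete prepared statements. A minimal Python sketch of that workflow, using hardcoded rows copied from the trace output above (no live cluster or driver is assumed here; in practice the rows would come from a SELECT against system_traces.sessions):

```python
# Sample rows mirroring the system_traces.sessions output shown in the ticket.
# 'parameters' is a map<text,text> column, so a driver returns it as a dict.
sample_rows = [
    {"duration": 666, "request": "Execute CQL3 prepared query",
     "parameters": {"consistency_level": "ONE", "page_size": "5000",
                    "query": "SELECT * FROM test.test2 WHERE id=? LIMIT 1",
                    "serial_consistency_level": "SERIAL"}},
    {"duration": 109, "request": "Preparing CQL3 query",
     "parameters": {"query": "SELECT * FROM test.test2 WHERE id=? LIMIT 1"}},
    {"duration": 857, "request": "Execute CQL3 prepared query",
     "parameters": {"consistency_level": "QUORUM", "page_size": "5000",
                    "query": "INSERT INTO test.test2(id,value) VALUES (?,?)",
                    "serial_consistency_level": "SERIAL"}},
]

def slowest_prepared(rows):
    """Return (query, duration) pairs for prepared executions, slowest first."""
    execs = [r for r in rows if r["request"] == "Execute CQL3 prepared query"]
    execs.sort(key=lambda r: r["duration"], reverse=True)
    return [(r["parameters"]["query"], r["duration"]) for r in execs]

for query, micros in slowest_prepared(sample_rows):
    print(f"{micros:>6} us  {query}")
```

Before this change, the "Execute CQL3 prepared query" rows carried no 'query' key at all, so the lookup in the last line of slowest_prepared() had nothing to read.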
[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082979#comment-15082979 ] Yasuharu Goto commented on CASSANDRA-10875: --- Oh, I've overlooked the commit. Thank you for your review and merge! [~pauloricardomg] [~snazy] > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.13, 2.2.5, 3.0.3 > > Attachments: 10875-2.1-2.txt, 10875-2.1-3.txt, 10875-2.1.12.txt, > 10875-2.2.txt, 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
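[Editor's note] The error in the report is a plain codec mismatch: the UTF-8 bytes of '日本語' are fed to Python's ASCII decoder. A short, self-contained Python sketch of the same mismatch (using modern Python 3 encode/decode semantics rather than cqlsh's Python 2 internals):

```python
# The bytes for '日本語' encoded as UTF-8 -- the same bytes cqlsh handles.
data = "日本語".encode("utf-8")

# Decoding them as ASCII fails exactly like the cqlsh error in the report:
try:
    data.decode("ascii")
except UnicodeDecodeError as e:
    print(e)  # 'ascii' codec can't decode byte 0xe6 in position 0: ...

# Decoding with the correct codec round-trips cleanly:
assert data.decode("utf-8") == "日本語"
assert data[0] == 0xe6  # the very byte named in the cqlsh error message
```

The reported position differs (35 vs 0) only because cqlsh was decoding the whole statement, not just the literal.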
[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: 10875-2.1-2.txt > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, > 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: 10875-2.2.txt > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, > 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070499#comment-15070499 ] Yasuharu Goto commented on CASSANDRA-10875: --- [~pauloricardomg] Thank you for your great review! (And sorry to be late) I've updated my patches to 10875-2.1-2.txt and 10875-2.2.txt. I could merge 10875-2.2 to 2.2,3.0, and trunk. > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, > 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063294#comment-15063294 ] Yasuharu Goto commented on CASSANDRA-10875: --- Thank you for your response! I didn't notice --encoding option. I checked --help and --encoding option. In cassandra-2.1.9, cqlsh doesn't have --encoding option. {noformat} $ cqlsh --help Usage: cqlsh [options] [host [port]] CQL Shell for Apache Cassandra Options: --version show program's version number and exit -h, --helpshow this help message and exit -C, --color Always use color output --no-colorNever use color output -u USERNAME, --username=USERNAME Authenticate as user. -p PASSWORD, --password=PASSWORD Authenticate using password. -k KEYSPACE, --keyspace=KEYSPACE Authenticate to the given keyspace. -f FILE, --file=FILE Execute commands from FILE, then exit -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY Use the provided Thrift transport factory function. --debug Show additional debugging information --cqlversion=CQLVERSION Specify a particular CQL version (default: 3). Examples: "2", "3.0.0-beta1" -2, --cql2Shortcut notation for --cqlversion=2 -3, --cql3Shortcut notation for --cqlversion=3 Connects to localhost:9160 by default. These defaults can be changed by setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number) are given on the command line, they take precedence over any defaults. $ cqlsh --encode=utf8 Usage: cqlsh [options] [host [port]] cqlsh: error: no such option: --encode {noformat} In Cassandra-3.0.0, cqlsh has it. But the help says encoding is utf8 already. {noformat} ./cqlsh --help Usage: cqlsh.py [options] [host [port]] CQL Shell for Apache Cassandra Options: --version show program's version number and exit -h, --helpshow this help message and exit -C, --color Always use color output --no-colorNever use color output --ssl Use SSL -u USERNAME, --username=USERNAME Authenticate as user. 
-p PASSWORD, --password=PASSWORD Authenticate using password. -k KEYSPACE, --keyspace=KEYSPACE Authenticate to the given keyspace. -f FILE, --file=FILE Execute commands from FILE, then exit --debug Show additional debugging information --encoding=ENCODING Specify a non-default encoding for output. If you are experiencing problems with unicode characters, using utf8 may fix the problem. (Default from system preferences: UTF-8) --cqlshrc=CQLSHRC Specify an alternative cqlshrc file location. --cqlversion=CQLVERSION Specify a particular CQL version (default: 3.3.1). Examples: "3.0.3", "3.1.0" -e EXECUTE, --execute=EXECUTE Execute the statement and quit. --connect-timeout=CONNECT_TIMEOUT Specify the connection timeout in seconds (default: 5 seconds). Connects to 127.0.0.1:9042 by default. These defaults can be changed by setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number) are given on the command line, they take precedence over any defaults. {noformat} But, cqlsh --encoding=utf8 doesn't seem to work correctly. {noformat} ./cqlsh --encoding=utf8 Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.0.0 | CQL spec 3.3.1 | Native protocol v4] Use HELP for help. cqlsh> select * from test where id='日本語'; 'ascii' codec can't decode byte 0xe6 in position 29: ordinal not in range(128) cqlsh> {noformat} > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.13, 3.1 > > Attachments: 10875-2.1.12.txt, 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. 
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text
[jira] [Comment Edited] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063294#comment-15063294 ] Yasuharu Goto edited comment on CASSANDRA-10875 at 12/18/15 2:14 AM: - Thank you for your response! I didn't notice --encoding option. I checked --help and --encoding option. In cassandra-2.1.9, cqlsh doesn't have --encoding option. {noformat} $ cqlsh --help Usage: cqlsh [options] [host [port]] CQL Shell for Apache Cassandra Options: --version show program's version number and exit -h, --helpshow this help message and exit -C, --color Always use color output --no-colorNever use color output -u USERNAME, --username=USERNAME Authenticate as user. -p PASSWORD, --password=PASSWORD Authenticate using password. -k KEYSPACE, --keyspace=KEYSPACE Authenticate to the given keyspace. -f FILE, --file=FILE Execute commands from FILE, then exit -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY Use the provided Thrift transport factory function. --debug Show additional debugging information --cqlversion=CQLVERSION Specify a particular CQL version (default: 3). Examples: "2", "3.0.0-beta1" -2, --cql2Shortcut notation for --cqlversion=2 -3, --cql3Shortcut notation for --cqlversion=3 Connects to localhost:9160 by default. These defaults can be changed by setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number) are given on the command line, they take precedence over any defaults. $ cqlsh --encode=utf8 Usage: cqlsh [options] [host [port]] cqlsh: error: no such option: --encode {noformat} In Cassandra-3.0.0, cqlsh has it. But the help says encoding is utf8 already. {noformat} ./cqlsh --help Usage: cqlsh.py [options] [host [port]] CQL Shell for Apache Cassandra Options: --version show program's version number and exit -h, --helpshow this help message and exit -C, --color Always use color output --no-colorNever use color output --ssl Use SSL -u USERNAME, --username=USERNAME Authenticate as user. 
-p PASSWORD, --password=PASSWORD Authenticate using password. -k KEYSPACE, --keyspace=KEYSPACE Authenticate to the given keyspace. -f FILE, --file=FILE Execute commands from FILE, then exit --debug Show additional debugging information --encoding=ENCODING Specify a non-default encoding for output. If you are experiencing problems with unicode characters, using utf8 may fix the problem. (Default from system preferences: UTF-8) --cqlshrc=CQLSHRC Specify an alternative cqlshrc file location. --cqlversion=CQLVERSION Specify a particular CQL version (default: 3.3.1). Examples: "3.0.3", "3.1.0" -e EXECUTE, --execute=EXECUTE Execute the statement and quit. --connect-timeout=CONNECT_TIMEOUT Specify the connection timeout in seconds (default: 5 seconds). Connects to 127.0.0.1:9042 by default. These defaults can be changed by setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number) are given on the command line, they take precedence over any defaults. {noformat} Furthermore, cqlsh --encoding=utf8 doesn't seem to work correctly. {noformat} ./cqlsh --encoding=utf8 Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.0.0 | CQL spec 3.3.1 | Native protocol v4] Use HELP for help. cqlsh> select * from test where id='日本語'; 'ascii' codec can't decode byte 0xe6 in position 29: ordinal not in range(128) cqlsh> {noformat} was (Author: yasuharu): Thank you for your response! I didn't notice --encoding option. I checked --help and --encoding option. In cassandra-2.1.9, cqlsh doesn't have --encoding option. {noformat} $ cqlsh --help Usage: cqlsh [options] [host [port]] CQL Shell for Apache Cassandra Options: --version show program's version number and exit -h, --helpshow this help message and exit -C, --color Always use color output --no-colorNever use color output -u USERNAME, --username=USERNAME Authenticate as user. -p PASSWORD, --password=PASSWORD Authenticate using password. -k KEYSPACE, --keyspace=KEYSPACE Authenticate to the given keyspace. 
-f FILE, --file=FILE Execute commands from FILE, then exit -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY
[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Summary: cqlsh fails to decode utf-8 characters for text typed columns. (was: cqlsh decodes text column values as ascii in SELECT statements.) > cqlsh fails to decode utf-8 characters for text typed columns. > -- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.13, 3.1 > > Attachments: 10875-2.1.12.txt, 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Summary: cqlsh decodes text column values as ascii in SELECT clause. (was: cqlsh decodes text as ascii in SELECT clause.) > cqlsh decodes text column values as ascii in SELECT clause. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10875) cqlsh decodes text as ascii in SELECT clause.
Yasuharu Goto created CASSANDRA-10875: - Summary: cqlsh decodes text as ascii in SELECT clause. Key: CASSANDRA-10875 URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Minor Hi, we've found a bug that cqlsh can't handle unicode text in select conditions even if it were text type. {noformat} $ ./bin/cqlsh Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] Use HELP for help. cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> create table test.test(txt text primary key); cqlsh> insert into test.test (txt) values('日本語'); cqlsh> select * from test.test where txt='日本語'; 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) cqlsh> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: 10875-3.1.txt > cqlsh decodes text column values as ascii in SELECT clause. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: (was: 10875-3.1.txt) > cqlsh decodes text column values as ascii in SELECT clause. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: 10875-3.1.txt > cqlsh decodes text column values as ascii in SELECT clause. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Attachments: 10875-2.1.12.txt, 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Attachment: 10875-2.1.12.txt > cqlsh decodes text column values as ascii in SELECT clause. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Attachments: 10875-2.1.12.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT statements.
[ https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-10875: -- Summary: cqlsh decodes text column values as ascii in SELECT statements. (was: cqlsh decodes text column values as ascii in SELECT clause.) > cqlsh decodes text column values as ascii in SELECT statements. > --- > > Key: CASSANDRA-10875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10875 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Yasuharu Goto >Assignee: Yasuharu Goto >Priority: Minor > Fix For: 2.1.13, 3.1 > > Attachments: 10875-2.1.12.txt, 10875-3.1.txt > > > Hi, we've found a bug that cqlsh can't handle unicode text in select > conditions even if it were text type. > {noformat} > $ ./bin/cqlsh > Connected to Test Cluster at 127.0.0.1:9042. > [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4] > Use HELP for help. > cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > cqlsh> create table test.test(txt text primary key); > cqlsh> insert into test.test (txt) values('日本語'); > cqlsh> select * from test.test where txt='日本語'; > 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128) > cqlsh> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9779) Append-only optimization
[ https://issues.apache.org/jira/browse/CASSANDRA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984278#comment-14984278 ] Yasuharu Goto commented on CASSANDRA-9779: -- How about WITH INSERTS ONLY option for each columns? In our use case, we have mutable and immutable columns in a table and we're indexing only immutable columns manually now. We'll be happy if this optimization could be applied to our app. > Append-only optimization > > > Key: CASSANDRA-9779 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9779 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: Jonathan Ellis > Fix For: 3.x > > > Many common workloads are append-only: that is, they insert new rows but do > not update existing ones. However, Cassandra has no way to infer this and so > it must treat all tables as if they may experience updates in the future. > If we added syntax to tell Cassandra about this ({{WITH INSERTS ONLY}} for > instance) then we could do a number of optimizations: > - Compaction would only need to worry about defragmenting partitions, not > rows. We could default to DTCS or similar. > - CollationController could stop scanning sstables as soon as it finds a > matching row > - Most importantly, materialized views wouldn't need to worry about deleting > prior values, which would eliminate the majority of the MV overhead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703391#comment-14703391 ] Yasuharu Goto commented on CASSANDRA-9898: -- Ping [~carlyeks]. What should I do as a next step? cqlsh crashes if it load a utf-8 file. -- Key: CASSANDRA-9898 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898 Project: Cassandra Issue Type: Bug Components: Tools Environment: linux, os x yosemite. Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Minor Labels: cqlsh Fix For: 2.1.x, 2.2.x Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt cqlsh crashes when it load a cql script file encoded in utf-8. This is a reproduction procedure. {noformat} $cat ./test.cql // 日本語のコメント use system; select * from system.peers; $cqlsh --version cqlsh 5.0.1 $cqlsh -f ./test.cql Traceback (most recent call last): File ./cqlsh, line 2459, in module main(*read_options(sys.argv[1:], os.environ)) File ./cqlsh, line 2451, in main shell.cmdloop() File ./cqlsh, line 940, in cmdloop line = self.get_input_line(self.prompt) File ./cqlsh, line 909, in get_input_line self.lastcmd = self.stdin.readline() File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py, line 675, in readline return self.reader.readline(size) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py, line 530, in readline data = self.read(readsize, firstline=True) File /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py, line 477, in read newchars, decodedbytes = self.decode(data, self.errors) UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: ordinal not in range(128) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
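[Editor's note] The traceback in this ticket bottoms out in codecs.py: cqlsh wraps the script file in a StreamReader whose default codec is ASCII, so readline() dies on the first multi-byte character. A runnable Python sketch reproducing that failure mode and the fix (the temp-file path is illustrative, not from the report):

```python
import codecs
import os
import tempfile

# Write a small CQL script containing a UTF-8 comment, as in the report.
script = "// 日本語のコメント\nuse system;\nselect * from system.peers;\n"
path = os.path.join(tempfile.mkdtemp(), "test.cql")
with open(path, "wb") as f:
    f.write(script.encode("utf-8"))

# Reading it through an ASCII StreamReader fails on readline(), matching
# the UnicodeDecodeError in the quoted traceback:
try:
    with codecs.open(path, encoding="ascii") as f:
        f.readline()
    raise AssertionError("expected a UnicodeDecodeError")
except UnicodeDecodeError as e:
    print(e)  # 'ascii' codec can't decode byte 0xe6 ...

# Opening the stream with the file's real encoding reads the line cleanly:
with codecs.open(path, encoding="utf-8") as f:
    assert f.readline().strip() == "// 日本語のコメント"
```

The patches attached to the ticket fix this on the cqlsh side, so scripts do not need to be re-encoded to ASCII.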
[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-9898: - Description: updated to append the final UnicodeDecodeError line to the traceback (the full description and traceback are quoted in the comment above).
[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654600#comment-14654600 ] Yasuharu Goto commented on CASSANDRA-9898: -- Oops, I hadn't pasted the last line of my error log. I've updated the description.
[jira] [Comment Edited] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654651#comment-14654651 ] Yasuharu Goto edited comment on CASSANDRA-9898 at 8/5/15 1:33 AM: -- Hmm, I had seen that ticket, but at the time I thought it was a different issue from mine because their error log looks so different. Now I agree with you: in my brief test, their repro case appears to be fixed by my patch.
[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654651#comment-14654651 ] Yasuharu Goto commented on CASSANDRA-9898: -- Hmm, I had seen that ticket, but at the time I thought it was a different issue from mine because their error log looks so different. Now I agree with you: in my brief test, their repro case appears to be fixed by my patch.
[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-9898: - Attachment: cassandra-2.2-9898.txt
[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-9898: - Assignee: Yuki Morishita
[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-9898: - Attachment: cassandra-2.1-9898.txt Environment: linux, os x yosemite. Reproduced In: 2.1.8, 2.2.0 rc2 (was: 2.2.0 rc2, 2.1.8)
[jira] [Created] (CASSANDRA-9898) cqlsh crashes if it loads a utf-8 file.
Yasuharu Goto created CASSANDRA-9898: Summary: cqlsh crashes if it loads a utf-8 file. Key: CASSANDRA-9898 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Yasuharu Goto Priority: Minor (description and reproduction procedure as quoted in the comments above)
[jira] [Commented] (CASSANDRA-7469) RejectedExecutionException causes orphan SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048407#comment-14048407 ] Yasuharu Goto commented on CASSANDRA-7469: -- I agree with you. I'm going to upgrade my cluster. RejectedExecutionException causes orphan SSTables - Key: CASSANDRA-7469 URL: https://issues.apache.org/jira/browse/CASSANDRA-7469 Project: Cassandra Issue Type: Bug Components: Core Reporter: Yasuharu Goto Priority: Minor I noticed that some old SSTables are not deleted and remain in the data dir. They are never compacted. {code} ./ks2-cf2-he-9690-Data.db ./ks2-cf2-he-9691-Data.db ./ks2-cf2-he-9679-Data.db- current version id ./ks2-cf2-he-205-Data.db- very old version id ./ks2-cf2-he-201-Data.db ./ks2-cf2-he-202-Data.db ./ks2-cf2-he-203-Data.db {code} And I noticed that a RejectedExecutionException causes these orphan SSTables. {code} ... INFO 18:51:45,323 DRAINING: starting drain process INFO 18:51:45,324 Stop listening to thrift clients ... # This compaction is not finished. It is terminated by the following exception and never retried, so these SSTables are never deleted. INFO 18:51:46,512 Compacting [SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-205-Data.db'), SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-203-Data.db'), SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-202-Data.db'), SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-201-Data.db')] ... # This compaction is finished. These SSTables don't become orphans. INFO 18:51:46,641 Compacting [SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-90-Data.db'), SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-89-Data.db'), SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-88-Data.db'), SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-87-Data.db')] INFO 18:51:46,736 Compacted to [/var/cassandra/data/ks1/cf1/ks1-cf1-he-91-Data.db,]. 
370,606 to 317,566 (~85% of original) bytes for 193 keys at 3.187943MB/s. Time: 95ms. INFO 18:51:46,836 DRAINED ERROR 18:51:49,807 Exception in thread Thread[CompactionExecutor:1927,1,RMI Runtime] java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32b5a2c6 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@32d18f2c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 3043] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2013) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816) at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325) at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530) at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:629) at org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:67) at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:806) at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:358) at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:330) at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:324) at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:253) at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:992) at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200) at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at 
java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) INFO 18:52:54,010 Cassandra shutting down... {code} As a result of a log survey, we found some orphan SSTables caused by RejectedExecutionException. Maybe I can fix each orphan file by nodetool refresh, but I'd like to ask whether this is a problem that has already been solved in a later release.
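The root cause in the log above is generic to executor lifecycles: once the executor has been terminated during drain, any cleanup task submitted afterwards is rejected outright, so the deletion never runs. A small Python analogue of the pattern (the Cassandra code is Java; this sketch only illustrates the rejection behaviour, not the actual SSTableDeletingTask code):

```python
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)
pool.submit(print, "compacting...").result()  # a normal task completes fine

pool.shutdown()  # analogous to the executor being terminated during drain

try:
    # Analogous to the deleting task being scheduled after DRAINED:
    # the terminated pool rejects the task, so the cleanup never happens.
    pool.submit(print, "deleting old sstables")
except RuntimeError as exc:
    print("rejected:", exc)  # Python's analogue of RejectedExecutionException
```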
[jira] [Created] (CASSANDRA-7469) RejectedExecutionException causes orphan SSTables
Yasuharu Goto created CASSANDRA-7469: Summary: RejectedExecutionException causes orphan SSTables Key: CASSANDRA-7469 URL: https://issues.apache.org/jira/browse/CASSANDRA-7469 Project: Cassandra Issue Type: Bug Components: Core Reporter: Yasuharu Goto Priority: Minor (description and logs as quoted in the comment above) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999554#comment-13999554 ] Yasuharu Goto commented on CASSANDRA-7210: -- [~mishail] Thank you for your review and commit! Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-7210: - Attachment: trunk-7210-2.txt [~mishail] Oops, I've fixed it. Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-7210: - Attachment: 2.0-7210.txt Thanks. I think 2.0-7210.txt applies correctly to 2.0. Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9 Attachments: 2.0-7210.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-7210: - Attachment: 2.0-7210-2.txt Oh, sorry. I was debugging via Eclipse, so I didn't notice that. I've now fixed it and verified it via bin/nodetool. Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option
[ https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-2238: - Attachment: trunk-2238.txt This issue has already been fixed for 'nodetool status'. I'd like to add the --resolve-ip option to 'nodetool ring' too. Allow nodetool to print out hostnames given an option - Key: CASSANDRA-2238 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Joaquin Casares Assignee: Daneel S. Yaitskov Priority: Trivial Fix For: 1.2.14, 2.0.5 Attachments: trunk-2238.txt Give nodetool the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option
[ https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995272#comment-13995272 ] Yasuharu Goto commented on CASSANDRA-2238: -- OK, I opened a new ticket, CASSANDRA-7210. Thank you. Allow nodetool to print out hostnames given an option - Key: CASSANDRA-2238 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Joaquin Casares Assignee: Daneel S. Yaitskov Priority: Trivial Fix For: 1.2.14, 2.0.5 Attachments: trunk-2238.txt Give nodetool the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-7210: - Attachment: trunk-7210.txt CASSANDRA-2238 added the --resolve-ip option, which allows 'nodetool status' to print out node hostnames. I'd like to add this option to 'nodetool ring' too. Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Priority: Trivial Attachments: trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
Yasuharu Goto created CASSANDRA-7210: Summary: Add --resolve-ip option on 'nodetool ring' Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Priority: Trivial Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
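In effect, --resolve-ip performs a reverse DNS lookup per node address when rendering the ring. A tiny Python illustration of that behaviour (display_address is a hypothetical helper for illustration, not the nodetool Java implementation):

```python
import socket

def display_address(ip: str, resolve: bool = False) -> str:
    """Hypothetical helper mirroring --resolve-ip: return the reverse-DNS
    hostname when resolution is requested, falling back to the raw IP
    if the lookup fails."""
    if not resolve:
        return ip
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip

print(display_address("127.0.0.1"))                # default: raw IP
print(display_address("127.0.0.1", resolve=True))  # typically a hostname such as 'localhost'
```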