[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991984#comment-15991984
 ] 

Yasuharu Goto commented on CASSANDRA-13348:
---

[~dikanggu] Thank you! It sounds like our cluster can avoid this issue. I'm 
going to upgrade to 3.0.13!

> Duplicate tokens after bootstrap
> 
>
> Key: CASSANDRA-13348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13348
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Assignee: Dikang Gu
>Priority: Blocker
> Fix For: 3.0.x
>
>
> This one is a bit scary, and probably results in data loss. After a bootstrap 
> of a few new nodes into an existing cluster, two new nodes have chosen some 
> overlapping tokens.
> In fact, of the 256 tokens chosen, 51 tokens were already in use on the other 
> node.
> Node 1 log :
> {noformat}
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,461 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,462 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,564 
> TokenAllocation.java:61 - Selected tokens [, 2959334889475814712, 
> 3727103702384420083, 7183119311535804926, 6013900799616279548, 
> -1222135324851761575, 1645259890258332163, -1213352346686661387, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> WARN  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:43,729 
> TokenAllocation.java:70 - Unexpected growth in standard deviation after 
> allocation.
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:42:44,150 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(107)-127.0.0.1] 2017-03-09 07:43:14,151 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
> Node 2 log:
> {noformat}
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:51,937 
> StorageService.java:971 - Joining ring by operator request
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for ring information
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for schema information to complete
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: schema complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,513 
> StorageService.java:1160 - JOINING: waiting for pending range calculation
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: calculation complete, ready to bootstrap
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,514 
> StorageService.java:1160 - JOINING: getting bootstrap token
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,630 
> TokenAllocation.java:61 - Selected tokens [.., 2890709530010722764, 
> -2416006722819773829, -5820248611267569511, -5990139574852472056, 
> 1645259890258332163, 9135021011763659240, -5451286144622276797, 
> 7604192574911909354]
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,794 
> TokenAllocation.java:65 - Replicated node load in datacentre before 
> allocation max 1.02 min 0.98 stddev 0.
> WARN  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:52,795 
> TokenAllocation.java:66 - Replicated node load in datacentre after allocation 
> max 1.00 min 1.00 stddev 0.
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:55:53,149 
> StorageService.java:1160 - JOINING: sleeping 3 ms for pending range setup
> INFO  [RMI TCP Connection(380)-127.0.0.1] 2017-03-17 15:56:23,149 
> StorageService.java:1160 - JOINING: Starting to bootstrap...
> {noformat}
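Only the tails of the 256-token lists survive in the truncated log excerpts above, but even there the collision is visible. As an illustration (plain Python, not part of Cassandra), a set intersection over the tokens shown finds two shared tokens:

```python
# Tokens copied from the two bootstrap log excerpts above (the full
# 256-token lists are truncated in the logs; only these survive).
node1_tokens = {
    2959334889475814712, 3727103702384420083, 7183119311535804926,
    6013900799616279548, -1222135324851761575, 1645259890258332163,
    -1213352346686661387, 7604192574911909354,
}
node2_tokens = {
    2890709530010722764, -2416006722819773829, -5820248611267569511,
    -5990139574852472056, 1645259890258332163, 9135021011763659240,
    -5451286144622276797, 7604192574911909354,
}

# A correct allocator must never hand the same token to two distinct nodes.
duplicates = node1_tokens & node2_tokens
print(sorted(duplicates))  # [1645259890258332163, 7604192574911909354]
```

Even in these partial lists, two of the eight visible tokens on each node collide, consistent with the reported 51-of-256 overlap.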

[jira] [Comment Edited] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990685#comment-15990685
 ] 

Yasuharu Goto edited comment on CASSANDRA-13348 at 5/1/17 9:00 AM:
---

Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster 
from 2.1 to 3.0.13.
Our 3.0 clusters do not enable allocate_tokens_for_keyspace for now. Under 
your theory, would this issue affect C* clusters with 
allocate_tokens_for_keyspace = null?


was (Author: yasuharu):
Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster 
from 2.1 to 3.0.14.
Our 3.0 clusters do not enable allocate_tokens_for_keyspace for now. Under 
your theory, would this issue affect C* clusters with 
allocate_tokens_for_keyspace = null?

Thank you.


[jira] [Commented] (CASSANDRA-13348) Duplicate tokens after bootstrap

2017-05-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990685#comment-15990685
 ] 

Yasuharu Goto commented on CASSANDRA-13348:
---

Hi [~dikanggu], we're now preparing to upgrade our production Cassandra cluster 
from 2.1 to 3.0.14.
Our 3.0 clusters do not enable allocate_tokens_for_keyspace for now. Under 
your theory, would this issue affect C* clusters with 
allocate_tokens_for_keyspace = null?

Thank you.


[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-02-21 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877379#comment-15877379
 ] 

Yasuharu Goto commented on CASSANDRA-13125:
---

[~mnantern] In our case, {{nodetool scrub}} (C* 3.0.9 or later) fixed our 
broken sstables.

> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
> 
>
> Key: CASSANDRA-13125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Zhongxiang Zheng
>Assignee: Sylvain Lebresne
>Priority: Critical
> Fix For: 3.0.11, 3.11.0
>
> Attachments: diff-a.patch, diff-b.patch
>
>
> I found that rows are split and duplicated after upgrading the cluster 
> from 2.1.x to 3.0.x.
> The problem can be reproduced as follows.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s   
> 
> Current cluster is now: test
> $ ccm node1 cqlsh  -e "CREATE KEYSPACE test WITH replication = 
> {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 
> set<text>, value2 set<text>);"
> # Upgrade node1
> $ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm 
> node${i} start;ccm node${i} nodetool upgradesstables; done
> # Insert a row through node1(3.0.10)
> $ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values 
> ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"   
> # Insert a row through node2(2.1.16)
> $ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values 
> ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});" 
> # The row inserted from node1 is splitting
> $ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
>  id  | value1 | value2
> -++
>  aaa |   null |   null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> $ for i in 1 2; do ccm node${i} nodetool flush; done
> # Results of sstable2json of node2. The row inserted from node1(3.0.10) is 
> different from the row inserted from node2(2.1.16).
> $ ccm node2 json -k test -c test
> running
> ['/home/zzheng/.ccm/test/node2/data0/test/test-5406ee80dbdb11e6a175f57c4c7c85f3/test-test-ka-1-Data.db']
> -- test-test-ka-1-Data.db -
> [
> {"key": "aaa",
>  "cells": [["","",1484564624769577],
>["value1","value2:!",1484564624769576,"t",1484564624],
>["value1:616161","",1484564624769577],
>["value1:626262","",1484564624769577],
>["value2:636363","",1484564624769577],
>["value2:646464","",1484564624769577]]},
> {"key": "bbb",
>  "cells": [["","",1484564634508029],
>["value1:_","value1:!",1484564634508028,"t",1484564634],
>["value1:616161","",1484564634508029],
>["value1:626262","",1484564634508029],
>["value2:_","value2:!",1484564634508028,"t",1484564634],
>["value2:636363","",1484564634508029],
>["value2:646464","",1484564634508029]]}
> ]
> # Upgrade node2,3
> $ for i in `seq 2 3`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm 
> node${i} start;ccm node${i} nodetool upgradesstables; done
> # After upgrade node2,3, the row inserted from node1 is splitting in node2,3
> $ ccm node2 cqlsh -e "SELECT * FROM test.test ;"  
>   
>  id  | value1 | value2
> -++
>  aaa |   null |   null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> (3 rows)
> # Results of sstabledump
> # node1
> [
>   {
> "partition" : {
>   "key" : [ "aaa" ],
>   "position" : 0
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 17,
> "liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
> "cells" : [
>   { "name" : "value1", "deletion_info" : { "marked_deleted" : 
> "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } 
> },
>   { "name" : "value1", "path" : [ "aaa" ], "value" : "" },
>   { "name" : "value1", "path" : [ "bbb" ], "value" : "" },
>   { "name" : "value2", "deletion_info" : { "marked_deleted" : 
> "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } 
> },
>   { "name" : "value2", "path" : [ "ccc" ], "value" : "" },
>   { "name" : "value2", "path" : [ "ddd" ], "value" : "" }
> ]
>   }
> ]
>   },
>   {
> "partition" : {
>   "key" : [ "bbb" ],
>   "position" : 48
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 65,
> 
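As an aside on reading the sstable2json output quoted above: the cell names such as {{value1:616161}} carry the set elements hex-encoded. Decoding them (plain Python, for illustration) confirms they are exactly the values inserted in the reproduction:

```python
# The sstable2json cell names above ("value1:616161", "value2:636363", ...)
# are hex-encoded set elements; they decode to the inserted values.
for hexname, expected in [("616161", "aaa"), ("626262", "bbb"),
                          ("636363", "ccc"), ("646464", "ddd")]:
    decoded = bytes.fromhex(hexname).decode("ascii")
    assert decoded == expected
print("all cell names decode to the inserted set elements")
```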

[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2017-02-17 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873020#comment-15873020
 ] 

Yasuharu Goto commented on CASSANDRA-8844:
--

[~jbellis] Sorry, it seems I accidentally changed the assignee with a 
keyboard shortcut. Thank you for the fix.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]) propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - It should be possible to write a sophisticated daemon that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - 
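The consumer behaviour the proposal describes (logfiles processed in name order, with a checkpoint artifact left in the CDC directory so the daemon can resume) can be sketched as below. The directory layout, file names, and checkpoint format here are hypothetical, invented for illustration; they are not Cassandra's actual CDC format:

```python
import os

def consume(cdc_dir, handle_mutation):
    """Hypothetical CDC consumer: replay logfiles in name order, checkpointing
    progress to a file in the CDC directory so a restart resumes cleanly."""
    ckpt_path = os.path.join(cdc_dir, ".consumer_checkpoint")
    done = set()
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            done = set(f.read().split())
    # Predictable naming schema -> sorted() yields written order.
    for name in sorted(os.listdir(cdc_dir)):
        if not name.endswith(".log") or name in done:
            continue
        with open(os.path.join(cdc_dir, name)) as f:
            for line in f:
                handle_mutation(line.rstrip("\n"))
        done.add(name)  # checkpoint after each fully consumed file
        with open(ckpt_path, "w") as f:
            f.write("\n".join(sorted(done)))
```

Re-invoking {{consume}} after a crash skips already-checkpointed files, giving the deliver-at-least-once semantics described above (a mutation may be redelivered only if the crash lands mid-file).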

[jira] [Assigned] (CASSANDRA-8844) Change Data Capture (CDC)

2017-02-17 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto reassigned CASSANDRA-8844:


Assignee: Yasuharu Goto  (was: Joshua McKenzie)


[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-02-08 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857829#comment-15857829
 ] 

Yasuharu Goto commented on CASSANDRA-13125:
---

Thank you guys! :)


[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-20 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832169#comment-15832169
 ] 

Yasuharu Goto commented on CASSANDRA-13125:
---

Thank you [~slebresne]! I confirmed that your patches work properly with my 
reproduction procedure; the results are below. C* 3.0 now generates 
{{[c-c:!][d-d:!]}}-style range tombstones, and the rows are no longer broken!

h4. On 13125-3.0
{noformat}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b | c  | d  | e
----+---+--------+--------+---
 14 | 1 | {2, 3} | {4, 5} | 6

RangeTombstone(0), 
start:org.apache.cassandra.db.composites.CompoundSparseCellName@78e3b54a, 
end:org.apache.cassandra.db.composites.BoundedComposite@6b517b57, 
markedAt:1484931972134334, delTime:1484931972
RangeTombstone(1), 
start:org.apache.cassandra.db.composites.CompoundSparseCellName@78e3b54b, 
end:org.apache.cassandra.db.composites.BoundedComposite@6b517b58, 
markedAt:1484931972134334, delTime:1484931972
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, 
ranges=[c-c:!, deletedAt=1484931972134334, localDeletion=1484931972][d-d:!, 
deletedAt=1484931972134334, localDeletion=1484931972]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', 
modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, 
localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484931972134334, 
localDeletion=1484931972][d-d:!, deletedAt=1484931972134334, 
localDeletion=1484931972]}- 
[:false:0@1484931972134335,b:false:4@1484931972134335,c:0002:false:0@1484931972134335,c:0003:false:0@1484931972134335,d:0004:false:0@1484931972134335,d:0005:false:0@1484931972134335,e:false:4@1484931972134335,])]),
 verb:MUTATION, version:8
{noformat}

h4. On 13125-3.11
{noformat}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;   
 a  | b | c  | d  | e
----+---+--------+--------+---
 14 | 1 | {2, 3} | {4, 5} | 6

Mutation.deserialize() size==1
RangeTombstone(0), 
start:org.apache.cassandra.db.composites.CompoundSparseCellName@4316af5d, 
end:org.apache.cassandra.db.composites.BoundedComposite@256e93b0, 
markedAt:1484933162431359, delTime:1484933162
RangeTombstone(1), 
start:org.apache.cassandra.db.composites.CompoundSparseCellName@4316af5e, 
end:org.apache.cassandra.db.composites.BoundedComposite@256e93b1, 
markedAt:1484933162431359, delTime:1484933162
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, 
ranges=[c-c:!, deletedAt=1484933162431359, localDeletion=1484933162][d-d:!, 
deletedAt=1484933162431359, localDeletion=1484933162]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', 
modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, 
localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484933162431359, 
localDeletion=1484933162][d-d:!, deletedAt=1484933162431359, 
localDeletion=1484933162]}- 
[:false:0@1484933162431360,b:false:4@1484933162431360,c:0002:false:0@1484933162431360,c:0003:false:0@1484933162431360,d:0004:false:0@1484933162431360,d:0005:false:0@1484933162431360,e:false:4@1484933162431360,])]),
 verb:MUTATION, version:8
{noformat}



> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
> 
>
> Key: CASSANDRA-13125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Zhongxiang Zheng
>Assignee: Yasuharu Goto
>Priority: Critical
> Attachments: diff-a.patch, diff-b.patch
>
>
> I found that rows are split and duplicated after upgrading a cluster 
> from 2.1.x to 3.0.x.
> The problem can be reproduced as follows.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s   
> 
> Current cluster is now: test
> $ ccm node1 cqlsh  -e "CREATE KEYSPACE test WITH replication = 
> {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 
> set<text>, value2 set<text>);"

[jira] [Comment Edited] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-18 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829145#comment-15829145
 ] 

Yasuharu Goto edited comment on CASSANDRA-13125 at 1/19/17 1:58 AM:


h2. Investigations...

After some debugging, I found an interesting difference in the serialized 
RangeTombstoneLists between 2.1.16 and 3.0.10.

- I ran 3 Cassandra nodes with some debug prints.
-- 127.0.0.1 (C* 3.0.10)
-- 127.0.0.2 (C* 2.1.16)
-- 127.0.0.3 (C* 2.1.16)
- They have a keyspace and a table already created.
-- CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'}
-- CREATE TABLE test.test ( a int PRIMARY KEY, b int, c set<int>, d set<int>, e 
int )
- I then issue the same INSERT (whose mutation is sent to 127.0.0.2) from 
127.0.0.1 (C* 3.0) and from 127.0.0.3 (C* 2.1), and compare the results.

Insert a row from 127.0.0.1 and scan. (The inserted row (a=14) is broken.)
{code:sql}
cqlsh> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b| c  | d  | e
----+------+--------+--------+------
 14 |1 |   null |   null | null
 14 | null | {2, 3} | {4, 5} |6

(2 rows)
{code}

Then insert from 127.0.0.3 and scan. (Neither a=5 nor a=14 is broken.)
{code:sql}
cqlsh> insert into test.test(a,b,c,d,e)values(5,1,{2,3},{4,5},6);
cqlsh> select * from test.test;

 a  | b | c  | d  | e
----+---+--------+--------+---
  5 | 1 | {2, 3} | {4, 5} | 6
 14 | 1 | {2, 3} | {4, 5} | 6
{code}

Back on 127.0.0.1, scan the table again: a=14 is broken but a=5 is not.
{code:sql}
cqlsh> select * from test.test;

 a  | b| c  | d  | e
----+------+--------+--------+------
  5 |1 | {2, 3} | {4, 5} |6
 14 |1 |   null |   null | null
 14 | null | {2, 3} | {4, 5} |6
{code}

Therefore, it looks like C*3 cannot properly read rows that are stored on a 
C*2 node but were inserted through C*3.

Next, I observed the incoming MUTATIONs on 127.0.0.2, shown below. C* 3.0 sent 
range tombstones like {{[c-c:!][c-d:!]}}, but C* 2.1 sent 
{{[c:_-c:!][d:_-d:!]}}.

{noformat}
> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6); from 127.0.0.1

DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, 
ranges=[c-c:!, deletedAt=1484710273390930, localDeletion=1484710273][c-d:!, 
deletedAt=1484710273390930, localDeletion=1484710273]}
from:/127.0.0.1, payload:Mutation(keyspace='test', key='000e', 
modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, 
localDeletion=2147483647, ranges=[c-c:!, deletedAt=1484710273390930, 
localDeletion=1484710273][c-d:!, deletedAt=1484710273390930, 
localDeletion=1484710273]}- 
[:false:0@1484710273390931,b:false:4@1484710273390931,c:0002:false:0@1484710273390931,c:0003:false:0@1484710273390931,d:0004:false:0@1484710273390931,d:0005:false:0@1484710273390931,e:false:4@1484710273390931,])]),
 verb:MUTATION, version:8

> insert into test.test(a,b,c,d,e) values(14,1,{2,3},{4,5},6); from 127.0.0.3
DeletionInfo:{deletedAt=-9223372036854775808, localDeletion=2147483647, 
ranges=[c:_-c:!, deletedAt=1484710277987556, localDeletion=1484710277][d:_-d:!, 
deletedAt=1484710277987556, localDeletion=1484710277]}
from:/127.0.0.3, payload:Mutation(keyspace='test', key='000e', 
modifications=[ColumnFamily(test -{deletedAt=-9223372036854775808, 
localDeletion=2147483647, ranges=[c:_-c:!, deletedAt=1484710277987556, 
localDeletion=1484710277][d:_-d:!, deletedAt=1484710277987556, 
localDeletion=1484710277]}- 
[:false:0@1484710277987557,b:false:4@1484710277987557,c:0002:false:0@1484710277987557,c:0003:false:0@1484710277987557,d:0004:false:0@1484710277987557,d:0005:false:0@1484710277987557,e:false:4@1484710277987557,])]),
 verb:MUTATION, version:8
{noformat}
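The difference between the two encodings can be sketched with a small toy model (this is an illustration of the bounds observed in the logs above, not Cassandra's actual wire format or classes; the tuple representation is an assumption made for clarity):

```python
# Toy model of the observed range-tombstone bounds. A bound is
# (column, marker). C* 2.1 emitted one tombstone per collection column
# ([c:_-c:!], [d:_-d:!]), while the C* 3.0 legacy path emitted
# [c-c:!] and [c-d:!] -- the second range *starts* at column c
# instead of d.
tombstones_21 = [(("c", "_"), ("c", "!")), (("d", "_"), ("d", "!"))]
tombstones_30 = [(("c", None), ("c", "!")), (("c", None), ("d", "!"))]

def cross_column_ranges(ranges):
    """Return the ranges whose start and end name different columns."""
    return [r for r in ranges if r[0][0] != r[1][0]]

# Every 2.1 range stays within a single collection column...
assert cross_column_ranges(tombstones_21) == []
# ...but the 3.0 encoding contains the suspicious cross-column [c-d] range.
assert cross_column_ranges(tombstones_30) == [(("c", None), ("d", "!"))]
```

Both encodings nominally cover the same cells; the problem is how the cross-column form is handled by the legacy deserialization path, as discussed below.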

h2. Workaround Plan-A

However, LegacyLayout removes {{collectionName}} from a LegacyRangeTombstone 
whose start.bound != end.bound, such as {{[c-d]}}:
https://github.com/apache/cassandra/blob/cassandra-3.0.10/src/java/org/apache/cassandra/db/LegacyLayout.java#L1592-L1599
It seems this removal of collectionName corrupts deserialization of the legacy 
tombstone. After commenting out this else-if block, I could scan the 
table correctly.
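A toy sketch of this bound handling (a Python model, not Cassandra's actual LegacyLayout; the behavior of the commented-out else-if branch is an assumption based on the description above, since the quoted block below is truncated):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LegacyBound:
    """Toy stand-in for a legacy range-tombstone bound: a clustering
    prefix plus an optional collection (column) name."""
    bound: str
    collection_name: Optional[str]

def normalize(start: LegacyBound, stop: LegacyBound, drop_mismatched: bool):
    """Mirror the quoted if/else-if. When exactly one bound carries a
    collection name, drop that name so both sides match (the kept
    branch). With drop_mismatched=True (assumed pre-patch behavior),
    also strip both names when the bounds name *different* collections,
    e.g. a [c-d] range."""
    if (start.collection_name is None) != (stop.collection_name is None):
        if start.collection_name is None:
            stop = LegacyBound(stop.bound, None)
        else:
            start = LegacyBound(start.bound, None)
    elif drop_mismatched and start.collection_name != stop.collection_name:
        # The branch the workaround comments out: a cross-collection
        # [c-d] range loses both collection names here, which appears
        # to corrupt deserialization on the 2.1 side.
        start = LegacyBound(start.bound, None)
        stop = LegacyBound(stop.bound, None)
    return start, stop

# With the branch disabled (the workaround), a [c-d] range keeps its names:
s, e = normalize(LegacyBound("x", "c"), LegacyBound("x", "d"), drop_mismatched=False)
assert (s.collection_name, e.collection_name) == ("c", "d")
# The assumed pre-patch behavior strips them:
s, e = normalize(LegacyBound("x", "c"), LegacyBound("x", "d"), drop_mismatched=True)
assert (s.collection_name, e.collection_name) == (None, None)
```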


{code:java}
if ((start.collectionName == null) != (stop.collectionName == null))
{
if (start.collectionName == null)
stop = new LegacyBound(stop.bound, stop.isStatic, null);
else
start = new LegacyBound(start.bound, start.isStatic, null);
}
/*else if (!Objects.equals(start.collectionName, 
stop.collectionName))
{
// We're in the similar but slightly more complex case where on 
top of the big tombstone
// A, we have 2 (or more) collection tombstones B and C within 
A. So we also end up with
// a 

[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-18 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-13125:
--
Reproduced In: 3.9, 3.0.10  (was: 3.0.10, 3.9)
   Status: Patch Available  (was: Open)

I've submitted a brief patch for reference.


[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-18 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-13125:
--
Attachment: diff-b.patch

A patch for Plan-B on Cassandra-3.0.10.


[jira] [Updated] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-18 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-13125:
--
Attachment: diff-a.patch

A patch for Plan-A on Cassandra 3.0.10.


[jira] [Commented] (CASSANDRA-13125) Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9

2017-01-18 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829145#comment-15829145
 ] 

Yasuharu Goto commented on CASSANDRA-13125:
---

[jira] [Commented] (CASSANDRA-12861) example/triggers build fail.

2016-11-11 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658815#comment-15658815
 ] 

Yasuharu Goto commented on CASSANDRA-12861:
---

Thank you [~slebresne] for your correction. Your version helped me 
understanding Cassandra code :)

> example/triggers build fail.
> 
>
> Key: CASSANDRA-12861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Sylvain Lebresne
>Priority: Trivial
>
> When I tried to build examples/triggers on the trunk branch, I found that 
> "ant jar" fails with the error below.
> (The error is a "cannot find symbol" error on RowUpdateBuilder; the javac 
> messages were emitted under a Japanese locale.)
> {code}
> Buildfile: /Users/yasuharu/git/cassandra/examples/triggers/build.xml
> init:
> [mkdir] Created dir: 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> build:
> [javac] Compiling 1 source file to 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> [javac] warning: Supported source version 'RELEASE_6' from annotation 
> processor 'org.openjdk.jmh.generators.BenchmarkProcessor' less than -source 
> '1.8'
> [javac] 
> /Users/yasuharu/git/cassandra/examples/triggers/src/org/apache/cassandra/triggers/AuditTrigger.java:27:
>  error: cannot find symbol
> [javac] import org.apache.cassandra.db.RowUpdateBuilder;
> [javac]   ^
> [javac]   symbol:   class RowUpdateBuilder
> [javac]   location: package org.apache.cassandra.db
> [javac] 1 error
> [javac] 1 warning
> BUILD FAILED
> /Users/yasuharu/git/cassandra/examples/triggers/build.xml:45: Compile failed; 
> see the compiler error output for details.
> Total time: 1 second
> {code}
> I think the move of RowUpdateBuilder into the test tree broke this build:
> https://github.com/apache/cassandra/commit/26838063de6246e3a1e18062114ca92fb81c00cf
> To fix this, my patch moves RowUpdateBuilder.java back to src:
> https://github.com/apache/cassandra/commit/d133eefe9c5fbebd8d389a9397c3948b8c36bd06
> Could you please review my patch?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12861) example/triggers build fail.

2016-10-30 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12861:
--
Status: Patch Available  (was: Open)

> example/triggers build fail.
> 
>
> Key: CASSANDRA-12861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Trivial
>
> When I tried to build example/trigger on trunk branch, I found that "ant jar" 
> fails with an error like below.
> (Sorry for my language settings for ant. I couldn't find how to change it. 
> The error indicated here is a "cannot find symboll" error of 
> RowUpdateBuilder).
> {code}
> Buildfile: /Users/yasuharu/git/cassandra/examples/triggers/build.xml
> init:
> [mkdir] Created dir: 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> build:
> [javac] Compiling 1 source file to 
> /Users/yasuharu/git/cassandra/examples/triggers/build/classes
> [javac] warning: Supported source version 'RELEASE_6' from annotation 
> processor 'org.openjdk.jmh.generators.BenchmarkProcessor' less than -source 
> '1.8'
> [javac] 
> /Users/yasuharu/git/cassandra/examples/triggers/src/org/apache/cassandra/triggers/AuditTrigger.java:27:
>  error: cannot find symbol
> [javac] import org.apache.cassandra.db.RowUpdateBuilder;
> [javac]   ^
> [javac]   symbol:   class RowUpdateBuilder
> [javac]   location: package org.apache.cassandra.db
> [javac] 1 error
> [javac] 1 warning
> BUILD FAILED
> /Users/yasuharu/git/cassandra/examples/triggers/build.xml:45: Compile failed; 
> see the compiler error output for details.
> Total time: 1 second
> {code}
> I think moving RowUpdateBuilder to the test tree has broken this build.
> https://github.com/apache/cassandra/commit/26838063de6246e3a1e18062114ca92fb81c00cf
> To fix this, my patch moves RowUpdateBuilder.java back to src.
> https://github.com/apache/cassandra/commit/d133eefe9c5fbebd8d389a9397c3948b8c36bd06
> Could you please review my patch?





[jira] [Created] (CASSANDRA-12861) example/triggers build fail.

2016-10-30 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-12861:
-

 Summary: example/triggers build fail.
 Key: CASSANDRA-12861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12861
 Project: Cassandra
  Issue Type: Bug
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial


When I tried to build examples/triggers on the trunk branch, I found that "ant 
jar" fails with the error below.
(Sorry for my language settings for ant; I couldn't find how to change them. 
The error shown here is a "cannot find symbol" error for RowUpdateBuilder.)

{code}
Buildfile: /Users/yasuharu/git/cassandra/examples/triggers/build.xml

init:
[mkdir] Created dir: 
/Users/yasuharu/git/cassandra/examples/triggers/build/classes

build:
[javac] Compiling 1 source file to 
/Users/yasuharu/git/cassandra/examples/triggers/build/classes
[javac] warning: Supported source version 'RELEASE_6' from annotation 
processor 'org.openjdk.jmh.generators.BenchmarkProcessor' less than -source 
'1.8'
[javac] 
/Users/yasuharu/git/cassandra/examples/triggers/src/org/apache/cassandra/triggers/AuditTrigger.java:27:
 error: cannot find symbol
[javac] import org.apache.cassandra.db.RowUpdateBuilder;
[javac]   ^
[javac]   symbol:   class RowUpdateBuilder
[javac]   location: package org.apache.cassandra.db
[javac] 1 error
[javac] 1 warning

BUILD FAILED
/Users/yasuharu/git/cassandra/examples/triggers/build.xml:45: Compile failed; 
see the compiler error output for details.

Total time: 1 second
{code}

I think moving RowUpdateBuilder to the test tree has broken this build.
https://github.com/apache/cassandra/commit/26838063de6246e3a1e18062114ca92fb81c00cf

To fix this, my patch moves RowUpdateBuilder.java back to src.
https://github.com/apache/cassandra/commit/d133eefe9c5fbebd8d389a9397c3948b8c36bd06

Could you please review my patch?





[jira] [Commented] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-10-03 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542399#comment-15542399
 ] 

Yasuharu Goto commented on CASSANDRA-12731:
---

Thank you very much for your review and cleanup, [~snazy]!

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever 
> allocates a (potentially very large) IndexInfo array (up to the number of 
> IndexInfo entries the RowIndexEntry contains) as a cache on every single read 
> path.
> After some experiments using LargePartitionsTest on my MacBook, I got results 
> showing that removing the cache improves performance for large partitions as 
> shown below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with cache by array
> INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 94197 ms
> INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 55112 ms
> INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 46082 ms
> {noformat}
> Code is 
> [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
>  (based on trunk)
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> with array cache(original)
> !screenshot-1.png!
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> without cache
> !screenshot-2.png!
> I also tried several collection containers instead of a plain array, but 
> none of them showed enough improvement to justify keeping a cache mechanism 
> (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
> (ms) ||
> |Original (array) | 62736 | 48562 | 41540 |
> |ConcurrentHashMap | 47597 | 30854 | 18271 |
> |ConcurrentHashMap 2nd trial |44036|26895|17443|
> |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
> |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
> |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> |No Cache | 47579 | 32480 | 18337 |
> |No Cache 2nd trial | 46534 | 27670 | 18700 |
> Code that I used for this comparison is 
> [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
>  LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
> Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from 
> FileIndexInfoRetriever to improve performance on large partitions.
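The LinkedHashCache in the comparison table above is not included in the ticket; here is a minimal sketch of what such a class might look like, assuming only the description "a simple fifo/lru cache" built on LinkedHashMap and the (capacity, limit, fifo/lru) parameters listed in the table. The class name and constructor signature are guesses, not the actual benchmark code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reconstruction of the "LinkedHashCache" used in the comparison:
// a bounded cache on top of LinkedHashMap. accessOrder=false gives FIFO
// eviction, accessOrder=true gives LRU eviction, matching the fifo/lru rows
// in the table above.
public class LinkedHashCache<K, V> extends LinkedHashMap<K, V> {
    private final int limit;

    public LinkedHashCache(int capacity, int limit, boolean lru) {
        // capacity is the initial table size; 0.75f is LinkedHashMap's
        // default load factor.
        super(capacity, 0.75f, lru);
        this.limit = limit;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the eldest entry (first-inserted for FIFO, least-recently-used
        // for LRU) once the cache grows past its limit.
        return size() > limit;
    }

    public static void main(String[] args) {
        LinkedHashCache<Integer, String> fifo = new LinkedHashCache<>(16, 10, false);
        for (int i = 0; i < 12; i++)
            fifo.put(i, "v" + i);
        // Keys 0 and 1 have been evicted; the 10 newest keys remain.
        System.out.println(fifo.size() + " " + fifo.containsKey(0) + " " + fifo.containsKey(11));
    }
}
```

With `accessOrder=true`, a `get` refreshes an entry's position, so the eldest entry becomes the least recently used one rather than the first inserted.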





[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-30 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Assignee: Yasuharu Goto
  Status: Patch Available  (was: Open)

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever 
> allocates a (potentially very large) IndexInfo array (up to the number of 
> IndexInfo entries the RowIndexEntry contains) as a cache on every single read 
> path.
> After some experiments using LargePartitionsTest on my MacBook, I got results 
> showing that removing the cache improves performance for large partitions as 
> shown below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with cache by array
> INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 94197 ms
> INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 55112 ms
> INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 46082 ms
> {noformat}
> Code is 
> [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
>  (based on trunk)
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> with array cache(original)
> !screenshot-1.png!
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> without cache
> !screenshot-2.png!
> I also tried several collection containers instead of a plain array, but 
> none of them showed enough improvement to justify keeping a cache mechanism 
> (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
> (ms) ||
> |Original (array) | 62736 | 48562 | 41540 |
> |ConcurrentHashMap | 47597 | 30854 | 18271 |
> |ConcurrentHashMap 2nd trial |44036|26895|17443|
> |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
> |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
> |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> |No Cache | 47579 | 32480 | 18337 |
> |No Cache 2nd trial | 46534 | 27670 | 18700 |
> Code that I used for this comparison is 
> [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
>  LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
> Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from 
> FileIndexInfoRetriever to improve performance on large partitions.
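As a sanity check, the 41% and 45% figures quoted above follow from the logged test_13_4G timings (SELECTs 1: 94197 ms to 55112 ms; SELECTs 2: 85151 ms to 46082 ms). A tiny helper that reproduces them; the class and method names are mine, not from the ticket:

```java
// Verifies the latency-reduction percentages claimed in the ticket, using the
// timings from the LargePartitionsTest.test_13_4G log lines (milliseconds).
public class ReductionCheck {
    // Percent reduction from 'before' to 'after', truncated to a whole percent.
    static int percentReduction(long before, long after) {
        return (int) Math.floor(100.0 * (before - after) / before);
    }

    public static void main(String[] args) {
        System.out.println(percentReduction(94197, 55112)); // SELECTs 1: 41
        System.out.println(percentReduction(85151, 46082)); // SELECTs 2: 45
    }
}
```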





[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Description: 
Hi guys.
In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates 
a (potentially very large) IndexInfo array (up to the number of IndexInfo 
entries the RowIndexEntry contains) as a cache on every single read path.

After some experiments using LargePartitionsTest on my MacBook, I got results 
showing that removing the cache improves performance for large partitions as 
shown below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



I also tried several collection containers instead of a plain array, but none 
of them showed enough improvement to justify keeping a cache mechanism (unless 
I made a mistake or overlooked something in this test).

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap | 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd trial |44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache | 47579 | 32480 | 18337 |
|No Cache 2nd trial | 46534 | 27670 | 18700 |

Code that I used for this comparison is 
[here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
Scan is the execution time to iterate through the large partition.

So, in this issue, I'd like to propose removing the IndexInfo cache from 
FileIndexInfoRetriever to improve performance on large partitions.

  was:
Hi guys.
In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates 
a (potentially very large) IndexInfo array (up to the number of IndexInfo in 
the RowIndexEntry has) as a cache in every single read path.

After some experiments using LargePartitionTest on my MacBook, I got results 
that show that removing FileIndexInfoRetriever improves the performance for 
large partitions like below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



Of course, I have attempted to use some collection containers instead of a 
plain array. But I could not recognize great improvement enough to justify 
using these cache mechanism by them. (Unless I did some mistake or overlook 
about this test)

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap | 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd trial |44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache | 47579 | 32480 | 18337 |
|No Cache 2nd trial | 46534 | 27670 | 18700 |

Code that 

[jira] [Commented] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533273#comment-15533273
 ] 

Yasuharu Goto commented on CASSANDRA-12731:
---

Patch is here.
https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Hi guys.
> In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever 
> allocates a (potentially very large) IndexInfo array (up to the number of 
> IndexInfo entries the RowIndexEntry contains) as a cache on every single read 
> path.
> After some experiments using LargePartitionsTest on my MacBook, I got results 
> showing that removing the cache improves performance for large partitions as 
> shown below (latencies reduced by 41% and by 45%).
> {noformat}
> // LargePartitionsTest.test_13_4G with cache by array
> INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 94197 ms
> INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 85151 ms
> // LargePartitionsTest.test_13_4G without cache
> INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
> total=16384M took 55112 ms
> INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
> total=16384M took 46082 ms
> {noformat}
> Code is 
> [here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
>  (based on trunk)
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> with array cache(original)
> !screenshot-1.png!
> Heap memory usage during running LargePartitionsTest (except for 8G test) 
> without cache
> !screenshot-2.png!
> I also tried several collection containers instead of a plain array, but 
> none of them showed enough improvement to justify keeping a cache mechanism 
> (unless I made a mistake or overlooked something in this test).
> || LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
> (ms) ||
> |Original (array) | 62736 | 48562 | 41540 |
> |ConcurrentHashMap | 47597 | 30854 | 18271 |
> |ConcurrentHashMap 2nd trial |44036|26895|17443|
> |LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
> |LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
> |LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
> |LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
> |No Cache | 47579 | 32480 | 18337 |
> |No Cache 2nd trial | 46534 | 27670 | 18700 |
> Code that I used for this comparison is 
> [here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
>  LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
> Scan is the execution time to iterate through the large partition.
> So, in this issue, I'd like to propose removing the IndexInfo cache from 
> FileIndexInfoRetriever to improve performance on large partitions.





[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Description: 
Hi guys.
In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates 
a (potentially very large) IndexInfo array (up to the number of IndexInfo 
entries the RowIndexEntry contains) as a cache on every single read path.

After some experiments using LargePartitionsTest on my MacBook, I got results 
showing that removing the cache improves performance for large partitions as 
shown below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



I also tried several collection containers instead of a plain array, but none 
of them showed enough improvement to justify keeping a cache mechanism (unless 
I made a mistake or overlooked something in this test).

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap | 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd trial |44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache | 47579 | 32480 | 18337 |
|No Cache 2nd trial | 46534 | 27670 | 18700 |

Code that I used for this comparison is 
[here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
Scan is the execution time to iterate through the large partition.

So, in this issue, I'd like to propose removing the IndexInfo cache from 
FileIndexInfoRetriever to improve performance on large partitions.

  was:
Hi guys.
In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates 
a very large IndexInfo array (up to the number of IndexInfo in the 
RowIndexEntry has) as a cache in every single read path.

After some experiments using LargePartitionTest on my MacBook, I got results 
that show that removing FileIndexInfoRetriever improves the performance for 
large partitions like below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



Of course, I have attempted to use some collection containers instead of a 
plain array. But I could not recognize great improvement enough to justify 
using these cache mechanism by them. (Unless I did some mistake or overlook 
about this test)

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap | 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd trial |44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache | 47579 | 32480 | 18337 |
|No Cache 2nd trial | 46534 | 27670 | 18700 |

Code that I used for 

[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Description: 
Hi guys.
In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates 
a very large IndexInfo array (up to the number of IndexInfo entries the 
RowIndexEntry contains) as a cache on every single read path.

After some experiments using LargePartitionsTest on my MacBook, I got results 
showing that removing the cache improves performance for large partitions as 
shown below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



I also tried several collection containers instead of a plain array, but none 
of them showed enough improvement to justify keeping a cache mechanism (unless 
I made a mistake or overlooked something in this test).

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap | 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd trial |44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache | 47579 | 32480 | 18337 |
|No Cache 2nd trial | 46534 | 27670 | 18700 |

Code that I used for this comparison is 
[here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
Scan is the execution time to iterate through the large partition.

So, in this issue, I'd like to propose removing the IndexInfo cache from 
FileIndexInfoRetriever to improve performance on large partitions.

  was:
Hi guys.
In the patch of CASSANDRA-11206 , I found that FileIndexInfoRetriever allocates 
a very large IndexInfo array (up to the number of IndexInfo in the 
RowIndexEntry has) as a cache in every single read path.

After some experiments using LargePartitionTest on my MacBook, I got results 
that show that removing FileIndexInfoRetriever improves the performance for 
large partitions like below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



Of course, I have attempted to use some collection containers instead of a 
plain array. But I could not recognize great improvement enough to justify 
using these cache mechanism by them. (Unless I did some mistake or overlook 
about this test)

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap 1st| 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd|44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache 1st | 47579 | 32480 | 18337 |
|No Cache 2nd | 46534 | 27670 | 18700 |

Code that I used for this comparison is 

[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Description: 
Hi guys.
In the patch for CASSANDRA-11206, I found that FileIndexInfoRetriever allocates 
a very large IndexInfo array (up to the number of IndexInfo entries the 
RowIndexEntry contains) as a cache on every single read path.

After some experiments using LargePartitionsTest on my MacBook, I got results 
showing that removing the cache improves performance for large partitions as 
shown below (latencies reduced by 41% and by 45%).

{noformat}
// LargePartitionsTest.test_13_4G with cache by array
INFO  [main] 2016-09-29 23:11:25,763 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 94197 ms
INFO  [main] 2016-09-29 23:12:50,914 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 85151 ms

// LargePartitionsTest.test_13_4G without cache
INFO  [main] 2016-09-30 00:13:26,050 ?:? - SELECTs 1 for part=4194304k 
total=16384M took 55112 ms
INFO  [main] 2016-09-30 00:14:12,132 ?:? - SELECTs 2 for part=4194304k 
total=16384M took 46082 ms
{noformat}

Code is 
[here|https://github.com/matope/cassandra/commit/86fb910a0e38f7520e1be40fb42f74a692f2ebce]
 (based on trunk)

Heap memory usage during running LargePartitionsTest (except for 8G test) with 
array cache(original)
!screenshot-1.png!
Heap memory usage during running LargePartitionsTest (except for 8G test) 
without cache
!screenshot-2.png!



I also tried several collection containers instead of a plain array, but none 
of them showed enough improvement to justify keeping a cache mechanism (unless 
I made a mistake or overlooked something in this test).

|| LargePartitionsTest.test_12_2G || SELECTs 1 (ms) || SELECTs 2 (ms) || Scan 
(ms) ||
|Original (array) | 62736 | 48562 | 41540 |
|ConcurrentHashMap 1st| 47597 | 30854 | 18271 |
|ConcurrentHashMap 2nd|44036|26895|17443|
|LinkedHashCache (capacity=16, limit=10, fifo) 1st|42668|32165|17323|
|LinkedHashCache (capacity=16, limit=10, fifo) 2nd|48863|28066|18053|
|LinkedHashCache (capacity=16, limit=16, fifo) | 46979 | 29810 | 18620 |
|LinkedHashCache (capacity=16, limit=10, lru) | 46456 | 29749 | 20311 |
|No Cache 1st | 47579 | 32480 | 18337 |
|No Cache 2nd | 46534 | 27670 | 18700 |

Code that I used for this comparison is 
[here|https://github.com/matope/cassandra/commit/e12fcac77f0f46bdf4104ef21c6454bfb2bb92d0].
LinkedHashCache is a simple fifo/lru cache that extends LinkedHashMap.
Scan is the execution time to iterate through the large partition.

So, in this issue, I'd like to propose removing the IndexInfo cache from 
FileIndexInfoRetriever to improve performance on large partitions.

[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Attachment: screenshot-2.png

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Attachment: screenshot-1.png

> Remove IndexInfo cache from FileIndexInfoRetriever.
> ---
>
> Key: CASSANDRA-12731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yasuharu Goto
> Attachments: screenshot-1.png, screenshot-2.png
>
>





[jira] [Updated] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12731:
--
Description: 


> Remove IndexInfo cache from FileIndexInfoRetriever.
> 

[jira] [Created] (CASSANDRA-12731) Remove IndexInfo cache from FileIndexInfoRetriever.

2016-09-29 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-12731:
-

 Summary: Remove IndexInfo cache from FileIndexInfoRetriever.
 Key: CASSANDRA-12731
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12731
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yasuharu Goto







[jira] [Updated] (CASSANDRA-12717) IllegalArgumentException in CompactionTask

2016-09-29 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12717:
--
Summary: IllegalArgumentException in CompactionTask  (was: Fix 
IllegalArgumentException in CompactionTask)

> IllegalArgumentException in CompactionTask
> --
>
> Key: CASSANDRA-12717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12717
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>
> When I ran LargePartitionsTest.test_11_1G on trunk, I found that the test 
> fails due to a java.lang.IllegalArgumentException during compaction.
> This exception apparently happens when the compaction merges a large (>2GB) 
> partition.
> {noformat}
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
> reserve; creating a fresh one
> WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
> partition cql_test_keyspace/table_4:10 (1.004GiB)
> ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception 
> in thread Thread[CompactionExecutor:14,1,main]
> java.lang.IllegalArgumentException: Out of range: 2234434614
> at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
> ~[guava-18.0.jar:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
>  ~[main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
> reserve; creating a fresh one
> {noformat}
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalArgumentException: Out of range: 2540348821
> at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
> at 
> org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
> at 
> org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
> at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> 

[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12717:
--
Status: Patch Available  (was: Open)

> Fix IllegalArgumentException in CompactionTask
> --
>
> Key: CASSANDRA-12717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12717
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>

[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12717:
--
Description: 

[jira] [Comment Edited] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526579#comment-15526579
 ] 

Yasuharu Goto edited comment on CASSANDRA-12717 at 9/27/16 4:15 PM:


Patch is here. Could you please review this?

Fix IllegalArgumentException in CompactionTask
https://github.com/matope/cassandra/commit/d6c40dd3d4d95dba8b9c3f88de1015315e45990d


was (Author: yasuharu):
Patch is here. Could you please review this?

Fix IllegalArgumentException in CompactionTask
https://github.com/matope/cassandra/commit/a9ccd9731e83fdd4148325c9a727b64e4982e2ba
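
For context on the failure itself: the value in the log (2234434614) exceeds 
Integer.MAX_VALUE (2147483647), so any checked narrowing from long to int must 
reject it. The demo below uses java.lang.Math.toIntExact, which fails the same 
way Guava's Ints.checkedCast does for out-of-range values; it is an illustration 
of the overflow, not the actual patch in the linked commit:

```java
public class CheckedCastDemo {
    public static void main(String[] args) {
        // The partition byte count from the compaction log: larger than int can hold.
        long partitionBytes = 2_234_434_614L;
        try {
            // Math.toIntExact rejects values outside int range, like Ints.checkedCast.
            int narrowed = Math.toIntExact(partitionBytes);
            System.out.println("narrowed: " + narrowed);
        } catch (ArithmeticException e) {
            System.out.println("out of int range: " + partitionBytes);
        }
        // The safe direction: keep byte counts as long instead of narrowing to int.
    }
}
```

Running this prints the catch branch, which is exactly the situation 
CompactionTask hit once a merged partition crossed the 2GiB boundary.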

> Fix IllegalArgumentException in CompactionTask
> --
>
> Key: CASSANDRA-12717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12717
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>
> When I was ran LargePartitionsTest.test_11_1G at trunk, I found that this 
> test fails due to a java.lang.IllegalArgumentException during compaction 
> and, eventually fails.
> This exception apparently happens when a compaction generates large sstable.
> {noformat}
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
> reserve; creating a fresh one
> WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
> partition cql_test_keyspace/table_4:10 (1.004GiB)
> ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception 
> in thread Thread[CompactionExecutor:14,1,main]
> java.lang.IllegalArgumentException: Out of range: 2234434614
> at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
> ~[guava-18.0.jar:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
>  ~[main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
> reserve; creating a fresh one
> {noformat}
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalArgumentException: Out of range: 2540348821
> at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
> at 
> org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
> at 
> org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
> at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at 

[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12717:
--
Description: 
When I ran LargePartitionsTest.test_11_1G on trunk, I found that this test 
hits a java.lang.IllegalArgumentException during compaction and 
eventually fails.
This exception apparently happens when a compaction generates a large sstable.

{noformat}
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
reserve; creating a fresh one
WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
partition cql_test_keyspace/table_4:10 (1.004GiB)
ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in 
thread Thread[CompactionExecutor:14,1,main]
java.lang.IllegalArgumentException: Out of range: 2234434614
at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
~[guava-18.0.jar:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
 ~[main/:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_77]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
reserve; creating a fresh one
{noformat}

{noformat}

java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalArgumentException: Out of range: 2540348821

at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
at 
org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
at 

[jira] [Commented] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526579#comment-15526579
 ] 

Yasuharu Goto commented on CASSANDRA-12717:
---

Patch is here. Could you please review this?

Fix IllegalArgumentException in CompactionTask
https://github.com/matope/cassandra/commit/a9ccd9731e83fdd4148325c9a727b64e4982e2ba
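
The failure mode itself is easy to reproduce: Guava's Ints.checkedCast throws IllegalArgumentException whenever a long does not fit in an int, which happens as soon as a value passes Integer.MAX_VALUE (2147483647); the 2234434614 bytes (~2.08 GiB) in the log above is just past that limit. A minimal sketch (the checkedCast body below mirrors Guava's implementation rather than importing the library):

```java
public class CheckedCastDemo {
    // Mirrors Guava's Ints.checkedCast(long): a narrowing cast that throws
    // instead of silently wrapping when the value exceeds the int range.
    static int checkedCast(long value) {
        int result = (int) value;
        if (result != value) {
            throw new IllegalArgumentException("Out of range: " + value);
        }
        return result;
    }

    public static void main(String[] args) {
        // The sstable length from the log above exceeds Integer.MAX_VALUE.
        long sstableBytes = 2234434614L;
        try {
            checkedCast(sstableBytes);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints: Out of range: 2234434614
        }
    }
}
```

Any code path that funnels an sstable length through an int hits this once a single table crosses 2 GiB, which suggests keeping such sizes as long throughout.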

> Fix IllegalArgumentException in CompactionTask
> --
>
> Key: CASSANDRA-12717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12717
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>
> When I ran LargePartitionsTest.test_11_1G on trunk, I found that this 
> test fails due to a java.lang.IllegalArgumentException during compaction
> {noformat}
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
> reserve; creating a fresh one
> WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
> partition cql_test_keyspace/table_4:10 (1.004GiB)
> ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception 
> in thread Thread[CompactionExecutor:14,1,main]
> java.lang.IllegalArgumentException: Out of range: 2234434614
> at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
> ~[guava-18.0.jar:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
>  ~[main/:na]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[main/:na]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
>  ~[main/:na]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
> reserve; creating a fresh one
> DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
> reserve; creating a fresh one
> {noformat}
> and eventually fails.
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.IllegalArgumentException: Out of range: 2540348821
> at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
> at 
> org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
> at 
> org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
> at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
> at 
> org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> 

[jira] [Updated] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-12717:
--
Description: 
When I ran LargePartitionsTest.test_11_1G on trunk, I found that this test 
fails due to a java.lang.IllegalArgumentException during compaction

{noformat}
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
reserve; creating a fresh one
WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
partition cql_test_keyspace/table_4:10 (1.004GiB)
ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in 
thread Thread[CompactionExecutor:14,1,main]
java.lang.IllegalArgumentException: Out of range: 2234434614
at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
~[guava-18.0.jar:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
 ~[main/:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_77]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
reserve; creating a fresh one
{noformat}

and eventually fails.

{noformat}

java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalArgumentException: Out of range: 2540348821

at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
at 
org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
at 

[jira] [Created] (CASSANDRA-12717) Fix IllegalArgumentException in CompactionTask

2016-09-27 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-12717:
-

 Summary: Fix IllegalArgumentException in CompactionTask
 Key: CASSANDRA-12717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12717
 Project: Cassandra
  Issue Type: Bug
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto


When I ran LargePartitionsTest.test_11_1G, I found that this test fails due 
to a java.lang.IllegalArgumentException during compaction

{noformat}
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,074 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:48,437 ?:? - No segments in 
reserve; creating a fresh one
WARN  [CompactionExecutor:14] 2016-09-28 00:32:48,463 ?:? - Writing large 
partition cql_test_keyspace/table_4:10 (1.004GiB)
ERROR [CompactionExecutor:14] 2016-09-28 00:32:49,734 ?:? - Fatal exception in 
thread Thread[CompactionExecutor:14,1,main]
java.lang.IllegalArgumentException: Out of range: 2234434614
at com.google.common.primitives.Ints.checkedCast(Ints.java:91) 
~[guava-18.0.jar:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:206)
 ~[main/:na]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:267)
 ~[main/:na]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_77]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_77]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:49,909 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,148 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,385 ?:? - No segments in 
reserve; creating a fresh one
DEBUG [COMMIT-LOG-ALLOCATOR] 2016-09-28 00:32:50,620 ?:? - No segments in 
reserve; creating a fresh one
{noformat}

and eventually fails.

{noformat}

java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.IllegalArgumentException: Out of range: 2540348821

at org.apache.cassandra.utils.Throwables.maybeFail(Throwables.java:51)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:393)
at 
org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:695)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2066)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:2061)
at org.apache.cassandra.cql3.CQLTester.compact(CQLTester.java:426)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.lambda$withPartitionSize$2(LargePartitionsTest.java:92)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.measured(LargePartitionsTest.java:50)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.withPartitionSize(LargePartitionsTest.java:90)
at 
org.apache.cassandra.io.sstable.LargePartitionsTest.test_11_1G(LargePartitionsTest.java:198)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
com.intellij.junit4.JUnit4TestRunnerUtil$IgnoreIgnoredTestJUnit4ClassRunner.runChild(JUnit4TestRunnerUtil.java:358)
at 

[jira] [Commented] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session

2016-05-18 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1523#comment-1523
 ] 

Yasuharu Goto commented on CASSANDRA-11425:
---

[~snazy] Thank you very much for the modifications and the merge!
I'll be more careful with unit tests next time.
I'm interested in CASSANDRA-11719, but someone has already picked it up, so I'm 
going to watch the ticket.

Thanks.

> Add prepared query parameter to trace for "Execute CQL3 prepared query" 
> session
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 3.8
>
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> So, in this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this.
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session

2016-03-29 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216098#comment-15216098
 ] 

Yasuharu Goto commented on CASSANDRA-11425:
---

[~thobbs] Thanks for your response!

I'm concerned about the memory consumption too. I'm not sure whether we should 
trim queries to save memory.
For good memory management, I think we might have to include the query string in 
the EntryWeigher.weightOf() calculation (and increase MAX_CACHE_PREPARED_MEMORY?).
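
To make the suggestion concrete, here is a hedged sketch; the PreparedEntry class and the byte estimates are illustrative assumptions, not Cassandra's actual types. The idea is simply that a cache weigher should count the retained query string's footprint in addition to whatever the statement itself weighs, so the MAX_CACHE_PREPARED_MEMORY budget stays honest:

```java
public class WeigherSketch {
    // Hypothetical cache entry: a prepared statement plus the raw query
    // string retained for tracing (names are illustrative, not Cassandra's).
    static class PreparedEntry {
        final String query;
        final long statementWeight; // whatever the existing weigher reported
        PreparedEntry(String query, long statementWeight) {
            this.query = query;
            this.statementWeight = statementWeight;
        }
    }

    // Include the retained query string in the weight: roughly 2 bytes per
    // char for a Java String plus a rough object-header allowance.
    static long weightOf(PreparedEntry e) {
        long stringBytes = 40 + 2L * e.query.length();
        return e.statementWeight + stringBytes;
    }

    public static void main(String[] args) {
        PreparedEntry e = new PreparedEntry(
                "SELECT * FROM test.test2 WHERE id=? LIMIT 1", 1024);
        System.out.println(weightOf(e)); // prints: 1150
    }
}
```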

> Add prepared query parameter to trace for "Execute CQL3 prepared query" 
> session
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> So, in this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this.
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.





[jira] [Updated] (CASSANDRA-11425) Add prepared query parameter to session trace for "Execute prepared CQL3 Query"

2016-03-24 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-11425:
--
Summary: Add prepared query parameter to session trace for "Execute 
prepared CQL3 Query"  (was: Add prepared statement on Execute prepared query 
session trace.)

> Add prepared query parameter to session trace for "Execute prepared CQL3 
> Query"
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> So, in this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this.
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.





[jira] [Updated] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session

2016-03-24 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-11425:
--
Summary: Add prepared query parameter to trace for "Execute CQL3 prepared 
query" session  (was: Add prepared query parameter to session trace for 
"Execute prepared CQL3 Query")

> Add prepared query parameter to trace for "Execute CQL3 prepared query" 
> session
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> So, in this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this.
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.

2016-03-24 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-11425:
--
Comment: was deleted

(was: My patch is here.
https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk)

> Add prepared statement on Execute prepared query session trace.
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> In this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this:
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.





[jira] [Updated] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.

2016-03-24 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-11425:
--
Status: Patch Available  (was: Open)

My patch is here.
https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk

> Add prepared statement on Execute prepared query session trace.
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> In this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this:
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.





[jira] [Commented] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.

2016-03-24 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210265#comment-15210265
 ] 

Yasuharu Goto commented on CASSANDRA-11425:
---

My patch is here.
https://github.com/apache/cassandra/compare/trunk...matope:11425-trunk

> Add prepared statement on Execute prepared query session trace.
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show any information about the prepared query executed in the session, so 
> we can't see which query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> In this ticket, I'd like to add the prepared query as a parameter to the 
> Execute session trace, like this:
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, the "Execute CQL3 prepared query" session displays its query.
> I believe this additional information would help operators a lot.





[jira] [Created] (CASSANDRA-11425) Add prepared statement on Execute prepared query session trace.

2016-03-24 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-11425:
-

 Summary: Add prepared statement on Execute prepared query session 
trace.
 Key: CASSANDRA-11425
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
 Project: Cassandra
  Issue Type: Improvement
  Components: CQL
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor


Currently, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
not show any information about the prepared query executed in the session, so we 
can't see which query the session is executing.
I think this makes performance tuning on Cassandra difficult.

In this ticket, I'd like to add the prepared query as a parameter to the 
Execute session trace, like this:

{noformat}
cqlsh:system_traces> select * from sessions ;

 session_id   | client| command | coordinator | 
duration | parameters   

| request | started_at
--+---+-+-+--+--+-+-
 a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT 
* FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | 
Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
 a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 109 |  
   {'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.998000+
 a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 126 |  
 {'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.996000+
 a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT 
* FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | 
Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
 a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 'INSERT 
INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 'SERIAL'} 
| Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
{noformat}

Now, the "Execute CQL3 prepared query" session displays its query.
I believe this additional information would help operators a lot.
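The actual change is a Java-side patch to Cassandra's tracing code; purely as an illustrative sketch (all names below are hypothetical, not from the patch), the idea amounts to merging the prepared statement's CQL string into the parameters map that the trace session already records:

```python
def trace_parameters(options, prepared_query=None):
    """Build the parameters map shown in system_traces.sessions.

    `options` mirrors the execution options that were traced before this
    ticket (consistency level, page size, ...); `prepared_query` is the CQL
    string of the prepared statement, which this ticket adds to the map.
    """
    params = {
        'consistency_level': options['consistency_level'],
        'page_size': str(options['page_size']),
        'serial_consistency_level': options['serial_consistency_level'],
    }
    if prepared_query is not None:
        # New behaviour: expose the prepared CQL in the Execute trace row,
        # matching the 'query' key the Preparing trace already shows.
        params['query'] = prepared_query
    return params


params = trace_parameters(
    {'consistency_level': 'ONE', 'page_size': 5000,
     'serial_consistency_level': 'SERIAL'},
    prepared_query='SELECT * FROM test.test2 WHERE id=? LIMIT 1',
)
```

With a map built this way, an Execute row carries the same 'query' key as the corresponding Preparing row, so the two can be correlated in system_traces.sessions.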





[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2016-01-05 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082979#comment-15082979
 ] 

Yasuharu Goto commented on CASSANDRA-10875:
---

Oh, I'd missed the commit.
Thank you for the review and merge, [~pauloricardomg] [~snazy]!

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.13, 2.2.5, 3.0.3
>
> Attachments: 10875-2.1-2.txt, 10875-2.1-3.txt, 10875-2.1.12.txt, 
> 10875-2.2.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-23 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: 10875-2.1-2.txt

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, 
> 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-23 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: 10875-2.2.txt

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, 
> 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-23 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070499#comment-15070499
 ] 

Yasuharu Goto commented on CASSANDRA-10875:
---

[~pauloricardomg] Thank you for your great review! (And sorry for the late reply.)
I've updated my patches as 10875-2.1-2.txt and 10875-2.2.txt.
I was able to merge 10875-2.2 into 2.2, 3.0, and trunk.

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 10875-2.1-2.txt, 10875-2.1.12.txt, 10875-2.2.txt, 
> 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Commented] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-17 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063294#comment-15063294
 ] 

Yasuharu Goto commented on CASSANDRA-10875:
---

Thank you for your response!

I hadn't noticed the --encoding option, so I checked --help and tried it out.

In Cassandra 2.1.9, cqlsh doesn't have an --encoding option.
{noformat}

$ cqlsh --help
Usage: cqlsh [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version show program's version number and exit
  -h, --helpshow this help message and exit
  -C, --color   Always use color output
  --no-colorNever use color output
  -u USERNAME, --username=USERNAME
Authenticate as user.
  -p PASSWORD, --password=PASSWORD
Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY
Use the provided Thrift transport factory function.
  --debug   Show additional debugging information
  --cqlversion=CQLVERSION
Specify a particular CQL version (default: 3).
Examples: "2", "3.0.0-beta1"
  -2, --cql2Shortcut notation for --cqlversion=2
  -3, --cql3Shortcut notation for --cqlversion=3

Connects to localhost:9160 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.

$ cqlsh --encode=utf8
Usage: cqlsh [options] [host [port]]

cqlsh: error: no such option: --encode
{noformat}

In Cassandra 3.0.0, cqlsh has it, but the help says the default encoding is already UTF-8.
{noformat}
./cqlsh --help
Usage: cqlsh.py [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version show program's version number and exit
  -h, --helpshow this help message and exit
  -C, --color   Always use color output
  --no-colorNever use color output
  --ssl Use SSL
  -u USERNAME, --username=USERNAME
Authenticate as user.
  -p PASSWORD, --password=PASSWORD
Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  --debug   Show additional debugging information
  --encoding=ENCODING   Specify a non-default encoding for output.  If you are
experiencing problems with unicode characters, using
utf8 may fix the problem. (Default from system
preferences: UTF-8)
  --cqlshrc=CQLSHRC Specify an alternative cqlshrc file location.
  --cqlversion=CQLVERSION
Specify a particular CQL version (default: 3.3.1).
Examples: "3.0.3", "3.1.0"
  -e EXECUTE, --execute=EXECUTE
Execute the statement and quit.
  --connect-timeout=CONNECT_TIMEOUT
Specify the connection timeout in seconds (default: 5
seconds).

Connects to 127.0.0.1:9042 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.
{noformat}

But cqlsh --encoding=utf8 doesn't seem to work correctly.

{noformat}
./cqlsh --encoding=utf8
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.0 | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> select * from test where id='日本語';
'ascii' codec can't decode byte 0xe6 in position 29: ordinal not in range(128)
cqlsh> 
{noformat}
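For context, the error above is the classic Python 2 default-codec failure: bytes containing UTF-8 end up decoded with the 'ascii' codec unless an encoding is passed explicitly. A minimal reproduction of that failure mode, independent of cqlsh (illustrative only; the attached patches address the same issue inside cqlsh itself):

```python
raw = '日本語'.encode('utf-8')  # the bytes as they arrive off the wire

# Decoding with the ascii codec fails exactly like the cqlsh error above:
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    # "'ascii' codec can't decode byte 0xe6 in position 0: ..."
    ascii_failed = True

# Decoding explicitly as UTF-8 recovers the original text:
text = raw.decode('utf-8')
```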

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.13, 3.1
>
> Attachments: 10875-2.1.12.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text 

[jira] [Comment Edited] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-17 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063294#comment-15063294
 ] 

Yasuharu Goto edited comment on CASSANDRA-10875 at 12/18/15 2:14 AM:
-

Thank you for your response!

I hadn't noticed the --encoding option, so I checked --help and tried it out.

In Cassandra 2.1.9, cqlsh doesn't have an --encoding option.
{noformat}

$ cqlsh --help
Usage: cqlsh [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version show program's version number and exit
  -h, --helpshow this help message and exit
  -C, --color   Always use color output
  --no-colorNever use color output
  -u USERNAME, --username=USERNAME
Authenticate as user.
  -p PASSWORD, --password=PASSWORD
Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY
Use the provided Thrift transport factory function.
  --debug   Show additional debugging information
  --cqlversion=CQLVERSION
Specify a particular CQL version (default: 3).
Examples: "2", "3.0.0-beta1"
  -2, --cql2Shortcut notation for --cqlversion=2
  -3, --cql3Shortcut notation for --cqlversion=3

Connects to localhost:9160 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.

$ cqlsh --encode=utf8
Usage: cqlsh [options] [host [port]]

cqlsh: error: no such option: --encode
{noformat}

In Cassandra 3.0.0, cqlsh has it, but the help says the default encoding is already UTF-8.
{noformat}
./cqlsh --help
Usage: cqlsh.py [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version show program's version number and exit
  -h, --helpshow this help message and exit
  -C, --color   Always use color output
  --no-colorNever use color output
  --ssl Use SSL
  -u USERNAME, --username=USERNAME
Authenticate as user.
  -p PASSWORD, --password=PASSWORD
Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  --debug   Show additional debugging information
  --encoding=ENCODING   Specify a non-default encoding for output.  If you are
experiencing problems with unicode characters, using
utf8 may fix the problem. (Default from system
preferences: UTF-8)
  --cqlshrc=CQLSHRC Specify an alternative cqlshrc file location.
  --cqlversion=CQLVERSION
Specify a particular CQL version (default: 3.3.1).
Examples: "3.0.3", "3.1.0"
  -e EXECUTE, --execute=EXECUTE
Execute the statement and quit.
  --connect-timeout=CONNECT_TIMEOUT
Specify the connection timeout in seconds (default: 5
seconds).

Connects to 127.0.0.1:9042 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.
{noformat}

Furthermore, cqlsh --encoding=utf8 doesn't seem to work correctly.

{noformat}
./cqlsh --encoding=utf8
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.0 | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> select * from test where id='日本語';
'ascii' codec can't decode byte 0xe6 in position 29: ordinal not in range(128)
cqlsh> 
{noformat}


was (Author: yasuharu):
Thank you for your response!

I didn't notice --encoding option.
I checked --help and --encoding option.

In cassandra-2.1.9, cqlsh doesn't have --encoding option.
{noformat}

$ cqlsh --help
Usage: cqlsh [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version show program's version number and exit
  -h, --helpshow this help message and exit
  -C, --color   Always use color output
  --no-colorNever use color output
  -u USERNAME, --username=USERNAME
Authenticate as user.
  -p PASSWORD, --password=PASSWORD
Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY
   

[jira] [Updated] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-16 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Summary: cqlsh fails to decode utf-8 characters for text typed columns.  
(was: cqlsh decodes text column values as ascii in SELECT statements.)

> cqlsh fails to decode utf-8 characters for text typed columns.
> --
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.13, 3.1
>
> Attachments: 10875-2.1.12.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Summary: cqlsh decodes text column values as ascii in SELECT clause.  (was: 
cqlsh decodes text as ascii in SELECT clause.)

> cqlsh decodes text column values as ascii in SELECT clause.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Created] (CASSANDRA-10875) cqlsh decodes text as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-10875:
-

 Summary: cqlsh decodes text as ascii in SELECT clause.
 Key: CASSANDRA-10875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor


Hi, we've found a bug where cqlsh can't handle unicode text in select conditions, 
even for text-typed columns.

{noformat}
$ ./bin/cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
cqlsh> create table test.test(txt text primary key);
cqlsh> insert into test.test (txt) values('日本語');
cqlsh> select * from test.test where txt='日本語';
'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
cqlsh> 
{noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: 10875-3.1.txt

> cqlsh decodes text column values as ascii in SELECT clause.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions, even for text-typed columns.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: (was: 10875-3.1.txt)

> cqlsh decodes text column values as ascii in SELECT clause.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> Hi, we've found a bug: cqlsh can't handle unicode text in SELECT 
> conditions even when the column is of text type.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: 10875-3.1.txt

> cqlsh decodes text column values as ascii in SELECT clause.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Attachments: 10875-2.1.12.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug: cqlsh can't handle unicode text in SELECT 
> conditions even when the column is of text type.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT clause.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Attachment: 10875-2.1.12.txt

> cqlsh decodes text column values as ascii in SELECT clause.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Attachments: 10875-2.1.12.txt
>
>
> Hi, we've found a bug: cqlsh can't handle unicode text in SELECT 
> conditions even when the column is of text type.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Updated] (CASSANDRA-10875) cqlsh decodes text column values as ascii in SELECT statements.

2015-12-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-10875:
--
Summary: cqlsh decodes text column values as ascii in SELECT statements.  
(was: cqlsh decodes text column values as ascii in SELECT clause.)

> cqlsh decodes text column values as ascii in SELECT statements.
> ---
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
> Fix For: 2.1.13, 3.1
>
> Attachments: 10875-2.1.12.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug: cqlsh can't handle unicode text in SELECT 
> conditions even when the column is of text type.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}





[jira] [Commented] (CASSANDRA-9779) Append-only optimization

2015-11-01 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984278#comment-14984278
 ] 

Yasuharu Goto commented on CASSANDRA-9779:
--

How about a WITH INSERTS ONLY option for individual columns?

In our use case, we have both mutable and immutable columns in the same table, 
and we currently index only the immutable columns by hand.
We'd be happy if this optimization could be applied to our app.

> Append-only optimization
> 
>
> Key: CASSANDRA-9779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9779
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
> Fix For: 3.x
>
>
> Many common workloads are append-only: that is, they insert new rows but do 
> not update existing ones.  However, Cassandra has no way to infer this and so 
> it must treat all tables as if they may experience updates in the future.
> If we added syntax to tell Cassandra about this ({{WITH INSERTS ONLY}} for 
> instance) then we could do a number of optimizations:
> - Compaction would only need to worry about defragmenting partitions, not 
> rows.  We could default to DTCS or similar.
> - CollationController could stop scanning sstables as soon as it finds a 
> matching row
> - Most importantly, materialized views wouldn't need to worry about deleting 
> prior values, which would eliminate the majority of the MV overhead
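To make the CollationController point concrete, here is a hypothetical sketch (function and data layout invented purely for illustration) of why an append-only guarantee lets a read stop at the first sstable containing the key, while a mutable table must merge row fragments from every sstable:

```python
def read_row(key, sstables, append_only):
    """Look up a row. `sstables` is ordered newest-first; each models an
    sstable as a dict mapping key -> {column: value}."""
    merged = {}
    for sstable in sstables:
        row = sstable.get(key)
        if row is None:
            continue
        if append_only:
            # No later write can have touched this row, so the first
            # (newest) hit is the complete row: stop scanning immediately.
            return dict(row)
        # Mutable tables must keep scanning and merge fragments,
        # with the newest value for each column winning.
        for col, val in row.items():
            merged.setdefault(col, val)
    return merged or None

# Newest sstable first: an update to column 'a' sits above the original insert.
sstables = [{"k1": {"a": 2}}, {"k1": {"a": 1, "b": 9}}]
print(read_row("k1", sstables, append_only=False))  # {'a': 2, 'b': 9}
print(read_row("k1", sstables, append_only=True))   # {'a': 2}
```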





[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-08-19 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703391#comment-14703391
 ] 

Yasuharu Goto commented on CASSANDRA-9898:
--

Ping [~carlyeks].
What should I do as the next step?

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x, 2.2.x

 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {noformat}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: 
 ordinal not in range(128)
 {noformat}
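The traceback shows the script file being read through a `codecs` StreamReader whose codec effectively defaults to ASCII. A small sketch of both the failure and the obvious remedy (illustrative; cqlsh's actual wiring differs):

```python
import codecs
import io

# The same bytes as the repro script: a UTF-8 comment line, then CQL.
data = "// 日本語のコメント\nuse system;\n".encode("utf-8")

# Reading through an ASCII StreamReader reproduces the crash above:
try:
    codecs.getreader("ascii")(io.BytesIO(data)).readline()
except UnicodeDecodeError as exc:
    print(exc)  # 'ascii' codec can't decode byte 0xe6 ...

# Constructing the reader with the file's real encoding fixes it:
reader = codecs.getreader("utf-8")(io.BytesIO(data))
print(reader.readline())  # the comment line, decoded correctly
```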





[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-08-04 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-9898:
-
Description: 
cqlsh crashes when it loads a CQL script file encoded in UTF-8.

This is a reproduction procedure.

{noformat}
$cat ./test.cql
// 日本語のコメント
use system;
select * from system.peers;

$cqlsh --version
cqlsh 5.0.1

$cqlsh -f ./test.cql
Traceback (most recent call last):
  File ./cqlsh, line 2459, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 2451, in main
shell.cmdloop()
  File ./cqlsh, line 940, in cmdloop
line = self.get_input_line(self.prompt)
  File ./cqlsh, line 909, in get_input_line
self.lastcmd = self.stdin.readline()
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 675, in readline
return self.reader.readline(size)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 530, in readline
data = self.read(readsize, firstline=True)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 477, in read
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: ordinal 
not in range(128)
{noformat}

  was:
cqlsh crashes when it load a cql script file encoded in utf-8.

This is a reproduction procedure.

{noformat}
$cat ./test.cql
// 日本語のコメント
use system;
select * from system.peers;

$cqlsh --version
cqlsh 5.0.1

$cqlsh -f ./test.cql
Traceback (most recent call last):
  File ./cqlsh, line 2459, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 2451, in main
shell.cmdloop()
  File ./cqlsh, line 940, in cmdloop
line = self.get_input_line(self.prompt)
  File ./cqlsh, line 909, in get_input_line
self.lastcmd = self.stdin.readline()
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 675, in readline
return self.reader.readline(size)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 530, in readline
data = self.read(readsize, firstline=True)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 477, in read
newchars, decodedbytes = self.decode(data, self.errors)
{noformat}


 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x, 2.2.x

 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {noformat}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: 
 ordinal not in range(128)
 {noformat}





[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-08-04 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654600#comment-14654600
 ] 

Yasuharu Goto commented on CASSANDRA-9898:
--

Oops, I hadn't pasted the last line of my error log. I've updated the description.

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x, 2.2.x

 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {noformat}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: 
 ordinal not in range(128)
 {noformat}





[jira] [Comment Edited] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-08-04 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654651#comment-14654651
 ] 

Yasuharu Goto edited comment on CASSANDRA-9898 at 8/5/15 1:33 AM:
--

Hmm, I had seen the ticket, but at the time I thought it was a different issue 
from mine because the error logs looked so different.
Now I agree with you: in my brief test, their repro case appears to be fixed 
by my patch.


was (Author: yasuharu):
Hmm, I've seen the ticket, but then I thought it's different issue with mine 
because their error log looks so different.
But now I agree with you. their repro code looks get fixed by my patch in my 
brief test.

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x, 2.2.x

 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {noformat}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: 
 ordinal not in range(128)
 {noformat}





[jira] [Commented] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-08-04 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654651#comment-14654651
 ] 

Yasuharu Goto commented on CASSANDRA-9898:
--

Hmm, I had seen the ticket, but at the time I thought it was a different issue 
from mine because the error logs looked so different.
Now I agree with you: in my brief test, their repro case appears to be fixed 
by my patch.

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x, 2.2.x

 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {noformat}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 3: 
 ordinal not in range(128)
 {noformat}





[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-07-25 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-9898:
-
Attachment: cassandra-2.2-9898.txt

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Minor
 Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {quote}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 {quote}





[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-07-25 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-9898:
-
Assignee: Yuki Morishita

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Assignee: Yuki Morishita
Priority: Minor
 Attachments: cassandra-2.1-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {quote}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 {quote}





[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-07-24 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-9898:
-
   Attachment: cassandra-2.1-9898.txt
  Environment: linux, os x yosemite.
Reproduced In: 2.1.8, 2.2.0 rc2  (was: 2.2.0 rc2, 2.1.8)

 cqlsh crashes if it load a utf-8 file.
 --

 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: linux, os x yosemite.
Reporter: Yasuharu Goto
Priority: Minor
 Attachments: cassandra-2.1-9898.txt


 cqlsh crashes when it loads a CQL script file encoded in UTF-8.
 This is a reproduction procedure.
 {quote}
 $cat ./test.cql
 // 日本語のコメント
 use system;
 select * from system.peers;
 $cqlsh --version
 cqlsh 5.0.1
 $cqlsh -f ./test.cql
 Traceback (most recent call last):
   File ./cqlsh, line 2459, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2451, in main
 shell.cmdloop()
   File ./cqlsh, line 940, in cmdloop
 line = self.get_input_line(self.prompt)
   File ./cqlsh, line 909, in get_input_line
 self.lastcmd = self.stdin.readline()
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 675, in readline
 return self.reader.readline(size)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 530, in readline
 data = self.read(readsize, firstline=True)
   File 
 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
  line 477, in read
 newchars, decodedbytes = self.decode(data, self.errors)
 {quote}





[jira] [Created] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-07-24 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-9898:


 Summary: cqlsh crashes if it load a utf-8 file.
 Key: CASSANDRA-9898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Yasuharu Goto
Priority: Minor


cqlsh crashes when it loads a CQL script file encoded in UTF-8.

This is a reproduction procedure.

{quote}
$cat ./test.cql
// 日本語のコメント
use system;
select * from system.peers;

$cqlsh --version
cqlsh 5.0.1

$cqlsh -f ./test.cql
Traceback (most recent call last):
  File ./cqlsh, line 2459, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 2451, in main
shell.cmdloop()
  File ./cqlsh, line 940, in cmdloop
line = self.get_input_line(self.prompt)
  File ./cqlsh, line 909, in get_input_line
self.lastcmd = self.stdin.readline()
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 675, in readline
return self.reader.readline(size)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 530, in readline
data = self.read(readsize, firstline=True)
  File 
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py,
 line 477, in read
newchars, decodedbytes = self.decode(data, self.errors)
{quote}





[jira] [Commented] (CASSANDRA-7469) RejectedExecutionException causes orphan SSTables

2014-06-30 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048407#comment-14048407
 ] 

Yasuharu Goto commented on CASSANDRA-7469:
--

I agree with you. I'm going to upgrade my cluster.

 RejectedExecutionException causes orphan SSTables
 -

 Key: CASSANDRA-7469
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7469
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Yasuharu Goto
Priority: Minor

 I noticed that some old SSTables are not deleted and remain in the data dir. 
 They are never compacted.
 {code}
  ./ks2-cf2-he-9690-Data.db
  ./ks2-cf2-he-9691-Data.db
  ./ks2-cf2-he-9679-Data.db- current version id
  ./ks2-cf2-he-205-Data.db- very old version id
  ./ks2-cf2-he-201-Data.db
  ./ks2-cf2-he-202-Data.db
  ./ks2-cf2-he-203-Data.db
 {code}
 And I noticed that a RejectedExecutionException causes these orphan 
 SSTables.
 {code}
 ...
  INFO 18:51:45,323 DRAINING: starting drain process
  INFO 18:51:45,324 Stop listening to thrift clients
 ...
  # This compaction was not finished. It was terminated by the following 
 exception and never retried, so these SSTables are never deleted.
  INFO 18:51:46,512 Compacting 
 [SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-205-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-203-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-202-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-201-Data.db')]
 ...
 # This compaction finished, so these SSTables don't become orphans.
  INFO 18:51:46,641 Compacting 
 [SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-90-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-89-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-88-Data.db'), 
 SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-87-Data.db')]
  INFO 18:51:46,736 Compacted to 
 [/var/cassandra/data/ks1/cf1/ks1-cf1-he-91-Data.db,].  370,606 to 317,566 
 (~85% of original) bytes for 193 keys at 3.187943MB/s.  Time: 95ms.
  INFO 18:51:46,836 DRAINED
 ERROR 18:51:49,807 Exception in thread Thread[CompactionExecutor:1927,1,RMI 
 Runtime]
 java.util.concurrent.RejectedExecutionException: Task 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32b5a2c6 
 rejected from 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@32d18f2c[Terminated,
  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 3043]
 at 
 java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2013)
 at 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:629)
 at 
 org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:67)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:806)
 at 
 org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:358)
 at 
 org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:330)
 at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:324)
 at 
 org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:253)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:992)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
  INFO 18:52:54,010 Cassandra shutting down...
 {code}
 As a result of a log survey, we found some orphan SSTables caused by 
 RejectedExecutionException.
 Maybe I can fix each orphan file with nodetool refresh.
 But I'd like to ask if this is a problem that has already been solved in 
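The race in the log above has a direct analogue in Python's executors, sketched here purely for illustration: once DRAIN has terminated the pool, the sstable-deletion task that a finishing compaction tries to schedule is rejected, so the compacted-away files stay on disk.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)
executor.shutdown(wait=True)  # DRAIN: the deletion executor is terminated

# A compaction finishing after the drain tries to schedule its delete task;
# the submission is rejected (RuntimeError is Python's analogue of Java's
# RejectedExecutionException), and the old sstable is never removed.
try:
    executor.submit(print, "deleting ks2-cf2-he-205-Data.db")
except RuntimeError as exc:
    print(exc)  # cannot schedule new futures after shutdown
```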
 

[jira] [Created] (CASSANDRA-7469) RejectedExecutionException causes orphan SSTables

2014-06-29 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-7469:


 Summary: RejectedExecutionException causes orphan SSTables
 Key: CASSANDRA-7469
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7469
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Yasuharu Goto
Priority: Minor


I noticed that some old SSTables are not deleted and remain in the data dir. 
They are never compacted.
{code}
 ./ks2-cf2-he-9690-Data.db
 ./ks2-cf2-he-9691-Data.db
 ./ks2-cf2-he-9679-Data.db- current version id
 ./ks2-cf2-he-205-Data.db- very old version id
 ./ks2-cf2-he-201-Data.db
 ./ks2-cf2-he-202-Data.db
 ./ks2-cf2-he-203-Data.db
{code}

And I noticed that a RejectedExecutionException causes these orphan SSTables.
{code}
...
 INFO 18:51:45,323 DRAINING: starting drain process
 INFO 18:51:45,324 Stop listening to thrift clients

...

 # This compaction was not finished. It was terminated by the following exception and 
never retried, so these SSTables are never deleted.

 INFO 18:51:46,512 Compacting 
[SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-205-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-203-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-202-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks2/cf2/ks2-cf2-he-201-Data.db')]

...

# This compaction is finished. They don't get to be orphans.

 INFO 18:51:46,641 Compacting 
[SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-90-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-89-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-88-Data.db'), 
SSTableReader(path='/var/cassandra/data/ks1/cf1/ks1-cf1-he-87-Data.db')]
 INFO 18:51:46,736 Compacted to 
[/var/cassandra/data/ks1/cf1/ks1-cf1-he-91-Data.db,].  370,606 to 317,566 (~85% 
of original) bytes for 193 keys at 3.187943MB/s.  Time: 95ms.


 INFO 18:51:46,836 DRAINED
ERROR 18:51:49,807 Exception in thread Thread[CompactionExecutor:1927,1,RMI Runtime]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@32b5a2c6 rejected from org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor@32d18f2c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 3043]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2013)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:629)
at org.apache.cassandra.io.sstable.SSTableDeletingTask.schedule(SSTableDeletingTask.java:67)
at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:806)
at org.apache.cassandra.db.DataTracker.removeOldSSTablesSize(DataTracker.java:358)
at org.apache.cassandra.db.DataTracker.postReplace(DataTracker.java:330)
at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:324)
at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:253)
at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:992)
at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:200)
at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
 INFO 18:52:54,010 Cassandra shutting down...
{code}

As a result of a log survey, we found some orphan SSTables caused by 
RejectedExecutionException.

Maybe I can fix each orphan file with nodetool refresh, but I'd like to ask 
whether this is a problem that has already been solved in an earlier release.
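The failure mode can be reproduced outside Cassandra: once a ScheduledThreadPoolExecutor has been shut down, a late schedule() call (like SSTableDeletingTask.schedule() during drain) is rejected and the work is silently lost. A minimal, self-contained sketch — not Cassandra code, class and task names are illustrative:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RejectAfterShutdown {
    public static void main(String[] args) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);

        // Simulate drain: the executor terminates before the deletion task arrives.
        executor.shutdown();

        try {
            // With the default AbortPolicy, scheduling on a terminated pool
            // throws RejectedExecutionException; the task never runs, so the
            // "SSTable" it would have deleted is left behind.
            executor.schedule(
                () -> System.out.println("deleting sstable"), 0, TimeUnit.SECONDS);
        } catch (RejectedExecutionException e) {
            System.out.println("task rejected, sstable deletion lost");
        }
    }
}
```

The unchecked exception propagates out of the compaction thread in the log above; nothing re-queues the deletion, which matches the "never retried" behaviour observed.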



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-16 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999554#comment-13999554
 ] 

Yasuharu Goto commented on CASSANDRA-7210:
--

[~mishail] Thank you for your review and commit!

 Add --resolve-ip option on 'nodetool ring'
 --

 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial
 Fix For: 2.0.9, 2.1 rc1

 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, 
 trunk-7210.txt


 Give nodetool ring the option of either displaying IPs or hostnames for the 
 nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-16 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-7210:
-

Attachment: trunk-7210-2.txt

[~mishail] Oops, I've fixed it.

 Add --resolve-ip option on 'nodetool ring'
 --

 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial
 Fix For: 2.0.9, 2.1 rc1

 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, 
 trunk-7210.txt


 Give nodetool ring the option of either displaying IPs or hostnames for the 
 nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-7210:
-

Attachment: 2.0-7210.txt

Thanks. I think 2.0-7210.txt applies correctly to 2.0.

 Add --resolve-ip option on 'nodetool ring'
 --

 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial
 Fix For: 2.0.9

 Attachments: 2.0-7210.txt, trunk-7210.txt


 Give nodetool ring the option of either displaying IPs or hostnames for the 
 nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-15 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-7210:
-

Attachment: 2.0-7210-2.txt

Oh, sorry. I debugged via Eclipse, so I didn't notice that.
I've now fixed it and checked it via bin/nodetool.

 Add --resolve-ip option on 'nodetool ring'
 --

 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Assignee: Yasuharu Goto
Priority: Trivial
 Fix For: 2.0.9, 2.1 rc1

 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210.txt


 Give nodetool ring the option of either displaying IPs or hostnames for the 
 nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2014-05-12 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-2238:
-

Attachment: trunk-2238.txt

This issue has already been fixed for 'nodetool status'.
I'd like to add the same --resolve-ip option to 'nodetool ring' too.
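This is not the attached patch (trunk-2238.txt); just a minimal sketch, assuming a plain reverse-DNS lookup via java.net.InetAddress, of the resolution such a --resolve-ip option could perform per node. The class name and fallback behaviour are illustrative:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveIp {
    /**
     * Resolve an IP string to a hostname via reverse DNS, falling back to the
     * input when the address cannot be parsed. Note getCanonicalHostName()
     * itself falls back to the textual IP when no PTR record exists.
     */
    static String resolve(String ip) {
        try {
            return InetAddress.getByName(ip).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return ip; // unresolvable input: display it unchanged
        }
    }

    public static void main(String[] args) {
        // e.g. "localhost" on most systems
        System.out.println(resolve("127.0.0.1"));
    }
}
```

Since each lookup can block on DNS, a real nodetool change would presumably resolve addresses once per ring listing rather than per row.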

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Assignee: Daneel S. Yaitskov
Priority: Trivial
 Fix For: 1.2.14, 2.0.5

 Attachments: trunk-2238.txt


 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-2238) Allow nodetool to print out hostnames given an option

2014-05-12 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995272#comment-13995272
 ] 

Yasuharu Goto commented on CASSANDRA-2238:
--

OK, I opened a new ticket, CASSANDRA-7210. Thank you.

 Allow nodetool to print out hostnames given an option
 -

 Key: CASSANDRA-2238
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2238
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Joaquin Casares
Assignee: Daneel S. Yaitskov
Priority: Trivial
 Fix For: 1.2.14, 2.0.5

 Attachments: trunk-2238.txt


 Give nodetool the option of either displaying IPs or hostnames for the nodes 
 in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-12 Thread Yasuharu Goto (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuharu Goto updated CASSANDRA-7210:
-

Attachment: trunk-7210.txt

CASSANDRA-2238 added a --resolve-ip option that allows 'nodetool status' to print 
out the hostnames of nodes.
I'd like to add this option to 'nodetool ring' too.

 Add --resolve-ip option on 'nodetool ring'
 --

 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Priority: Trivial
 Attachments: trunk-7210.txt


 Give nodetool ring the option of either displaying IPs or hostnames for the 
 nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'

2014-05-12 Thread Yasuharu Goto (JIRA)
Yasuharu Goto created CASSANDRA-7210:


 Summary: Add --resolve-ip option on 'nodetool ring'
 Key: CASSANDRA-7210
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Yasuharu Goto
Priority: Trivial


Give nodetool ring the option of either displaying IPs or hostnames for the 
nodes in a ring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)