[jira] [Comment Edited] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075732#comment-15075732
 ] 

Peter Kovgan edited comment on CASSANDRA-10937 at 12/31/15 5:30 AM:


DSE 4.8.3 failed with OOM after 48 hours of work.
Frustration is a weak word for what I feel


was (Author: tierhetze):
DSE 4.8.3 failed with OOM after 48 hours of work.
Frustration is a weak word fr what I feel

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 
> to 5000ms, because some requests failed due to the short timeout.
> Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
> (logs of all failed nodes attached)
> I also attach the Java load client and instructions on how to set it up and use 
> it (test2.rar).
> Update:
> Later the test was repeated with a lighter load (10 mes/sec), a more relaxed 
> CPU (25% idle), and only 2 test clients, but the test failed anyway.
> Update:
> DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
> not 10-12.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains additional node logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

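For readers without the attachments, here is a minimal, hypothetical sketch of the kind of 
load client described in this issue (1000 threads per client, four prepared inserts 
assigned round-robin, consistency ONE), using the DataStax Java driver 3.x; the contact 
point, keyspace, table and column names are placeholders, not the client shipped in 
test2.rar:

{code}
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

import java.nio.ByteBuffer;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch only: contact point, keyspace, table and column names
// are placeholders and do not come from the attached test client.
public final class InsertLoadSketch
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("oblrepository_ny");

        // Four distinct prepared inserts, as in the reported test.
        PreparedStatement[] inserts = new PreparedStatement[4];
        for (int i = 0; i < inserts.length; i++)
            inserts[i] = session.prepare("INSERT INTO table_" + i + " (id, info, payload) VALUES (?, ?, ?)");

        byte[] blob = new byte[256];              // the ~256-byte BLOB field
        AtomicLong counter = new AtomicLong();    // global round-robin counter
        ExecutorService pool = Executors.newFixedThreadPool(1000);

        for (int t = 0; t < 1000; t++)
        {
            pool.execute(() -> {
                // Runs until the process is killed, like a load client.
                while (true)
                {
                    // Pick one of the four inserts in round-robin order.
                    PreparedStatement ps = inserts[(int) (counter.getAndIncrement() % inserts.length)];
                    BoundStatement bound = ps.bind(UUID.randomUUID(), "short-string", ByteBuffer.wrap(blob));
                    bound.setConsistencyLevel(ConsistencyLevel.ONE);
                    session.execute(bound);
                }
            });
        }
    }
}
{code}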

[jira] [Commented] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075732#comment-15075732
 ] 

Peter Kovgan commented on CASSANDRA-10937:
--

DSE 4.8.3 failed with OOM after 48 hours of work.
Frustration is a weak word for what I feel

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 
> to 5000ms, because some requests failed due to the short timeout.
> Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
> (logs of all failed nodes attached)
> I also attach the Java load client and instructions on how to set it up and use 
> it (test2.rar).
> Update:
> Later the test was repeated with a lighter load (10 mes/sec), a more relaxed 
> CPU (25% idle), and only 2 test clients, but the test failed anyway.
> Update:
> DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
> not 10-12.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains additional node logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: close log file stream

2015-12-30 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk 2a6aa8cfb -> bcbb53b7d


close log file stream


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bcbb53b7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bcbb53b7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bcbb53b7

Branch: refs/heads/trunk
Commit: bcbb53b7deb9d45a2833eb4755a2f39860ae05b6
Parents: 2a6aa8c
Author: Dave Brosius 
Authored: Wed Dec 30 21:28:54 2015 -0500
Committer: Dave Brosius 
Committed: Wed Dec 30 21:28:54 2015 -0500

--
 .../apache/cassandra/stress/StressGraph.java| 34 ++--
 1 file changed, 17 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bcbb53b7/tools/stress/src/org/apache/cassandra/stress/StressGraph.java
--
diff --git a/tools/stress/src/org/apache/cassandra/stress/StressGraph.java b/tools/stress/src/org/apache/cassandra/stress/StressGraph.java
index 4e7fd12..ebaa0ae 100644
--- a/tools/stress/src/org/apache/cassandra/stress/StressGraph.java
+++ b/tools/stress/src/org/apache/cassandra/stress/StressGraph.java
@@ -234,28 +234,28 @@ public class StressGraph
 
     private JSONObject createJSONStats(JSONObject json)
     {
-        JSONArray stats;
-        if (json == null)
+        try (InputStream logStream = new FileInputStream(stressSettings.graph.temporaryLogFile))
         {
-            json = new JSONObject();
-            stats = new JSONArray();
-        }
-        else
-        {
-            stats = (JSONArray) json.get("stats");
-        }
+            JSONArray stats;
+            if (json == null)
+            {
+                json = new JSONObject();
+                stats = new JSONArray();
+            }
+            else
+            {
+                stats = (JSONArray) json.get("stats");
+            }
 
-        try
-        {
-            stats = parseLogStats(new FileInputStream(stressSettings.graph.temporaryLogFile), stats);
+            stats = parseLogStats(logStream, stats);
+
+            json.put("title", stressSettings.graph.title);
+            json.put("stats", stats);
+            return json;
         }
-        catch (FileNotFoundException e)
+        catch (IOException e)
         {
             throw new RuntimeException(e);
         }
-
-        json.put("title", stressSettings.graph.title);
-        json.put("stats", stats);
-        return json;
     }
 }

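The patch above wraps the whole body of createJSONStats() in a try-with-resources block, 
so the temporary log file stream is closed on every exit path, including when parsing 
throws. A minimal sketch of the same idiom, with a hypothetical readStats() helper 
standing in for parseLogStats():

{code}
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public final class CloseOnAllPaths
{
    // Hypothetical helper standing in for parseLogStats(); it just drains the stream.
    static int readStats(InputStream in) throws IOException
    {
        int bytes = 0;
        while (in.read() != -1)
            bytes++;
        return bytes;
    }

    static int loadStats(String logFile)
    {
        // The stream is declared in the try header, so it is closed automatically
        // whether readStats() returns normally or throws.
        try (InputStream logStream = new FileInputStream(logFile))
        {
            return readStats(logStream);
        }
        catch (IOException e)
        {
            // Same shape as the patch: wrap the checked I/O failure unchecked.
            throw new RuntimeException(e);
        }
    }
}
{code}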


[jira] [Updated] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kovgan updated CASSANDRA-10937:
-
Attachment: test5.rar

test5.rar contains the logs of the failed DSE-4.8.3 nodes.
Unfortunately, I have not found any .hprof files.

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar, test5.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 
> to 5000ms, because some requests failed due to the short timeout.
> Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
> (logs of all failed nodes attached)
> I also attach the Java load client and instructions on how to set it up and use 
> it (test2.rar).
> Update:
> Later the test was repeated with a lighter load (10 mes/sec), a more relaxed 
> CPU (25% idle), and only 2 test clients, but the test failed anyway.
> Update:
> DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
> not 10-12.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains additional node logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taiyuan Zhang updated CASSANDRA-10958:
--
Description: 
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code:shell}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index 
found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.


  was:
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.



> Range query using secondary index returns weird results
> ---
>
> Key: CASSANDRA-10958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Taiyuan Zhang
>Priority: Minor
>
>  I'm playing with Cassandra 3. I added a secondary index on a column of 
> integer, then I want to do a range query. First it threw an error:
> {code:shell}
> InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
> index found for the non primary key columns restrictions"
> {code}
> So I added 'Allow Filtering'
> cqlsh:mykeyspace> SELECT * FROM test ;
> id | id2 | age | extra
> +-+-+---
>   1 |   1 |   1 | 1
>   2 |   2 |   2 | 2
> (2 rows)
> cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
> cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;
>  id | id2  | age | extra
> +--+-+---
>   1 |1 |   1 | 1
>   2 | null |   2 |  null
> (2 rows)
> My schema is:
> CREATE TABLE mykeyspace.test (
> id int,
> id2 int,
> age int static,
> extra int,
> PRIMARY KEY (id, id2)
> ) 
> It certainly looks like a BUG to me, even though it has a chance to be 
> something by-design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taiyuan Zhang updated CASSANDRA-10958:
--
Description: 
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index 
found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

{code}
cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)
{code}
My schema is:
{code}
CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 
{code}

I don't know if this is by design or not, but it really does look like a BUG to 
me.


  was:
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index 
found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

{code}
cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)
{code}
My schema is:
{code}
CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 
{code}

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.



> Range query using secondary index returns weird results
> ---
>
> Key: CASSANDRA-10958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Taiyuan Zhang
>Priority: Minor
>
>  I'm playing with Cassandra 3. I added a secondary index on a column of 
> integer, then I want to do a range query. First it threw an error:
> {code}
> InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
> index found for the non primary key columns restrictions"
> {code}
> So I added 'Allow Filtering'
> {code}
> cqlsh:mykeyspace> SELECT * FROM test ;
> id | id2 | age | extra
> +-+-+---
>   1 |   1 |   1 | 1
>   2 |   2 |   2 | 2
> (2 rows)
> cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
> cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;
>  id | id2  | age | extra
> +--+-+---
>   1 |1 |   1 | 1
>   2 | null |   2 |  null
> (2 rows)
> {code}
> My schema is:
> {code}
> CREATE TABLE mykeyspace.test (
> id int,
> id2 int,
> age int static,
> extra int,
> PRIMARY KEY (id, id2)
> ) 
> {code}
> I don't know if this is by design or not, but it really does look like a BUG 
> to me.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taiyuan Zhang updated CASSANDRA-10958:
--
Description: 
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index 
found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

{code}
cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)
{code}
My schema is:
{code}
CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 
{code}

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.


  was:
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code:shell}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary index 
found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.



> Range query using secondary index returns weird results
> ---
>
> Key: CASSANDRA-10958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Taiyuan Zhang
>Priority: Minor
>
>  I'm playing with Cassandra 3. I added a secondary index on a column of 
> integer, then I want to do a range query. First it threw an error:
> {code}
> InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
> index found for the non primary key columns restrictions"
> {code}
> So I added 'Allow Filtering'
> {code}
> cqlsh:mykeyspace> SELECT * FROM test ;
> id | id2 | age | extra
> +-+-+---
>   1 |   1 |   1 | 1
>   2 |   2 |   2 | 2
> (2 rows)
> cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
> cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;
>  id | id2  | age | extra
> +--+-+---
>   1 |1 |   1 | 1
>   2 | null |   2 |  null
> (2 rows)
> {code}
> My schema is:
> {code}
> CREATE TABLE mykeyspace.test (
> id int,
> id2 int,
> age int static,
> extra int,
> PRIMARY KEY (id, id2)
> ) 
> {code}
> It certainly looks like a BUG to me, even though it has a chance to be 
> something by-design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taiyuan Zhang updated CASSANDRA-10958:
--
Description: 
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

{code}
InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"
{code}

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.


  was:
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.



> Range query using secondary index returns weird results
> ---
>
> Key: CASSANDRA-10958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Taiyuan Zhang
>Priority: Minor
>
>  I'm playing with Cassandra 3. I added a secondary index on a column of 
> integer, then I want to do a range query. First it threw an error:
> {code}
> InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
> index found for the non primary key columns restrictions"
> {code}
> So I added 'Allow Filtering'
> cqlsh:mykeyspace> SELECT * FROM test ;
> id | id2 | age | extra
> +-+-+---
>   1 |   1 |   1 | 1
>   2 |   2 |   2 | 2
> (2 rows)
> cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
> cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;
>  id | id2  | age | extra
> +--+-+---
>   1 |1 |   1 | 1
>   2 | null |   2 |  null
> (2 rows)
> My schema is:
> CREATE TABLE mykeyspace.test (
> id int,
> id2 int,
> age int static,
> extra int,
> PRIMARY KEY (id, id2)
> ) 
> It certainly looks like a BUG to me, even though it has a chance to be 
> something by-design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kovgan updated CASSANDRA-10937:
-
Description: 
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

Update:

DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
not 10-12.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs






  was:
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

Update:

DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
not 10-12.

If we do not find a resolution soon, we will consider some other DB.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs







> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> 

[jira] [Comment Edited] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075748#comment-15075748
 ] 

Peter Kovgan edited comment on CASSANDRA-10937 at 12/31/15 6:12 AM:


test5.rar contains the logs of the failed DSE-4.8.3 nodes.



was (Author: tierhetze):
test5.rar contains the logs of the failed DSE-4.8.3 nodes.
Unfortunately, I have not found any .hprof files.

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar, test5.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 
> to 5000ms, because some requests failed due to the short timeout.
> Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
> (logs of all failed nodes attached)
> I also attach the Java load client and instructions on how to set it up and use 
> it (test2.rar).
> Update:
> Later the test was repeated with a lighter load (10 mes/sec), a more relaxed 
> CPU (25% idle), and only 2 test clients, but the test failed anyway.
> Update:
> DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
> not 10-12.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains additional node logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taiyuan Zhang updated CASSANDRA-10958:
--
Description: 
 I'm playing with Cassandra 3. I added a secondary index on a column of 
integer, then I want to do a range query. First it threw an error:

InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.


  was:
I'm playing with Cassandra 3. I added a secondary index on a column of integer, 
then I want to do a range query. First it threw an error:

InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.



> Range query using secondary index returns weird results
> ---
>
> Key: CASSANDRA-10958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Taiyuan Zhang
>Priority: Minor
>
>  I'm playing with Cassandra 3. I added a secondary index on a column of 
> integer, then I want to do a range query. First it threw an error:
> InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
> index found for the non primary key columns restrictions"
> So I added 'Allow Filtering'
> cqlsh:mykeyspace> SELECT * FROM test ;
> id | id2 | age | extra
> +-+-+---
>   1 |   1 |   1 | 1
>   2 |   2 |   2 | 2
> (2 rows)
> cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
> cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;
>  id | id2  | age | extra
> +--+-+---
>   1 |1 |   1 | 1
>   2 | null |   2 |  null
> (2 rows)
> My schema is:
> CREATE TABLE mykeyspace.test (
> id int,
> id2 int,
> age int static,
> extra int,
> PRIMARY KEY (id, id2)
> ) 
> It certainly looks like a BUG to me, even though it has a chance to be 
> something by-design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10958) Range query using secondary index returns weird results

2015-12-30 Thread Taiyuan Zhang (JIRA)
Taiyuan Zhang created CASSANDRA-10958:
-

 Summary: Range query using secondary index returns weird results
 Key: CASSANDRA-10958
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10958
 Project: Cassandra
  Issue Type: Bug
Reporter: Taiyuan Zhang
Priority: Minor


I'm playing with Cassandra 3. I added a secondary index on a column of integer, 
then I want to do a range query. First it threw an error:

InvalidRequest: code=2200 [Invalid query] message="No supported secondary 
index found for the non primary key columns restrictions"

So I added 'Allow Filtering'

cqlsh:mykeyspace> SELECT * FROM test ;

id | id2 | age | extra
+-+-+---
  1 |   1 |   1 | 1
  2 |   2 |   2 | 2

(2 rows)
cqlsh:mykeyspace > CREATE INDEX test_age on test (extra) ;
cqlsh:mykeyspace > select * FROM test WHERE extra < 2 ALLOW FILTERING ;

 id | id2  | age | extra
+--+-+---
  1 |1 |   1 | 1
  2 | null |   2 |  null

(2 rows)

My schema is:

CREATE TABLE mykeyspace.test (
id int,
id2 int,
age int static,
extra int,
PRIMARY KEY (id, id2)
) 

It certainly looks like a BUG to me, even though it has a chance to be 
something by-design.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10953) Make all timeouts configurable via nodetool and jmx

2015-12-30 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-10953:
-
Attachment: 10953-2.1.txt

> Make all timeouts configurable via nodetool and jmx
> ---
>
> Key: CASSANDRA-10953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10953
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
> Fix For: 2.1.13
>
> Attachments: 10953-2.1.txt
>
>
> Specifically I was interested in being able to monitor and set 
> stream_socket_timeout_in_ms from either (or both) nodetool and JMX. 
> Chatting with [~thobbs] and [~jeromatron] we suspect it would also be useful 
> to be able to view and edit other C* timeouts via nodetool and JMX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

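For illustration, viewing and changing such a timeout over JMX would look roughly like 
the sketch below. Port 7199 is Cassandra's default JMX port and the MBean name matches 
Cassandra's StorageService, but the attribute name used here is only an assumption about 
what the patch might expose, not a confirmed attribute:

{code}
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public final class TimeoutJmxSketch
{
    public static void main(String[] args) throws Exception
    {
        // 7199 is Cassandra's default JMX port.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService = new ObjectName("org.apache.cassandra.db:type=StorageService");

            // "StreamingSocketTimeout" is an assumed attribute name standing in for
            // stream_socket_timeout_in_ms; the real name depends on the patch.
            Object current = mbs.getAttribute(storageService, "StreamingSocketTimeout");
            System.out.println("current timeout (ms): " + current);

            mbs.setAttribute(storageService, new Attribute("StreamingSocketTimeout", 60000));
        }
        finally
        {
            connector.close();
        }
    }
}
{code}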

[jira] [Updated] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kovgan updated CASSANDRA-10937:
-
Description: 
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

Update:

DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
not 10-12.

If we do not find a resolution soon, we will consider some other DB.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs






  was:
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

Update:

DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
not 10-12.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs







> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> 

[jira] [Updated] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kovgan updated CASSANDRA-10937:
-
Description: 
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

Update:

DSE-4.8.3 also failed with OOM (3 nodes out of 8), but here it survived 48 hours, 
not 10-12.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs






  was:
8 cassandra nodes.

Load test started with 4 clients(different and not equal machines), each 
running 1000 threads.
Each thread assigned in round-robin way to run one of 4 different inserts. 
Consistency->ONE.

I attach the full CQL schema of tables and the query of insert.

Replication factor - 2:
create keyspace OBLREPOSITORY_NY with replication = 
{'class':'NetworkTopologyStrategy','NY':2};

Initial throughput is:
215,000 inserts/sec
or
54 MB/sec, considering a single insert is a bit larger than 256 bytes.

Data:
all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.

After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 
5000ms, because some requests failed due to the short timeout.

Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
(logs of all failed nodes attached)

I also attach the Java load client and instructions on how to set it up and use 
it (test2.rar).

Update:

Later the test was repeated with a lighter load (10 mes/sec), a more relaxed CPU 
(25% idle), and only 2 test clients, but the test failed anyway.

At the end (29/12/16) I tested DSE-4.8.3 with a load of 100,000 mes/sec and it 
survived. The same installation pattern.

I think this is a bug, because the OOM happens at a later stage, when the system 
has been running for about 10 hours and the accumulated data on each node is about 
250Gb. It is a problem that grows with time. Definitely.

I also noticed that DSE sometimes "slows down" a client a bit; maybe it has a 
better mechanism to manage load by not allowing too extensive a load.

Attachments:
test2.rar - contains most of the material
more-logs.rar - contains additional node logs







> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace 

[jira] [Updated] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Kovgan updated CASSANDRA-10937:
-
Summary: OOM on multiple nodes on write load (v. 3.0.0), problem also 
present on DSE-4.8.3, but there it survives more time  (was: OOM on multiple 
nodes on write load (v. 3.0.0), problem absent on DSE-4.8.3, so it is a bug!)

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, considering a single insert is a bit larger than 256 bytes.
> Data:
> all fields (5-6) are short strings, except one, which is a BLOB of 256 bytes.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 
> to 5000ms, because some requests failed due to the short timeout.
> Later on (after approx. 12 hours of work) OOM happened on multiple nodes.
> (logs of all failed nodes attached)
> I also attach the Java load client and instructions on how to set it up and use 
> it (test2.rar).
> Update:
> Later the test was repeated with a lighter load (10 mes/sec), a more relaxed 
> CPU (25% idle), and only 2 test clients, but the test failed anyway.
> At the end (29/12/16) I tested DSE-4.8.3 with a load of 100,000 mes/sec and it 
> survived. The same installation pattern.
> I think this is a bug, because the OOM happens at a later stage, when the system 
> has been running for about 10 hours and the accumulated data on each node is 
> about 250Gb. It is a problem that grows with time. Definitely.
> I also noticed that DSE sometimes "slows down" a client a bit; maybe it has a 
> better mechanism to manage load by not allowing too extensive a load.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains additional node logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-10938:
-
Attachment: (was: 6452.png)

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 7300a.png, 7300b.png, node1_debug.log, node2_debug.log, 
> node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-10938:
-
Attachment: 7300.nps
6452.nps

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.nps, 6452.png, 7300.nps, 7300a.png, 7300b.png, 
> node1_debug.log, node2_debug.log, node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10955) Multi-partitions queries with ORDER BY can result in a NPE

2015-12-30 Thread Benjamin Lerer (JIRA)
Benjamin Lerer created CASSANDRA-10955:
--

 Summary: Multi-partitions queries with ORDER BY can result in a NPE
 Key: CASSANDRA-10955
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10955
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
Reporter: Benjamin Lerer
Assignee: Benjamin Lerer


In the case of a table with static columns, if only the static columns have 
been set for some partitions, a multi-partition query with an {{ORDER BY}} can 
cause an {{NPE}}.

The following unit test can be used to reproduce the problem:
{code}
@Test
public void testOrderByForInClauseWithNullValue() throws Throwable
{
    createTable("CREATE TABLE %s (a int, b int, c int, s int static, d int, PRIMARY KEY (a, b, c))");

    execute("INSERT INTO %s (a, b, c, d) VALUES (1, 1, 1, 1)");
    execute("INSERT INTO %s (a, b, c, d) VALUES (1, 1, 2, 1)");
    execute("INSERT INTO %s (a, b, c, d) VALUES (2, 2, 1, 1)");
    execute("INSERT INTO %s (a, b, c, d) VALUES (2, 2, 2, 1)");

    execute("UPDATE %s SET s = 1 WHERE a = 1");
    execute("UPDATE %s SET s = 2 WHERE a = 2");
    execute("UPDATE %s SET s = 3 WHERE a = 3");

    assertRows(execute("SELECT a, b, c, d, s FROM %s WHERE a IN (1, 2, 3) ORDER BY b DESC"),
               row(2, 2, 2, 1, 2),
               row(2, 2, 1, 1, 2),
               row(1, 1, 2, 1, 1),
               row(1, 1, 1, 1, 1),
               row(3, null, null, null, 3));
}
{code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-10938:
-
Attachment: 7300b.png
7300a.png

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.png, 7300a.png, 7300b.png, node1_debug.log, 
> node2_debug.log, node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-10938:
-
Attachment: 6452.png

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.png, 7300a.png, 7300b.png, node1_debug.log, 
> node2_debug.log, node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075251#comment-15075251
 ] 

Stefania commented on CASSANDRA-10938:
--

The test failed again on Jenkins, and eventually I got it to fail again locally 
as well, even after switching to UNLOGGED batching. A closer comparative 
analysis of CPU hot-spots now points to {{ServerConnection.getQueryState()}}; 
see the attached 7300 screenshots and the nps files for 7300 and 6452.

{{StorageProxy.getBatchlogEndpoints()}} was a false positive.

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.nps, 6452.png, 7300.nps, 7300a.png, 7300b.png, 
> node1_debug.log, node2_debug.log, node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8708) inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited

2015-12-30 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075292#comment-15075292
 ] 

Ariel Weisberg commented on CASSANDRA-8708:
---

Local results don't matter; only what Cassci thinks matters, because some things 
will persistently pass locally but fail in Cassci. What you can do is look at 
the test history for the branch you started from, and if it is failing there as 
well we don't let it block commit right now.

I checked the 2.1 branch and it seems to be about as good as the unmodified 2.1 
branch. Will do the same for the others once they finish.

> inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited
> -
>
> Key: CASSANDRA-8708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8708
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Adam Hattrell
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
> Fix For: 2.1.x, 2.2.x
>
> Attachments: 8708-2.1-jmx-nodetool-bulkloader-v2.txt, 
> 8708-2.1-jmx-nodetool-bulkloader.txt, 8708-2.1-with-jmx-nodetool.txt, 
> 8708-2.1.txt, 8708-2.2.txt
>
>
> inter_dc_stream_throughput_outbound_megabits_per_sec was introduced in 
> CASSANDRA-6596.
> There's some discussion in the ticket of the intention to link the default to 
> whatever stream_throughput_outbound_megabits_per_sec is set to.  
> However, it looks like it's just set to 0 - from 
> /src/java/org/apache/cassandra/config/Config.java
> This is a bit of a pain - usually folks want to set the inter dc limits lower 
> than the base streaming figure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8708) inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited

2015-12-30 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075371#comment-15075371
 ] 

Philip Thompson commented on CASSANDRA-8708:


[~aweisberg], you can use 
http://cassci.datastax.com/userContent/cstar_report/index.html?jobs=cassandra-2.1_dtest,jeromatron-8708-jeremy-dtest,jeromatron-8708-2.2-dtest,jeromatron-8708-3.0-dtest
 to compare the results of the various jobs Jeremy set up. I agree that no 
failures look related.

> inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited
> -
>
> Key: CASSANDRA-8708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8708
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Adam Hattrell
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
> Fix For: 2.1.x, 2.2.x
>
> Attachments: 8708-2.1-jmx-nodetool-bulkloader-v2.txt, 
> 8708-2.1-jmx-nodetool-bulkloader.txt, 8708-2.1-with-jmx-nodetool.txt, 
> 8708-2.1.txt, 8708-2.2.txt
>
>
> inter_dc_stream_throughput_outbound_megabits_per_sec was introduced in 
> CASSANDRA-6596.
> There's some discussion in the ticket of the intention to link the default to 
> whatever stream_throughput_outbound_megabits_per_sec is set to.  
> However, it looks like it's just set to 0 - from 
> /src/java/org/apache/cassandra/config/Config.java
> This is a bit of a pain - usually folks want to set the inter dc limits lower 
> than the base streaming figure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10956) Enable authentication of native protocol users via client certificates

2015-12-30 Thread Samuel Klock (JIRA)
Samuel Klock created CASSANDRA-10956:


 Summary: Enable authentication of native protocol users via client 
certificates
 Key: CASSANDRA-10956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10956
 Project: Cassandra
  Issue Type: New Feature
Reporter: Samuel Klock
Assignee: Samuel Klock


Currently, the native protocol only supports user authentication via SASL.  
While this is adequate for many use cases, it may be superfluous in scenarios 
where clients are required to present an SSL certificate to connect to the 
server.  If the certificate presented by a client is sufficient by itself to 
specify a user, then an additional (series of) authentication step(s) via SASL 
merely add overhead.  Worse, for uses wherein it's desirable to obtain the 
identity from the client's certificate, it's necessary to implement a custom 
SASL mechanism to do so, which increases the effort required to maintain both 
client and server and which also duplicates functionality already provided via 
SSL/TLS.

Cassandra should provide a means of using certificates for user authentication 
in the native protocol without any effort above configuring SSL on the client 
and server.  Here's a possible strategy:

* Add a new authenticator interface that returns {{AuthenticatedUser}} objects 
based on the certificate chain presented by the client.
* If this interface is in use, the user is authenticated immediately after the 
server receives the {{STARTUP}} message.  It then responds with a {{READY}} 
message.
* Otherwise, the existing flow of control is used (i.e., if the authenticator 
requires authentication, then an {{AUTHENTICATE}} message is sent to the 
client).

One advantage of this strategy is that it is backwards-compatible with existing 
schemes; current users of SASL/{{IAuthenticator}} are not impacted.  Moreover, 
it can function as a drop-in replacement for SASL schemes without requiring 
code changes (or even config changes) on the client side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10956) Enable authentication of native protocol users via client certificates

2015-12-30 Thread Samuel Klock (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samuel Klock updated CASSANDRA-10956:
-
Attachment: 10956.patch

Attaching a patch with a first-pass implementation of the proposal, along with 
a new concrete authenticator {{CommonNameCertificateAuthenticator}} that uses 
the CN field in the client certificate's subject as the Cassandra username.
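
For readers following the thread, here is a minimal sketch of the idea. This is not the 
attached patch: apart from the JDK types, the interface and class names below are 
illustrative placeholders only.

{code}
// Hedged sketch: derive the Cassandra user from the client certificate chain.
// Only the JDK types are real; the interface and the AuthenticatedUser stand-in
// are placeholders for whatever the actual patch defines.
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

interface CertificateAuthenticator
{
    AuthenticatedUser authenticate(X509Certificate[] chain) throws CertificateException;
}

// Stand-in for org.apache.cassandra.auth.AuthenticatedUser.
final class AuthenticatedUser
{
    final String name;
    AuthenticatedUser(String name) { this.name = name; }
}

// Uses the CN of the leaf certificate's subject as the username.
final class CommonNameCertificateAuthenticator implements CertificateAuthenticator
{
    public AuthenticatedUser authenticate(X509Certificate[] chain) throws CertificateException
    {
        if (chain == null || chain.length == 0)
            throw new CertificateException("No client certificate presented");

        // RFC 2253 subject DN of the leaf certificate, e.g. "CN=alice,OU=dev,O=example"
        String dn = chain[0].getSubjectX500Principal().getName();
        try
        {
            for (Rdn rdn : new LdapName(dn).getRdns())
                if ("CN".equalsIgnoreCase(rdn.getType()))
                    return new AuthenticatedUser(rdn.getValue().toString());
        }
        catch (InvalidNameException e)
        {
            throw new CertificateException("Malformed subject DN: " + dn, e);
        }
        throw new CertificateException("Subject DN contains no CN: " + dn);
    }
}
{code}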

> Enable authentication of native protocol users via client certificates
> --
>
> Key: CASSANDRA-10956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10956
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Samuel Klock
>Assignee: Samuel Klock
> Attachments: 10956.patch
>
>
> Currently, the native protocol only supports user authentication via SASL.  
> While this is adequate for many use cases, it may be superfluous in scenarios 
> where clients are required to present an SSL certificate to connect to the 
> server.  If the certificate presented by a client is sufficient by itself to 
> specify a user, then an additional (series of) authentication step(s) via 
> SASL merely add overhead.  Worse, for uses wherein it's desirable to obtain 
> the identity from the client's certificate, it's necessary to implement a 
> custom SASL mechanism to do so, which increases the effort required to 
> maintain both client and server and which also duplicates functionality 
> already provided via SSL/TLS.
> Cassandra should provide a means of using certificates for user 
> authentication in the native protocol without any effort above configuring 
> SSL on the client and server.  Here's a possible strategy:
> * Add a new authenticator interface that returns {{AuthenticatedUser}} 
> objects based on the certificate chain presented by the client.
> * If this interface is in use, the user is authenticated immediately after 
> the server receives the {{STARTUP}} message.  It then responds with a 
> {{READY}} message.
> * Otherwise, the existing flow of control is used (i.e., if the authenticator 
> requires authentication, then an {{AUTHENTICATE}} message is sent to the 
> client).
> One advantage of this strategy is that it is backwards-compatible with 
> existing schemes; current users of SASL/{{IAuthenticator}} are not impacted.  
> Moreover, it can function as a drop-in replacement for SASL schemes without 
> requiring code changes (or even config changes) on the client side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10953) Make all timeouts configurable via nodetool and jmx

2015-12-30 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075536#comment-15075536
 ] 

Jeremy Hanna commented on CASSANDRA-10953:
--

[~sebastian.este...@datastax.com] [~thobbs] I have a patch that allows for 
getting/setting streaming socket timeout in jmx and nodetool as well as the rpc 
and cas contention timeouts via jmx.  Do we want to set/get the timeouts via 
nodetool as well?  That's an additional 14 commands in nodetool. 

> Make all timeouts configurable via nodetool and jmx
> ---
>
> Key: CASSANDRA-10953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10953
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
>
> Specifically I was interested in being able to monitor and set 
> stream_socket_timeout_in_ms from either (or both) nodetool and JMX. 
> Chatting with [~thobbs] and [~jeromatron] we suspect it would also be useful 
> to be able to view and edit other C* timeouts via nodetool and JMX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10953) Make all timeouts configurable via nodetool and jmx

2015-12-30 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075544#comment-15075544
 ] 

Tyler Hobbs commented on CASSANDRA-10953:
-

Perhaps we could condense them into a single {{nodetool settimeout}} command 
with a second argument for the specific timeout to set?
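
Purely as an illustration of that shape (the class and the setter wiring below are 
assumptions, not the actual nodetool/JMX code), the condensed command could boil down 
to a name-keyed dispatch:

{code}
// Hypothetical sketch of a single "settimeout <type> <ms>" command keyed by
// timeout name. The setter lambdas stand in for whatever JMX operations the
// patch exposes; the names are illustrative.
import java.util.Map;
import java.util.function.LongConsumer;

final class SetTimeoutCommand
{
    // e.g. "read", "write", "range", "cascontention", "streamingsocket", ...
    private final Map<String, LongConsumer> settersByName;

    SetTimeoutCommand(Map<String, LongConsumer> settersByName)
    {
        this.settersByName = settersByName;
    }

    void execute(String timeoutType, long timeoutInMs)
    {
        LongConsumer setter = settersByName.get(timeoutType);
        if (setter == null)
            throw new IllegalArgumentException("Unknown timeout type: " + timeoutType);
        setter.accept(timeoutInMs);
    }
}
{code}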

> Make all timeouts configurable via nodetool and jmx
> ---
>
> Key: CASSANDRA-10953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10953
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
>
> Specifically I was interested in being able to monitor and set 
> stream_socket_timeout_in_ms from either (or both) nodetool and JMX. 
> Chatting with [~thobbs] and [~jeromatron] we suspect it would also be useful 
> to be able to view and edit other C* timeouts via nodetool and JMX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10953) Make all timeouts configurable via nodetool and jmx

2015-12-30 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075552#comment-15075552
 ] 

Jeremy Hanna commented on CASSANDRA-10953:
--

Good thought - I'll look into doing that.  Thanks Tyler.

> Make all timeouts configurable via nodetool and jmx
> ---
>
> Key: CASSANDRA-10953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10953
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sebastian Estevez
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
>
> Specifically I was interested in being able to monitor and set 
> stream_socket_timeout_in_ms from either (or both) nodetool and JMX. 
> Chatting with [~thobbs] and [~jeromatron] we suspect it would also be useful 
> to be able to view and edit other C* timeouts via nodetool and JMX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2015-12-30 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075572#comment-15075572
 ] 

 Brian Hess commented on CASSANDRA-10050:
-

I'm a little confused here.  [~beobal], why would the performance get slower as 
the tokens got larger, though?  Wouldn't it be just as inefficient to read the 
first bit of the token range rather than the last bit of the token range?  
Seems like something else is going on here, no?

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Alexander Spitzer
> Fix For: 3.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see a huge amount of time when the C* server is not 
> doing any work and the queries seem to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to pushdown relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the queried tokenRange starts, 
> with only 2 seconds for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2015-12-30 Thread Brian Hess (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hess updated CASSANDRA-10050:
---
Comment: was deleted

(was: I'm a little confused here.  [~beobal], why would the performance get 
slower as the tokens got larger, though?  Wouldn't it be just as inefficient to 
read the first bit of the token range rather than the last bit of the token 
range?  Seems like something else is going on here, no?)

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Alexander Spitzer
> Fix For: 3.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see a huge amount of time when the C* server is not 
> doing any work and the queries seem to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to pushdown relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the queried tokenRange starts, 
> with only 2 seconds for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8708) inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited

2015-12-30 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075453#comment-15075453
 ] 

Jeremy Hanna commented on CASSANDRA-8708:
-

Looks like 2.2, 3.0, and trunk are getting the same failures on cassci.  2.1 is 
running now.

> inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited
> -
>
> Key: CASSANDRA-8708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8708
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Adam Hattrell
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
> Fix For: 2.1.x, 2.2.x
>
> Attachments: 8708-2.1-jmx-nodetool-bulkloader-v2.txt, 
> 8708-2.1-jmx-nodetool-bulkloader.txt, 8708-2.1-with-jmx-nodetool.txt, 
> 8708-2.1.txt, 8708-2.2.txt
>
>
> inter_dc_stream_throughput_outbound_megabits_per_sec was introduced in 
> CASSANDRA-6596.
> There's some discussion in the ticket of the intention to link the default to 
> whatever stream_throughput_outbound_megabits_per_sec is set to.  
> However, it looks like it's just set to 0 - from 
> /src/java/org/apache/cassandra/config/Config.java
> This is a bit of a pain - usually folks want to set the inter dc limits lower 
> than the base streaming figure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10844) failed_bootstrap_wiped_node_can_join_test is failing

2015-12-30 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075465#comment-15075465
 ] 

Joel Knighton commented on CASSANDRA-10844:
---

In [CASSANDRA-7069], we added consistent range movement to prevent concurrent 
bootstraps/decommissions.

We do this in the {{checkForEndpointCollision}} method.

In [CASSANDRA-7939], we observed that this prevented immediately retrying a 
failed bootstrap. In order to avoid this, we switched to checking if a node is 
a fat client and not checking {{isSafeForBootstrap}} in this situation, but 
instead iterating over
 all endpoint states and checking if any endpoints are in 
STATUS_LEAVING/STATUS_MOVING/STATUS_BOOTSTRAPPING.

However, this didn't solve the problem if a node had reached the point of 
setting this status before failing its bootstrap.

In [CASSANDRA-8494], this deficiency was noticed in adding resumable 
bootstrapping and a line was added in [this 
commit|https://github.com/yukim/cassandra/commit/5f7fd497ae83f813078d56ba1b61f7ea322e5d5a]
 to ignore this gossip state for a fat client with the same broadcastAddress as 
the bootstrapping node. Since resumable bootstrapping went in to 2.2+ only, 
this explains why this test is failing only on 2.1 (since we aren't ignoring 
the fat client gossip entry for our previous failed bootstrap).

This failing test was added in [CASSANDRA-9765], which addressed deficiencies 
in {{checkForEndpointCollision}}. 

The consensus on 9765 was that bootstrapping is a safe state when checking for 
endpoint collisions (deferring to 7939).

I think the best fix here is to backport the bootstrapping broadcastAddress 
check from 2.2 - what do you think [~Stefania]? Do you recall seeing a 
different behavior for this test on 2.1?



> failed_bootstrap_wiped_node_can_join_test is failing
> 
>
> Key: CASSANDRA-10844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10844
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Streaming and Messaging, Testing
>Reporter: Philip Thompson
> Fix For: 2.1.x
>
> Attachments: node1.log, node2.log
>
>
> {{bootstrap_test.TestBootstrap.failed_bootstap_wiped_node_can_join_test}} is 
> failing on 2.1-head. The second node fails to join the cluster. I see a lot 
> of exceptions in node1's log, such as 
> {code}
> ERROR [STREAM-OUT-/127.0.0.2] 2015-12-11 12:06:13,778 StreamSession.java:505 
> - [Stream #7b5ec5a0-a029-11e5-bad9-ffd0922f40e6] Streaming error occurred
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_51]
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_51]
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) 
> ~[na:1.8.0_51]
> at 
> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>  [main/:na]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> Which seem consistent with node2 being killed, so the bootstrap fails. But 
> then when restarting node2, it does not join. It *looks* like it fails to 
> rejoin because of a false positive in checking the 2 minute rule.
> {code}
> ERROR [main] 2015-12-11 12:06:17,954 CassandraDaemon.java:579 - Except
> ion encountered during startup
> java.lang.UnsupportedOperationException: Other bootstrapping/leaving/m
> oving nodes detected, cannot bootstrap while cassandra.consistent.rang
> emovement is true
> at org.apache.cassandra.service.StorageService.checkForEndpoin
> tCollision(StorageService.java:559) ~[main/:na]
> at org.apache.cassandra.service.StorageService.prepareToJoin(S
> torageService.java:789) ~[main/:na]
> at org.apache.cassandra.service.StorageService.initServer(Stor
> ageService.java:721) ~[main/:na]
> at org.apache.cassandra.service.StorageService.initServer(Stor
> ageService.java:612) ~[main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:387) 
> [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562)
>  [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) 
> 

[jira] [Issue Comment Deleted] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2015-12-30 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9318:
--
Comment: was deleted

(was: HT Aleksey Scylla discussion on a similar topic
https://groups.google.com/forum/#!topic/scylladb-dev/7Ge6vHN52os)

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2015-12-30 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075476#comment-15075476
 ] 

Aleksey Yeschenko commented on CASSANDRA-9318:
--

Posting myself instead: a link to a very relevant discussion by Cloudius guys: 
https://groups.google.com/forum/#!topic/scylladb-dev/7Ge6vHN52os

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
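
For illustration only, the high/low watermark gating described above could be wired 
roughly as follows. All names and thresholds are hypothetical, not from any patch on 
this ticket, and the pause/resume transition is deliberately simplified.

{code}
// Hedged sketch of bounding in-flight request bytes with high/low watermarks.
import java.util.concurrent.atomic.AtomicLong;

final class InflightRequestGate
{
    private static final long HIGH_WATERMARK = 128L * 1024 * 1024; // stop reading above this
    private static final long LOW_WATERMARK  =  96L * 1024 * 1024; // resume reading below this

    private final AtomicLong inflightBytes = new AtomicLong();
    private volatile boolean paused;

    // Called when a request is admitted from a client connection.
    void onRequestStart(long requestBytes)
    {
        if (inflightBytes.addAndGet(requestBytes) > HIGH_WATERMARK && !paused)
        {
            paused = true;
            // here: disable read interest on client connections
        }
    }

    // Called when the coordinator finishes (or times out) a request.
    void onRequestEnd(long requestBytes)
    {
        if (inflightBytes.addAndGet(-requestBytes) < LOW_WATERMARK && paused)
        {
            paused = false;
            // here: re-enable read interest on client connections
        }
    }
}
{code}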



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10957) Verify disk is readable on FileNotFound Exceptions

2015-12-30 Thread T Jake Luciani (JIRA)
T Jake Luciani created CASSANDRA-10957:
--

 Summary: Verify disk is readable on FileNotFound Exceptions
 Key: CASSANDRA-10957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10957
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani


In JVMStabilityInspector we only mark ourselves unstable when we get some 
special messages in file not found exceptions.

{code}
// Check for file handle exhaustion
if (t instanceof FileNotFoundException || t instanceof SocketException)
    if (t.getMessage().contains("Too many open files"))
        isUnstable = true;
{code}


It seems like the OS might also have the same issue of too many open files but 
will instead return "No such file or directory".

It might make more sense when we check this exception type we try to read a 
known to exist file to verify if the disk is readable vs relying on the current 
check.

This would mean creating a hidden file on startup on each data disk? other 
ideas?
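
To make the idea concrete, a rough sketch of such a probe follows. The marker file 
name and the wiring into JVMStabilityInspector are assumptions, not a patch.

{code}
// Hedged sketch: write a marker file on each data directory at startup, then
// on a FileNotFoundException probe it to tell "disk went unreadable" apart
// from a genuinely missing file.
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

final class DiskReadabilityProbe
{
    private static final String MARKER = ".disk_readable_probe";

    // Called once per data directory during startup.
    static void createMarker(File dataDir) throws IOException
    {
        File marker = new File(dataDir, MARKER);
        if (!marker.exists() && !marker.createNewFile())
            throw new IOException("Could not create probe file in " + dataDir);
    }

    // Called from the FileNotFoundException handling path: if even the marker
    // cannot be opened and read, treat the disk (not the file) as the problem.
    static boolean diskIsReadable(File dataDir)
    {
        File marker = new File(dataDir, MARKER);
        try (FileInputStream in = new FileInputStream(marker))
        {
            in.read(); // any successful open + read means the disk responds
            return true;
        }
        catch (IOException e)
        {
            return false;
        }
    }
}
{code}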





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10957) Verify disk is readable on FileNotFound Exceptions

2015-12-30 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-10957:
---
Description: 
In JVMStabilityInspector we only mark ourselves unstable when we get some 
special messages in file not found exceptions.

{code}
// Check for file handle exhaustion
if (t instanceof FileNotFoundException || t instanceof SocketException)
    if (t.getMessage().contains("Too many open files"))
        isUnstable = true;
{code}


It seems like the OS might also have the same issue of too many open files but 
will instead return "No such file or directory".

It might make more sense, when we check this exception type, to try to read a 
known-to-exist file to verify the disk is readable.

This would mean creating a hidden file on startup on each data disk? other 
ideas?



  was:
In JVMStabilityInspector we only mark ourselves unstable when we get some 
special messages in file not found exceptions.

{code}
// Check for file handle exhaustion
if (t instanceof FileNotFoundException || t instanceof SocketException)
    if (t.getMessage().contains("Too many open files"))
        isUnstable = true;
{code}


It seems like the OS might also have the same issue of too many open files but 
will instead return "No such file or directory".

It might make more sense when we check this exception type we try to read a 
known to exist file to verify if the disk is readable vs relying on the current 
check.

This would mean creating a hidden file on startup on each data disk? other 
ideas?




> Verify disk is readable on FileNotFound Exceptions
> --
>
> Key: CASSANDRA-10957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10957
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>
> In JVMStabilityInspector we only mark ourselves unstable when we get some 
> special messages in file not found exceptions.
> {code}
> // Check for file handle exhaustion
> if (t instanceof FileNotFoundException || t instanceof SocketException)
>     if (t.getMessage().contains("Too many open files"))
>         isUnstable = true;
> {code}
> It seems like the OS might also have the same issue of too many open files 
> but will instead return "No such file or directory".
> It might make more sense, when we check this exception type, to try to read a 
> known-to-exist file to verify the disk is readable.
> This would mean creating a hidden file on startup on each data disk? other 
> ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2015-12-30 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075444#comment-15075444
 ] 

Ariel Weisberg commented on CASSANDRA-9318:
---

HT Aleksey Scylla discussion on a similar topic
https://groups.google.com/forum/#!topic/scylladb-dev/7Ge6vHN52os

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8708) inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited

2015-12-30 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-8708:

Attachment: 8708-2.1-jmx-nodetool-bulkloader-v3.txt

v3 for 2.1 has the typo fixed [~brandon.williams] :)
The 2.2 patch applies cleanly to 2.2, 3.0, and trunk.
Ariel said that we mostly take from the tested branches rather than patches, but 
I just wanted to include this and the 2.2 patch on the ticket.

> inter_dc_stream_throughput_outbound_megabits_per_sec to defaults to unlimited
> -
>
> Key: CASSANDRA-8708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8708
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Adam Hattrell
>Assignee: Jeremy Hanna
>  Labels: docs-impacting
> Fix For: 2.1.x, 2.2.x
>
> Attachments: 8708-2.1-jmx-nodetool-bulkloader-v2.txt, 
> 8708-2.1-jmx-nodetool-bulkloader-v3.txt, 
> 8708-2.1-jmx-nodetool-bulkloader.txt, 8708-2.1-with-jmx-nodetool.txt, 
> 8708-2.1.txt, 8708-2.2.txt
>
>
> inter_dc_stream_throughput_outbound_megabits_per_sec was introduced in 
> CASSANDRA-6596.
> There's some discussion in the ticket of the intention to link the default to 
> whatever stream_throughput_outbound_megabits_per_sec is set to.  
> However, it looks like it's just set to 0 - from 
> /src/java/org/apache/cassandra/config/Config.java
> This is a bit of a pain - usually folks want to set the inter dc limits lower 
> than the base streaming figure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-10844) failed_bootstrap_wiped_node_can_join_test is failing

2015-12-30 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton reassigned CASSANDRA-10844:
-

Assignee: Joel Knighton

> failed_bootstrap_wiped_node_can_join_test is failing
> 
>
> Key: CASSANDRA-10844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10844
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Streaming and Messaging, Testing
>Reporter: Philip Thompson
>Assignee: Joel Knighton
> Fix For: 2.1.x
>
> Attachments: node1.log, node2.log
>
>
> {{bootstrap_test.TestBootstrap.failed_bootstap_wiped_node_can_join_test}} is 
> failing on 2.1-head. The second node fails to join the cluster. I see a lot 
> of exceptions in node1's log, such as 
> {code}
> ERROR [STREAM-OUT-/127.0.0.2] 2015-12-11 12:06:13,778 StreamSession.java:505 
> - [Stream #7b5ec5a0-a029-11e5-bad9-ffd0922f40e6] Streaming error occurred
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_51]
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_51]
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) 
> ~[na:1.8.0_51]
> at 
> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>  ~[main/:na]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>  [main/:na]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>  [main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> Which seem consistent with node2 being killed, so the bootstrap fails. But 
> then when restarting node2, it does not join. It *looks* like it fails to 
> rejoin because of a false positive in checking the 2 minute rule.
> {code}
> ERROR [main] 2015-12-11 12:06:17,954 CassandraDaemon.java:579 - Except
> ion encountered during startup
> java.lang.UnsupportedOperationException: Other bootstrapping/leaving/m
> oving nodes detected, cannot bootstrap while cassandra.consistent.rang
> emovement is true
> at org.apache.cassandra.service.StorageService.checkForEndpoin
> tCollision(StorageService.java:559) ~[main/:na]
> at org.apache.cassandra.service.StorageService.prepareToJoin(S
> torageService.java:789) ~[main/:na]
> at org.apache.cassandra.service.StorageService.initServer(Stor
> ageService.java:721) ~[main/:na]
> at org.apache.cassandra.service.StorageService.initServer(Stor
> ageService.java:612) ~[main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:387) 
> [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562)
>  [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) 
> [main/:na]
> {code}
> This fails consistently locally and on cassci. Logs attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-10844) failed_bootstrap_wiped_node_can_join_test is failing

2015-12-30 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075465#comment-15075465
 ] 

Joel Knighton edited comment on CASSANDRA-10844 at 12/30/15 10:10 PM:
--

In [CASSANDRA-7069], we added consistent range movement to prevent concurrent 
bootstraps/decommissions.

We do this in the {{checkForEndpointCollision}} method.

In [CASSANDRA-7939], we observed that this prevented immediately retrying a 
failed bootstrap. In order to avoid this, we switched to checking if a node is 
a fat client and not checking {{isSafeForBootstrap}} in this situation, but 
instead iterating over
 all endpoint states and checking if any endpoints are in 
STATUS_LEAVING/STATUS_MOVING/STATUS_BOOTSTRAPPING.

However, this didn't solve the problem if a node had reached the point of 
setting this status before failing its bootstrap.

In [CASSANDRA-8838], this deficiency was noticed in adding resumable 
bootstrapping and a line was added in [this 
commit|https://github.com/yukim/cassandra/commit/5f7fd497ae83f813078d56ba1b61f7ea322e5d5a]
 to ignore this gossip state for a fat client with the same broadcastAddress as 
the bootstrapping node. Since resumable bootstrapping went in to 2.2+ only, 
this explains why this test is failing only on 2.1 (since we aren't ignoring 
the fat client gossip entry for our previous failed bootstrap).

This failing test was added in [CASSANDRA-9765], which addressed deficiencies 
in {{checkForEndpointCollision}}. 

The consensus on 9765 was that bootstrapping is a safe state when checking for 
endpoint collisions (deferring to 7939).

I think the best fix here is to backport the bootstrapping broadcastAddress 
check from 2.2 - what do you think [~Stefania]? Do you recall seeing a 
different behavior for this test on 2.1?




was (Author: jkni):
In [CASSANDRA-7069], we added consistent range movement to prevent concurrent 
bootstraps/decommissions.

We do this in the {{checkForEndpointCollision}} method.

In [CASSANDRA-7939], we observed that this prevented immediately retrying a 
failed bootstrap. In order to avoid this, we switched to checking if a node is 
a fat client and not checking {{isSafeForBootstrap}} in this situation, but 
instead iterating over
 all endpoint states and checking if any endpoints are in 
STATUS_LEAVING/STATUS_MOVING/STATUS_BOOTSTRAPPING.

However, this didn't solve the problem if a node had reached the point of 
setting this status before failing its bootstrap.

In [CASSANDRA-8494], this deficiency was noticed in adding resumable 
bootstrapping and a line was added in [this 
commit|https://github.com/yukim/cassandra/commit/5f7fd497ae83f813078d56ba1b61f7ea322e5d5a]
 to ignore this gossip state for a fat client with the same broadcastAddress as 
the bootstrapping node. Since resumable bootstrapping went in to 2.2+ only, 
this explains why this test is failing only on 2.1 (since we aren't ignoring 
the fat client gossip entry for our previous failed bootstrap).

This failing test was added in [CASSANDRA-9765], which addressed deficiencies 
in {{checkForEndpointCollision}}. 

The consensus on 9765 was that bootstrapping is a safe state when checking for 
endpoint collisions (deferring to 7939).

I think the best fix here is to backport the bootstrapping broadcastAddress 
check from 2.2 - what do you think [~Stefania]? Do you recall seeing a 
different behavior for this test on 2.1?



> failed_bootstrap_wiped_node_can_join_test is failing
> 
>
> Key: CASSANDRA-10844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10844
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Streaming and Messaging, Testing
>Reporter: Philip Thompson
> Fix For: 2.1.x
>
> Attachments: node1.log, node2.log
>
>
> {{bootstrap_test.TestBootstrap.failed_bootstap_wiped_node_can_join_test}} is 
> failing on 2.1-head. The second node fails to join the cluster. I see a lot 
> of exceptions in node1's log, such as 
> {code}
> ERROR [STREAM-OUT-/127.0.0.2] 2015-12-11 12:06:13,778 StreamSession.java:505 
> - [Stream #7b5ec5a0-a029-11e5-bad9-ffd0922f40e6] Streaming error occurred
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_51]
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) 
> ~[na:1.8.0_51]
> at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_51]
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) 
> ~[na:1.8.0_51]
> at 
> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>  ~[main/:na]
> at 
> 

[jira] [Comment Edited] (CASSANDRA-10937) OOM on multiple nodes on write load (v. 3.0.0), problem also present on DSE-4.8.3, but there it survives more time

2015-12-30 Thread Peter Kovgan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075732#comment-15075732
 ] 

Peter Kovgan edited comment on CASSANDRA-10937 at 12/31/15 5:48 AM:


DSE 4.8.3 failed with OOM after 48 hours of work.



was (Author: tierhetze):
DSE 4.8.3 failed with OOM after 48 hours of work.
Frustration is a weak word for what I feel

> OOM on multiple nodes on write load (v. 3.0.0), problem also present on 
> DSE-4.8.3, but there it survives more time
> --
>
> Key: CASSANDRA-10937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 
> (mockbu...@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red 
> Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests ( Linux OS as above) on 2 (VMWare managed) physical hosts. Each 
> physical host keeps 4 guests.
> Physical host parameters(shared by all 4 guests):
> Model: HP ProLiant DL380 Gen9
> Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
> 46 logical processors.
> Hyperthreading - enabled
> Each guest assigned to have:
> 1 disk 300 Gb for seq. log (NOT SSD)
> 1 disk 4T for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host -  24 Gb total.
> 8 (or 6, tested both) Gb - cassandra heap
> (lshw and cpuinfo attached in file test2.rar)
>Reporter: Peter Kovgan
>Priority: Critical
> Attachments: gc-stat.txt, more-logs.rar, some-heap-stats.rar, 
> test2.rar, test3.rar, test4.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients(different and not equal machines), each 
> running 1000 threads.
> Each thread assigned in round-robin way to run one of 4 different inserts. 
> Consistency->ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = 
> {'class':'NetworkTopologyStrategy','NY':2};
> Initiall throughput is:
> 215.000  inserts /sec
> or
> 54Mb/sec, considering single insert size a bit larger than 256byte.
> Data:
> all fields(5-6) are short strings, except one is BLOB of 256 bytes.
> After about a 2-3 hours of work, I was forced to increase timeout from 2000 
> to 5000ms, for some requests failed for short timeout.
> Later on(after aprox. 12 hous of work) OOM happens on multiple nodes.
> (all failed nodes logs attached)
> I attach also java load client and instructions how set-up and use 
> it.(test2.rar)
> Update:
> Later on test repeated with lesser load (10 mes/sec) with more relaxed 
> CPU (idle 25%), with only 2 test clients, but anyway test failed.
> Update:
> DSE-4.8.3 also failed on OOM (3 nodes from 8), but here it survived 48 hours, 
> not 10-12.
> If we do not find resolution in closest time we will consider some other DB.
> Attachments:
> test2.rar -contains most of material
> more-logs.rar - contains additional nodes logs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2015-12-30 Thread marcuse
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2a6aa8cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2a6aa8cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2a6aa8cf

Branch: refs/heads/trunk
Commit: 2a6aa8cfb71c89de43cc1ab69050f3244481dc10
Parents: 2601639 3d78939
Author: Marcus Eriksson 
Authored: Wed Dec 30 09:00:10 2015 +0100
Committer: Marcus Eriksson 
Committed: Wed Dec 30 09:00:10 2015 +0100

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/streaming/StreamReceiveTask.java | 6 --
 2 files changed, 5 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2a6aa8cf/CHANGES.txt
--
diff --cc CHANGES.txt
index 17829d7,dd517f5..5cd1b4c
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,30 -1,7 +1,31 @@@
 -3.0.3
 +3.2
 + * Add forceUserDefinedCleanup to allow more flexible cleanup 
(CASSANDRA-10708)
 + * (cqlsh) allow setting TTL with COPY (CASSANDRA-9494)
+  * Fix counting of received sstables in streaming (CASSANDRA-10949)
   * Implement hints compression (CASSANDRA-9428)
   * Fix potential assertion error when reading static columns (CASSANDRA-10903)
 + * Fix EstimatedHistogram creation in nodetool tablehistograms 
(CASSANDRA-10859)
 + * Establish bootstrap stream sessions sequentially (CASSANDRA-6992)
 + * Sort compactionhistory output by timestamp (CASSANDRA-10464)
 + * More efficient BTree removal (CASSANDRA-9991)
 + * Make tablehistograms accept the same syntax as tablestats (CASSANDRA-10149)
 + * Group pending compactions based on table (CASSANDRA-10718)
 + * Add compressor name in sstablemetadata output (CASSANDRA-9879)
 + * Fix type casting for counter columns (CASSANDRA-10824)
 + * Prevent running Cassandra as root (CASSANDRA-8142)
 + * bound maximum in-flight commit log replay mutation bytes to 64 megabytes 
(CASSANDRA-8639)
 + * Normalize all scripts (CASSANDRA-10679)
 + * Make compression ratio much more accurate (CASSANDRA-10225)
 + * Optimize building of Clustering object when only one is created 
(CASSANDRA-10409)
 + * Make index building pluggable (CASSANDRA-10681)
 + * Add sstable flush observer (CASSANDRA-10678)
 + * Improve NTS endpoints calculation (CASSANDRA-10200)
 + * Improve performance of the folderSize function (CASSANDRA-10677)
 + * Add support for type casting in selection clause (CASSANDRA-10310)
 + * Added graphing option to cassandra-stress (CASSANDRA-7918)
 + * Abort in-progress queries that time out (CASSANDRA-7392)
 + * Add transparent data encryption core classes (CASSANDRA-9945)
 +Merged from 3.0:
   * Avoid NoSuchElementException when executing empty batch (CASSANDRA-10711)
   * Avoid building PartitionUpdate in toString (CASSANDRA-10897)
   * Reduce heap spent when receiving many SSTables (CASSANDRA-10797)



[1/2] cassandra git commit: Correctly count received sstables during streaming

2015-12-30 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 260163994 -> 2a6aa8cfb


Correctly count received sstables during streaming

Patch by marcuse; reviewed by yukim for CASSANDRA-10949


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3d78939a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3d78939a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3d78939a

Branch: refs/heads/trunk
Commit: 3d78939a831eae5c5ae72bc977a9bb0107c24bab
Parents: 36608ce
Author: Marcus Eriksson 
Authored: Mon Dec 28 16:34:03 2015 +0100
Committer: Marcus Eriksson 
Committed: Wed Dec 30 08:57:13 2015 +0100

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/streaming/StreamReceiveTask.java | 6 --
 2 files changed, 5 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3d78939a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 837a592..dd517f5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.3
+ * Fix counting of received sstables in streaming (CASSANDRA-10949)
  * Implement hints compression (CASSANDRA-9428)
  * Fix potential assertion error when reading static columns (CASSANDRA-10903)
  * Avoid NoSuchElementException when executing empty batch (CASSANDRA-10711)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3d78939a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java 
b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
index 92a14d1..6280f3a 100644
--- a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
+++ b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
@@ -74,6 +74,8 @@ public class StreamReceiveTask extends StreamTask
 //  holds references to SSTables received
 protected Collection sstables;
 
+private int remoteSSTablesReceived = 0;
+
 public StreamReceiveTask(StreamSession session, UUID cfId, int totalFiles, 
long totalSize)
 {
 super(session, cfId);
@@ -94,14 +96,14 @@ public class StreamReceiveTask extends StreamTask
 {
 if (done)
 return;
-
+remoteSSTablesReceived++;
 assert cfId.equals(sstable.getCfId());
 
 Collection finished = sstable.finish(true);
 txn.update(finished, false);
 sstables.addAll(finished);
 
-if (sstables.size() == totalFiles)
+if (remoteSSTablesReceived == totalFiles)
 {
 done = true;
 executor.submit(new OnCompletionRunnable(this));



cassandra git commit: Correctly count received sstables during streaming

2015-12-30 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 36608cefa -> 3d78939a8


Correctly count received sstables during streaming

Patch by marcuse; reviewed by yukim for CASSANDRA-10949


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3d78939a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3d78939a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3d78939a

Branch: refs/heads/cassandra-3.0
Commit: 3d78939a831eae5c5ae72bc977a9bb0107c24bab
Parents: 36608ce
Author: Marcus Eriksson 
Authored: Mon Dec 28 16:34:03 2015 +0100
Committer: Marcus Eriksson 
Committed: Wed Dec 30 08:57:13 2015 +0100

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/streaming/StreamReceiveTask.java | 6 --
 2 files changed, 5 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3d78939a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 837a592..dd517f5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.3
+ * Fix counting of received sstables in streaming (CASSANDRA-10949)
  * Implement hints compression (CASSANDRA-9428)
  * Fix potential assertion error when reading static columns (CASSANDRA-10903)
  * Avoid NoSuchElementException when executing empty batch (CASSANDRA-10711)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3d78939a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java 
b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
index 92a14d1..6280f3a 100644
--- a/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
+++ b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java
@@ -74,6 +74,8 @@ public class StreamReceiveTask extends StreamTask
 //  holds references to SSTables received
 protected Collection sstables;
 
+private int remoteSSTablesReceived = 0;
+
 public StreamReceiveTask(StreamSession session, UUID cfId, int totalFiles, 
long totalSize)
 {
 super(session, cfId);
@@ -94,14 +96,14 @@ public class StreamReceiveTask extends StreamTask
 {
 if (done)
 return;
-
+remoteSSTablesReceived++;
 assert cfId.equals(sstable.getCfId());
 
 Collection finished = sstable.finish(true);
 txn.update(finished, false);
 sstables.addAll(finished);
 
-if (sstables.size() == totalFiles)
+if (remoteSSTablesReceived == totalFiles)
 {
 done = true;
 executor.submit(new OnCompletionRunnable(this));



[jira] [Commented] (CASSANDRA-9977) Support counter-columns for native aggregates (sum,avg,max,min)

2015-12-30 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074804#comment-15074804
 ] 

Benjamin Lerer commented on CASSANDRA-9977:
---

There seem to be no changes in your branch. Did I miss something?

> Support counter-columns for native aggregates (sum,avg,max,min)
> ---
>
> Key: CASSANDRA-9977
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9977
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Noam Liran
>Assignee: Robert Stupp
> Fix For: 2.2.5, 3.0.3
>
>
> When trying to SUM a column of type COUNTER, this error is returned:
> {noformat}
> InvalidRequest: code=2200 [Invalid query] message="Invalid call to function 
> sum, none of its type signatures match (known type signatures: system.sum : 
> (tinyint) -> tinyint, system.sum : (smallint) -> smallint, system.sum : (int) 
> -> int, system.sum : (bigint) -> bigint, system.sum : (float) -> float, 
> system.sum : (double) -> double, system.sum : (decimal) -> decimal, 
> system.sum : (varint) -> varint)"
> {noformat}
> This might be relevant for other agg. functions.
> CQL for reproduction:
> {noformat}
> CREATE TABLE test (
> key INT,
> ctr COUNTER,
> PRIMARY KEY (
> key
> )
> );
> UPDATE test SET ctr = ctr + 1 WHERE key = 1;
> SELECT SUM(ctr) FROM test;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Amit Singh Chowdhery (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029486#comment-15029486
 ] 

Amit Singh Chowdhery edited comment on CASSANDRA-10411 at 12/30/15 11:56 AM:
-

Hi Team,

I think this will take CQL a little closer to SQL and will allow a little more 
flexibility in the user experience.

So I will pick up this JIRA issue. I am thinking of the changes below (just a 
blueprint):

Step 1: Change the grammar in src\java\org\apache\cassandra\cql3\Cql.g ; 
altertablestatement will be changed for both ADD and DROP statements.

Step 2: After this, the corresponding Java files will be changed to support the 
grammar changes above.

Please provide comments/suggestions on the above.

Thanks
Amit Singh Chowdhery




was (Author: achowdhe):
Hi Team,

I think this will take CQL a little closer to SQL and will give users a little more flexibility.

So I will pick up this JIRA issue. I am thinking of the following changes (just a blueprint):

Step 1: Change the grammar in src\java\org\apache\cassandra\cql3\Cql.g; altertablestatement will be changed for both ADD and DROP statements. It might now look like:

For ADD -->

K_ADD   ( { isStatic=true; } K_STATIC)? { type = AlterTableStatement.Type.ADD; }
c1=cident { mColumnName.add(c1); }  v1=comparatorType { 
mValidator.add(v1); } 
   ( ',' cn=cident { mColumnName.add(cn); }  vn=comparatorType 
{ mValidator.add(vn); } )*

For DROP -->

K_DROP  id=cident  { mColumnName.add(id); } ( ',' cn=cident { 
mColumnName.add(cn); } )* { type = AlterTableStatement.Type.DROP; }.

Step 2: After this, the corresponding Java files will be changed to support the above grammar changes.

Requesting you all to provide comments/suggestions for the same.

Thanks
Amit Singh Chowdhery



> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Fix For: 2.0.17
>
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10805) Additional Compaction Logging

2015-12-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074953#comment-15074953
 ] 

Marcus Eriksson commented on CASSANDRA-10805:
-

* Can we use logback to do the logging to file? It should be possible to create a special logger that goes to a separate file (a rough sketch is below); it feels wrong to implement our own log rotation etc. for this.
* To enable/disable, can we just change the log level of that logger?
* It would be nice if the logging could be a bit more self-describing and human readable - JSON?
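
For illustration only, a minimal sketch of the dedicated-logger idea (the logger name and the event format are assumptions, not an agreed design): the compaction code would log through a named SLF4J logger, and logback.xml would attach a separate rolling-file appender to that logger, so file rotation and enable/disable (via the logger's level) come for free.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class CompactionEventLogger
{
    // A dedicated, named logger; logback.xml can route "compaction.history"
    // to its own rolling file, and setting its level to OFF/INFO disables or
    // enables this output at runtime.
    private static final Logger logger = LoggerFactory.getLogger("compaction.history");

    public void logFlush(String keyspace, String table, long bytesFlushed)
    {
        // Self-describing key=value payload (could just as well be JSON), one event per line.
        logger.info("event=flush keyspace={} table={} bytes={}", keyspace, table, bytesFlushed);
    }

    public void logCompaction(String keyspace, String table, int inputSSTables, long outputBytes)
    {
        logger.info("event=compaction keyspace={} table={} inputs={} outputBytes={}",
                    keyspace, table, inputSSTables, outputBytes);
    }
}
{code}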

> Additional Compaction Logging
> -
>
> Key: CASSANDRA-10805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10805
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Observability
>Reporter: Carl Yeksigian
>Assignee: Carl Yeksigian
>Priority: Minor
>
> Currently, viewing the results of past compactions requires parsing the log 
> and looking at the compaction history system table, which doesn't have 
> information about, for example, flushed sstables not previously compacted.
> This is a proposal to extend the information captured for compaction. 
> Initially, this would be done through a JMX call, but if it proves to be 
> useful and not much overhead, it might be a feature that could be enabled for 
> the compaction strategy all the time.
> Initial log information would include:
> - The compaction strategy type controlling each column family
> - The set of sstables included in each compaction strategy
> - Information about flushes and compactions, including times and all involved 
> sstables
> - Information about sstables, including generation, size, and tokens
> - Any additional metadata the strategy wishes to add to a compaction or an 
> sstable, like the level of an sstable or the type of compaction being 
> performed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Amit Singh Chowdhery (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Singh Chowdhery updated CASSANDRA-10411:
-
Attachment: cassandra-10411.diff

> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Fix For: 2.0.17
>
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-10938:
-
Attachment: 6452.png

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.png, node1_debug.log, node2_debug.log, 
> node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10938) test_bulk_round_trip_blogposts is failing occasionally

2015-12-30 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074920#comment-15074920
 ] 

Stefania commented on CASSANDRA-10938:
--

The CPU hotspot is caused by LOGGED batching, specifically by 
{{StorageProxy.getBatchlogEndpoints()}}, see screenshot attached: _6452.png_.

LOGGED batching was introduced after reintroducing batch-by-replica. After 
switching back to UNLOGGED batching, and keeping batch-by-replica, I've run the 
same test 11 times (once after rebooting the machine) and I could not reproduce 
the issue any longer. 

Let's keep monitoring it on Jenkins, though.
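
For illustration of the difference (a sketch with the DataStax Java driver, whereas the actual COPY FROM code in cqlsh is Python; keyspace {{ks}} and table {{t}} are made up): an UNLOGGED batch skips the batch log entirely, so no batchlog endpoints have to be computed per batch.

{code:java}
import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class BatchTypeSketch
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("ks"))
        {
            // UNLOGGED: no batch log write, so the coordinator never has to
            // pick batchlog endpoints for these mutations.
            BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
            batch.add(new SimpleStatement("INSERT INTO t (id, v) VALUES (?, ?)", 1, "a"));
            batch.add(new SimpleStatement("INSERT INTO t (id, v) VALUES (?, ?)", 2, "b"));
            session.execute(batch);
        }
    }
}
{code}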

> test_bulk_round_trip_blogposts is failing occasionally
> --
>
> Key: CASSANDRA-10938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10938
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: 6452.png, node1_debug.log, node2_debug.log, 
> node3_debug.log
>
>
> We get timeouts occasionally that cause the number of records to be incorrect:
> http://cassci.datastax.com/job/trunk_dtest/858/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10950) Fix HintsCatalogTest

2015-12-30 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-10950:
-
Reviewer: Robert Stupp

> Fix HintsCatalogTest
> 
>
> Key: CASSANDRA-10950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10950
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Yuki Morishita
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> [HintsCatalogTest has been 
> broken|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_testall/331/testReport/]
>  since CASSANDRA-9428 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10950) Fix HintsCatalogTest

2015-12-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074769#comment-15074769
 ] 

Robert Stupp commented on CASSANDRA-10950:
--

+1

> Fix HintsCatalogTest
> 
>
> Key: CASSANDRA-10950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10950
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Yuki Morishita
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> [HintsCatalogTest has been 
> broken|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_testall/331/testReport/]
>  since CASSANDRA-9428 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10931) CassandraVersion complains about 3.x version strings

2015-12-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074775#comment-15074775
 ] 

Robert Stupp commented on CASSANDRA-10931:
--

+1 - can you remove the unused constructor in {{CassandraVersion}} on commit?

> CassandraVersion complains about 3.x version strings
> 
>
> Key: CASSANDRA-10931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10931
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Robert Stupp
>Assignee: Yuki Morishita
> Fix For: 3.x
>
>
> The utest {{SystemKeyspaceTest.snapshotSystemKeyspaceIfUpgrading}} fails with 
> {{java.lang.IllegalArgumentException: Invalid version value: 3.2}} (e.g. 
> [here|http://cassci.datastax.com/view/trunk/job/trunk_testall/629/testReport/org.apache.cassandra.db/SystemKeyspaceTest/snapshotSystemKeyspaceIfUpgrading/]).
> The question is just:
> # change the regex pattern in {{CassandraVersion}} and silently assume {{.0}} 
> as the patch version or
> # go with {{x.y.0}} version strings instead of {{x.y}}
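
For illustration of option 1, a minimal sketch of a version pattern with an optional patch component (this is not the actual {{CassandraVersion}} code, just an example of the idea):

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionParseSketch
{
    // The patch component is optional: "3.2" is treated as 3.2.0, "3.0.3" keeps its patch.
    private static final Pattern VERSION = Pattern.compile("(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    public static int[] parse(String version)
    {
        Matcher m = VERSION.matcher(version);
        if (!m.matches())
            throw new IllegalArgumentException("Invalid version value: " + version);
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        return new int[]{ Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)), patch };
    }

    public static void main(String[] args)
    {
        System.out.println(parse("3.2")[2]);   // 0 - patch version silently assumed
        System.out.println(parse("3.0.3")[2]); // 3
    }
}
{code}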



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10931) CassandraVersion complains about 3.x version strings

2015-12-30 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-10931:
-
Reviewer: Robert Stupp

> CassandraVersion complains about 3.x version strings
> 
>
> Key: CASSANDRA-10931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10931
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Robert Stupp
>Assignee: Yuki Morishita
> Fix For: 3.x
>
>
> The utest {{SystemKeyspaceTest.snapshotSystemKeyspaceIfUpgrading}} fails with 
> {{java.lang.IllegalArgumentException: Invalid version value: 3.2}} (e.g. 
> [here|http://cassci.datastax.com/view/trunk/job/trunk_testall/629/testReport/org.apache.cassandra.db/SystemKeyspaceTest/snapshotSystemKeyspaceIfUpgrading/]).
> The question is just:
> # change the regex pattern in {{CassandraVersion}} and silently assume {{.0}} 
> as the patch version or
> # go with {{x.y.0}} version strings instead of {{x.y}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-30 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074033#comment-15074033
 ] 

Stefania edited comment on CASSANDRA-9303 at 12/30/15 10:33 AM:


The hanging tests were caused by the way in which we run cqlsh in ccm; this 
[pull request|https://github.com/pcmanus/ccm/pull/432] fixed it.

The remaining failures were caused by the following two things:

* Handling of temporary files is quite different on Windows, the changes to 
address this are only in the dtest code, see the second commit of the [pull 
request|https://github.com/riptano/cassandra-dtest/pull/724]. 

* The path names should have been normalized, see [this 
commit|https://github.com/stef1927/cassandra/commit/295219dfbcf24ece9729030cce6e9638899b2842].

I've also changed a few more things, mostly discovered whilst trying to 
reproduce CASSANDRA-10938 on Windows:

* Reverted to batching by replica to avoid Cassandra processes using too much CPU. Batching by replica was changed to batching by partition key during the code review of CASSANDRA-9302 because there is a cost in determining the replicas of each record. However, sending batches with records on different replicas is probably worse than spending a few cycles in Python determining the correct replicas. It also allows us to use LOGGED batching, see the next point (a rough sketch of the replica-grouping idea is included at the end of this comment).

* EDIT: -Changed batch type from UNLOGGED to LOGGED to avoid a WARN in the Cassandra log files and for more consistent failed batch status reporting (even though INSERT should be idempotent, so this can be changed back to UNLOGGED if performance is impacted too much, but it shouldn't be since all partitions should be local).- This turned out to be too costly for performance, see the discussion on CASSANDRA-10938.

* Fixed a problem with cassandra-stress that only manifested on Windows and on 
trunk when using a custom profile. However the Windows stress launch scripts 
were incorrect from 2.1 onwards.

* EDIT: Added checks on invalid {{boolstyle}} options, as requested by dtest 
code review.

I worked on the 2.2 patch and merged upwards. I also cherry-picked back to 2.1 
with manual conflict resolution in bin/cqlsh. Even though we don't support 
Windows for 2.1 I figured it was best to fix these problems anyway.
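
As a rough sketch of the batch-by-replica idea (the real COPY FROM code is Python inside cqlsh; this uses the DataStax Java driver only to illustrate the grouping, and the class and method names are made up):

{code:java}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.Statement;

public class BatchByReplicaSketch
{
    /**
     * Groups statements into one UNLOGGED batch per replica, so a batch never
     * mixes rows owned by different replica sets. Callers supply the serialized
     * partition key for each statement.
     */
    public static Map<Host, BatchStatement> groupByReplica(Metadata metadata,
                                                           String keyspace,
                                                           Map<Statement, ByteBuffer> partitionKeys)
    {
        Map<Host, BatchStatement> batches = new HashMap<>();
        for (Map.Entry<Statement, ByteBuffer> entry : partitionKeys.entrySet())
        {
            Set<Host> replicas = metadata.getReplicas(keyspace, entry.getValue());
            if (replicas.isEmpty())
                continue; // e.g. metadata not available yet; real code would fall back
            // Any replica works as the grouping key; rows sharing a replica set
            // end up in the same batch.
            Host groupKey = replicas.iterator().next();
            batches.computeIfAbsent(groupKey, h -> new BatchStatement(BatchStatement.Type.UNLOGGED))
                   .add(entry.getKey());
        }
        return batches;
    }
}
{code}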


was (Author: stefania):
The hanging tests were caused by the way in which we run cqlsh in ccm, this 
[pull request|https://github.com/pcmanus/ccm/pull/432] fixed it.

The remaining failures were caused by the following two things:

* Handling of temporary files is quite different on Windows, the changes to 
address this are only in the dtest code, see the second commit of the [pull 
request|https://github.com/riptano/cassandra-dtest/pull/724]. 

* The path names should have been normalized, see [this 
commit|https://github.com/stef1927/cassandra/commit/295219dfbcf24ece9729030cce6e9638899b2842].

I've also changed a few more things, mostly discovered whilst trying to 
reproduce CASSANDRA-10938 on Windows:

* Reverted to batching by replica to avoid Cassandra processes using too much CPU. Batching by replica was changed to batching by partition key during the code review of CASSANDRA-9302 because there is a cost in determining the replicas of each record. However, sending batches with records on different replicas is probably worse than spending a few cycles in Python determining the correct replicas. It also allows us to use LOGGED batching, see the next point.

* Changed batch type from UNLOGGED to LOGGED to avoid a WARN in the Cassandra log files and for more consistent failed batch status reporting (even though INSERT should be idempotent, so this can be changed back to UNLOGGED if performance is impacted too much, but it shouldn't be since all partitions should be local).

* Fixed a problem with cassandra-stress that only manifested on Windows and on 
trunk when using a custom profile. However the Windows stress launch scripts 
were incorrect from 2.1 onwards.

I worked on the 2.2 patch and merged upwards. I also cherry-picked back to 2.1 
with manual conflict resolution in bin/cqlsh. Even though we don't support 
Windows for 2.1 I figured it was best to fix these problems anyway.

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: dtest.out
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-10954) [Regression] Error when removing list element with UPDATE statement

2015-12-30 Thread DOAN DuyHai (JIRA)
DOAN DuyHai created CASSANDRA-10954:
---

 Summary: [Regression] Error when removing list element with UPDATE 
statement
 Key: CASSANDRA-10954
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10954
 Project: Cassandra
  Issue Type: Bug
  Components: Local Write-Read Paths
 Environment: Cassandra 3.0.0, Cassandra 3.1.1
Reporter: DOAN DuyHai


Steps to reproduce:

{code:sql}
CREATE TABLE simple(
  id int PRIMARY KEY,
  int_list list<int>
);

INSERT INTO simple(id, int_list) VALUES(10, [1,2,3]);
SELECT * FROM simple;

 id | int_list
----+-----------
 10 | [1, 2, 3]

UPDATE simple SET int_list[0]=null WHERE id=10;
ServerError: 
{code}

 Per CQL semantics, setting a column to NULL == deleting it.

 When using debugger, below is the Java stack trace on server side:

{noformat}
 ERROR o.apache.cassandra.transport.Message - Unexpected exception during 
request; channel = [id: 0x6dbc33bd, /192.168.51.1:57723 => /192.168.51.1:9473]
java.lang.AssertionError: null
at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:49) 
~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.db.rows.BufferCell.tombstone(BufferCell.java:88) 
~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.UpdateParameters.addTombstone(UpdateParameters.java:141)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.UpdateParameters.addTombstone(UpdateParameters.java:136)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.Lists$SetterByIndex.execute(Lists.java:362) 
~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:94)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.addUpdates(ModificationStatement.java:666)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:606)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:413)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:401)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:472)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:449)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130)
 ~[cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [cassandra-all-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [cassandra-all-3.1.1.jar:3.1.1]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_60-ea]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [cassandra-all-3.1.1.jar:3.1.1]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[cassandra-all-3.1.1.jar:3.1.1]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60-ea]
{noformat}

The root cause seems to be located at *org.apache.cassandra.cql3.Lists:362* :

{code:java}
CellPath elementPath = 
existingRow.getComplexColumnData(column).getCellByIndex(idx).path();
if (value == null)
{
params.addTombstone(column);
}
else if (value != ByteBufferUtil.UNSET_BYTE_BUFFER)
{
params.addCell(column, elementPath, value);
}
{code}

 In the if block, it seems we do not pass the CellPath as we should, which makes the assertion _assert column.isComplex() == (path != null);_ fail at *org.apache.cassandra.db.rows.BufferCell:49*:

{code:java}
public BufferCell(ColumnDefinition column, long 

[jira] [Comment Edited] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Amit Singh Chowdhery (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074966#comment-15074966
 ] 

Amit Singh Chowdhery edited comment on CASSANDRA-10411 at 12/30/15 1:22 PM:


In this patch, addition and deletion of multiple columns are supported. The new syntax for addition will be:

Alter table tablename add (col1 datatype, col2 datatype, ..., coln datatype);

The new syntax for deletion will be:

Alter table tablename drop (col1, col2, col3, ..., coln);

Single-column add and drop keep working the way they did before:

Alter table tablename drop col1;
Alter table tablename add col1 datatype;

The currently attached patch is for branch 2.0.x. In case you find any issues auto-merging the change onto 2.1, 2.2 or 3, please let me know and I will submit separate patches for those.


was (Author: achowdhe):
In this patch, addition and deletion of multiple columns are supported. The new syntax for addition will be:

Alter table tablename add (col1 datatype, col2 datatype, ..., coln datatype);

The new syntax for deletion will be:

Alter table tablename drop (col1, col2, col3, ..., coln);

Single-column add and drop keep working the way they did before:

Alter table tablename drop col1;
Alter table tablename add col1 datatype;

The currently attached patch is for branch 2.0.x. If support for higher branches is needed, please let me know and I will upload those patches as well.

> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Fix For: 2.0.17
>
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Amit Singh Chowdhery (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Singh Chowdhery updated CASSANDRA-10411:
-
Attachment: cassandra-10411.diff

> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Fix For: 2.0.17
>
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Amit Singh Chowdhery (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Singh Chowdhery updated CASSANDRA-10411:
-
Attachment: (was: cassandra-10411.diff)

> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Fix For: 2.0.17
>
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10909) NPE in ActiveRepairService

2015-12-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075028#comment-15075028
 ] 

Marcus Eriksson commented on CASSANDRA-10909:
-

How did you trigger this?

I cannot reproduce it locally.

> NPE in ActiveRepairService 
> ---
>
> Key: CASSANDRA-10909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10909
> Project: Cassandra
>  Issue Type: Bug
> Environment: cassandra-3.0.1.777
>Reporter: Eduard Tudenhoefner
>Assignee: Marcus Eriksson
>
> NPE after one started multiple incremental repairs
> {code}
> INFO  [Thread-62] 2015-12-21 11:40:53,742  RepairRunnable.java:125 - Starting 
> repair command #1, repairing keyspace keyspace1 with repair options 
> (parallelism: parallel, primary range: false, incremental: true, job threads: 
> 1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 2)
> INFO  [Thread-62] 2015-12-21 11:40:53,813  RepairSession.java:237 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] new session: will sync /10.200.177.32, 
> /10.200.177.33 on range [(10,-9223372036854775808]] for keyspace1.[counter1, 
> standard1]
> INFO  [Repair#1:1] 2015-12-21 11:40:53,853  RepairJob.java:100 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] requesting merkle trees for counter1 
> (to [/10.200.177.33, /10.200.177.32])
> INFO  [Repair#1:1] 2015-12-21 11:40:53,853  RepairJob.java:174 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] Requesting merkle trees for counter1 
> (to [/10.200.177.33, /10.200.177.32])
> INFO  [Thread-62] 2015-12-21 11:40:53,854  RepairSession.java:237 - [repair 
> #b1449fe0-a7d7-11e5-b568-f565b837eb0d] new session: will sync /10.200.177.32, 
> /10.200.177.31 on range [(0,10]] for keyspace1.[counter1, standard1]
> INFO  [AntiEntropyStage:1] 2015-12-21 11:40:53,896  RepairSession.java:181 - 
> [repair #b13e3740-a7d7-11e5-b568-f565b837eb0d] Received merkle tree for 
> counter1 from /10.200.177.32
> INFO  [AntiEntropyStage:1] 2015-12-21 11:40:53,906  RepairSession.java:181 - 
> [repair #b13e3740-a7d7-11e5-b568-f565b837eb0d] Received merkle tree for 
> counter1 from /10.200.177.33
> INFO  [Repair#1:1] 2015-12-21 11:40:53,906  RepairJob.java:100 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] requesting merkle trees for standard1 
> (to [/10.200.177.33, /10.200.177.32])
> INFO  [Repair#1:1] 2015-12-21 11:40:53,906  RepairJob.java:174 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] Requesting merkle trees for standard1 
> (to [/10.200.177.33, /10.200.177.32])
> INFO  [RepairJobTask:2] 2015-12-21 11:40:53,910  SyncTask.java:66 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] Endpoints /10.200.177.33 and 
> /10.200.177.32 are consistent for counter1
> INFO  [RepairJobTask:1] 2015-12-21 11:40:53,910  RepairJob.java:145 - [repair 
> #b13e3740-a7d7-11e5-b568-f565b837eb0d] counter1 is fully synced
> INFO  [AntiEntropyStage:1] 2015-12-21 11:40:54,823  Validator.java:272 - 
> [repair #b17a2ed0-a7d7-11e5-ada8-8304f5629908] Sending completed merkle tree 
> to /10.200.177.33 for keyspace1.counter1
> ERROR [ValidationExecutor:3] 2015-12-21 11:40:55,104  
> CompactionManager.java:1065 - Cannot start multiple repair sessions over the 
> same sstables
> ERROR [ValidationExecutor:3] 2015-12-21 11:40:55,105  Validator.java:259 - 
> Failed creating a merkle tree for [repair 
> #b17a2ed0-a7d7-11e5-ada8-8304f5629908 on keyspace1/standard1, 
> [(10,-9223372036854775808]]], /10.200.177.33 (see log for details)
> ERROR [ValidationExecutor:3] 2015-12-21 11:40:55,110  
> CassandraDaemon.java:195 - Exception in thread 
> Thread[ValidationExecutor:3,1,main]
> java.lang.RuntimeException: Cannot start multiple repair sessions over the 
> same sstables
>   at 
> org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1066)
>  ~[cassandra-all-3.0.1.777.jar:3.0.1.777]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:80)
>  ~[cassandra-all-3.0.1.777.jar:3.0.1.777]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:679)
>  ~[cassandra-all-3.0.1.777.jar:3.0.1.777]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_40]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_40]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_40]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
> ERROR [AntiEntropyStage:1] 2015-12-21 11:40:55,174  
> RepairMessageVerbHandler.java:161 - Got error, removing parent repair session
> INFO  [CompactionExecutor:3] 2015-12-21 11:40:55,175  
> CompactionManager.java:489 - Starting anticompaction for keyspace1.counter1 
> on 0/[] sstables
> INFO  

[jira] [Commented] (CASSANDRA-10411) Add/drop multiple columns in one ALTER TABLE statement

2015-12-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075099#comment-15075099
 ] 

Robert Stupp commented on CASSANDRA-10411:
--

[~achowdhe], new features need to go into trunk. Mind providing a patch against trunk?
Version 2.0 has been EOL for a while now and won't even get bug fixes. 2.1 is close to EOL and will only get critical bug fixes. 2.2 and 3.0 are "feature complete" and only get bug fixes. So, trunk is the target branch. It is easiest to fork the C* repo on GitHub and just provide a link to a "feature branch".

From a brief look at your patch, it seems to contain a lot of changes that are just code-style changes, which makes it hard to see the actual differences (see https://wiki.apache.org/cassandra/HowToContribute for code style and IDE integrations). Please also avoid massive changes to the imports section. You could use a separate unit test and let that class extend {{CQLTester}} (a rough sketch follows below). The patch should also take care when the same column name is used multiple times.

Generally, we are moving towards atomic schema changes. So, it would also be ok 
to go one step further and to allow drops, alters and adds of columns in the 
same statement like {{ALTER TABLE foo DROP (old_column_one, another_column) ADD 
(new_column_one bigint, next_column text);}}. Similarly for {{ALTER TYPE}}. 
Both should take care when touching the same column name in the same statement 
(drop + add). Mind tackling this, too?

(Setting status to "in progress"; please set it to "patch available" once a new version of the patch is available.)
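
A rough sketch of what such a {{CQLTester}}-based unit test could look like, assuming the proposed multi-column syntax (the class name, columns and assertions are illustrative only, not part of the attached patch):

{code:java}
import org.junit.Test;

import org.apache.cassandra.cql3.CQLTester;

public class AlterTableMultiColumnTest extends CQLTester
{
    @Test
    public void testAddAndDropMultipleColumns() throws Throwable
    {
        createTable("CREATE TABLE %s (id int PRIMARY KEY, v text)");

        // Proposed syntax: add several columns in one statement.
        execute("ALTER TABLE %s ADD (c1 int, c2 text, c3 bigint)");
        execute("INSERT INTO %s (id, v, c1, c2, c3) VALUES (1, 'a', 2, 'b', 3)");
        assertRows(execute("SELECT c1, c2, c3 FROM %s WHERE id = 1"), row(2, "b", 3L));

        // Proposed syntax: drop several columns in one statement.
        execute("ALTER TABLE %s DROP (c2, c3)");

        // Repeating the same column name in one statement should be rejected.
        assertInvalid("ALTER TABLE %s ADD (dup int, dup text)");
    }
}
{code}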

> Add/drop multiple columns in one ALTER TABLE statement
> --
>
> Key: CASSANDRA-10411
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10411
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Bryn Cooke
>Assignee: Amit Singh Chowdhery
>Priority: Minor
>  Labels: patch
> Attachments: cassandra-10411.diff
>
>
> Currently it is only possible to add one column at a time in an alter table 
> statement. It would be great if we could add multiple columns at a time.
> The primary reason for this is that adding each column individually seems to 
> take a significant amount of time (at least on my development machine), I 
> know all the columns I want to add, but don't know them until after the 
> initial table is created.
> As a secondary consideration it brings CQL slightly closer to SQL where most 
> databases can handle adding multiple columns in one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9977) Support counter-columns for native aggregates (sum,avg,max,min)

2015-12-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075100#comment-15075100
 ] 

Robert Stupp commented on CASSANDRA-9977:
-

Yea - I already deleted all my branches for 9977 - so the branch is a 
re-implementation (that's why I'm asking for another review). Sorry about that.

> Support counter-columns for native aggregates (sum,avg,max,min)
> ---
>
> Key: CASSANDRA-9977
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9977
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Noam Liran
>Assignee: Robert Stupp
> Fix For: 2.2.5, 3.0.3
>
>
> When trying to SUM a column of type COUNTER, this error is returned:
> {noformat}
> InvalidRequest: code=2200 [Invalid query] message="Invalid call to function 
> sum, none of its type signatures match (known type signatures: system.sum : 
> (tinyint) -> tinyint, system.sum : (smallint) -> smallint, system.sum : (int) 
> -> int, system.sum : (bigint) -> bigint, system.sum : (float) -> float, 
> system.sum : (double) -> double, system.sum : (decimal) -> decimal, 
> system.sum : (varint) -> varint)"
> {noformat}
> This might be relevant for other agg. functions.
> CQL for reproduction:
> {noformat}
> CREATE TABLE test (
> key INT,
> ctr COUNTER,
> PRIMARY KEY (
> key
> )
> );
> UPDATE test SET ctr = ctr + 1 WHERE key = 1;
> SELECT SUM(ctr) FROM test;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9309) Wrong interpretation of Config.getOutboundBindAny depending on using SSL or not

2015-12-30 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075103#comment-15075103
 ] 

Robert Stupp commented on CASSANDRA-9309:
-

Ping - +1'd

> Wrong interpretation of Config.getOutboundBindAny depending on using SSL or 
> not
> ---
>
> Key: CASSANDRA-9309
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9309
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Casey Marshall
>Assignee: Yuki Morishita
> Fix For: 2.1.x, 2.2.x, 3.x
>
>
> In function OutboundTcpConnectionPool.newSocket(), it appears the binding 
> behavior of client sockets is different depending on the encryption setting.
> If encryption is enabled, and Config.getOutboundBindAny() is true, then no 
> address is passed to SSLFactory.getSocket (so I assume it binds to any 
> address).
> If encryption is enabled, and Config.getOutboundBindAny() is false, then 
> FBUtilities.getLocalAddress() is passed to SSLFactory.getSocket (so I assume 
> the new socket will be bound to that address).
> If encryption is disabled, and Config.getOutboundBindAny() is true (and 
> socket.isBound() returns false) then the socket is bound to 
> FBUtilities.getLocalAddress().
> If encryption is disabled, and Config.getOutboundBindAny() is false, the 
> socket is not bound.
> The case of encryption disabled appears to be wrong, and the 
> Config.getOutboundBindAny() flag gets inverted depending on the encryption 
> setting. Shouldn't
> {code}
> if (Config.getOutboundBindAny() && !socket.isBound())
> {code}
> be this:
> {code}
> if (!Config.getOutboundBindAny() && !socket.isBound())
> {code}
> This is in my copy of the 2.0.11 tag, and appears to be the same in trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)