[jira] [Updated] (CASSANDRA-7396) Allow selecting Map key, List index

2016-06-16 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated CASSANDRA-7396:
-
Status: Ready to Commit  (was: Patch Available)

> Allow selecting Map key, List index
> ---
>
> Key: CASSANDRA-7396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7396
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>  Labels: cql, docs-impacting
> Fix For: 3.x
>
> Attachments: 7396_unit_tests.txt
>
>
> Allow "SELECT map['key]" and "SELECT list[index]."  (Selecting a UDT subfield 
> is already supported.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11031) MultiTenant : support "ALLOW FILTERING" for Partition Key

2016-06-16 Thread ZhaoYang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-11031:
-
Summary: MultiTenant : support "ALLOW FILTERING" for Partition Key  (was: 
MultiTenant : support "ALLOW FILTERING" for First Partition Key)

> MultiTenant : support "ALLOW FILTERING" for Partition Key
> -
>
> Key: CASSANDRA-11031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11031
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Minor
> Fix For: 3.x
>
> Attachments: CASSANDRA-11031-3.7.patch
>
>
> Currently, ALLOW FILTERING only works for secondary index columns or 
> clustering columns, and it is slow because Cassandra reads all data from 
> SSTables on disk into memory to filter.
> But we can support ALLOW FILTERING on the partition key: as far as I know, 
> partition keys are kept in memory, so we can filter them cheaply and then 
> read only the required data from SSTables.
> This would be similar to "SELECT * FROM table", which scans the entire cluster.
> CREATE TABLE multi_tenant_table (
>   tenant_id text,
>   pk2 text,
>   c1 text,
>   c2 text,
>   v1 text,
>   v2 text,
>   PRIMARY KEY ((tenant_id,pk2),c1,c2)
> ) ;
> SELECT * FROM multi_tenant_table WHERE tenant_id = 'datastax' ALLOW FILTERING;
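
A toy model of the idea (plain Java, made-up data, not Cassandra code): filter 
the in-memory partition keys first, then read only the matching partitions from 
sstables.

{code}
// Sketch of the proposal: partition keys are resident in memory, so they can
// be filtered cheaply before any sstable data is read. Data is made up.
import java.util.*;
import java.util.stream.*;

public class PartitionKeyFilterDemo
{
    public static void main(String[] args)
    {
        // in-memory view of (tenant_id, pk2) partition keys, as in the schema above
        List<String[]> partitionKeys = Arrays.asList(
            new String[]{"datastax", "a"},
            new String[]{"apple", "b"},
            new String[]{"datastax", "c"});

        List<String[]> matching = partitionKeys.stream()
            .filter(key -> key[0].equals("datastax")) // cheap in-memory filter
            .collect(Collectors.toList());

        for (String[] key : matching)
            System.out.println("read partition (" + key[0] + ", " + key[1] + ") from sstables");
    }
}
{code}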



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11537) Give clear error when certain nodetool commands are issued before server is ready

2016-06-16 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335269#comment-15335269
 ] 

Edward Capriolo commented on CASSANDRA-11537:
-

Fixed Java test issues: 
https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:CASSANDRA-11537-2?expand=1

> Give clear error when certain nodetool commands are issued before server is 
> ready
> -
>
> Key: CASSANDRA-11537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11537
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
>Priority: Minor
>  Labels: lhf
>
> As an ops person upgrading and servicing Cassandra servers, I require a 
> clearer message when I issue a nodetool command that the server is not ready 
> for, so that I am not confused.
> Technical description:
> If you deploy a new binary, restart, and issue nodetool 
> scrub/compact/upgradesstables etc., you get an unfriendly assertion error. An 
> exception would be easier to understand. Also, if a user has turned 
> assertions off, it is unclear what might happen. 
> {noformat}
> EC1: Throw exception to make it clear server is still in start up process. 
> :~# nodetool upgradesstables
> error: null
> -- StackTrace --
> java.lang.AssertionError
> at org.apache.cassandra.db.Keyspace.open(Keyspace.java:97)
> at 
> org.apache.cassandra.service.StorageService.getValidKeyspace(StorageService.java:2573)
> at 
> org.apache.cassandra.service.StorageService.getValidColumnFamilies(StorageService.java:2661)
> at 
> org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2421)
> {noformat}
> EC1: 
> Patch against 2.1 (branch)
> https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:exception-on-startup?expand=1
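
To make the intent concrete, here is a minimal, self-contained sketch of the 
idea (not the actual patch; the daemonInitialized flag below is a stand-in for 
the real server startup state):

{code}
// Minimal sketch: fail fast with a clear, operator-friendly message instead
// of a bare AssertionError when the server has not finished starting.
public class StartupGuardDemo
{
    private static volatile boolean daemonInitialized = false; // stand-in for real state

    static void checkServerReady()
    {
        if (!daemonInitialized)
            throw new IllegalStateException("Server is still starting up; retry the nodetool command later");
    }

    public static void main(String[] args)
    {
        try
        {
            checkServerReady(); // simulates issuing nodetool before startup completes
        }
        catch (IllegalStateException e)
        {
            System.out.println("error: " + e.getMessage()); // clear message instead of "error: null"
        }
    }
}
{code}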



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: use long math, for long results

2016-06-16 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk 27395e78b -> 057c32997


use long math, for long results


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/057c3299
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/057c3299
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/057c3299

Branch: refs/heads/trunk
Commit: 057c32997442b5df8842fe46aa2ebe9b178d8647
Parents: 27395e7
Author: Dave Brosius 
Authored: Thu Jun 16 22:32:00 2016 -0400
Committer: Dave Brosius 
Committed: Thu Jun 16 22:32:00 2016 -0400

--
 .../cassandra/db/compaction/TimeWindowCompactionStrategy.java  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/057c3299/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index df688c5..70f29e9 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -189,16 +189,16 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
         switch(windowTimeUnit)
         {
             case MINUTES:
-                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (60 * windowTimeSize));
+                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (60L * windowTimeSize));
                 upperTimestamp = (lowerTimestamp + (60L * (windowTimeSize - 1L))) + 59L;
                 break;
             case HOURS:
-                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (3600 * windowTimeSize));
+                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (3600L * windowTimeSize));
                 upperTimestamp = (lowerTimestamp + (3600L * (windowTimeSize - 1L))) + 3599L;
                 break;
             case DAYS:
             default:
-                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (86400 * windowTimeSize));
+                lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (86400L * windowTimeSize));
                 upperTimestamp = (lowerTimestamp + (86400L * (windowTimeSize - 1L))) + 86399L;
                 break;
         }
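
For context, the bug class this commit fixes: an expression like 
{{86400 * windowTimeSize}} is evaluated in int arithmetic and can overflow 
before it is widened to long, while {{86400L * windowTimeSize}} widens first. 
A self-contained sketch (the window size below is a hypothetical value chosen 
to trigger the overflow):

{code}
// Demonstrates why the long literal matters: with an int literal the
// multiplication overflows int before being widened to long.
public class LongMathDemo
{
    public static void main(String[] args)
    {
        int windowTimeSize = 30000;               // hypothetical large window size
        long intMath  = 86400 * windowTimeSize;   // int multiply overflows, then widens
        long longMath = 86400L * windowTimeSize;  // widens to long before multiplying
        System.out.println(intMath);              // -1702967296 (wrapped around)
        System.out.println(longMath);             // 2592000000 (correct)
    }
}
{code}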



cassandra git commit: remove dead params

2016-06-16 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk ff9673920 -> 27395e78b


remove dead params


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27395e78
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27395e78
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27395e78

Branch: refs/heads/trunk
Commit: 27395e78befd3694535736c9756c0380c6a00516
Parents: ff96739
Author: Dave Brosius 
Authored: Thu Jun 16 22:28:55 2016 -0400
Committer: Dave Brosius 
Committed: Thu Jun 16 22:28:55 2016 -0400

--
 .../db/compaction/TimeWindowCompactionStrategy.java | 5 +
 .../db/compaction/TimeWindowCompactionStrategyTest.java | 9 +++--
 2 files changed, 4 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/27395e78/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
index da3ef70..df688c5 100644
--- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java
@@ -19,7 +19,6 @@
 package org.apache.cassandra.db.compaction;
 
 import java.util.ArrayList;
-import java.util.Arrays;
 import java.util.Collection;
 import java.util.Collections;
 import java.util.Iterator;
@@ -158,8 +157,6 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
         List<SSTableReader> mostInteresting = newestBucket(buckets.left,
                                                            cfs.getMinimumCompactionThreshold(),
                                                            cfs.getMaximumCompactionThreshold(),
-                                                           options.sstableWindowUnit,
-                                                           options.sstableWindowSize,
                                                            options.stcsOptions,
                                                            this.highestWindowSeen);
         if (!mostInteresting.isEmpty())
@@ -267,7 +264,7 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy
      * @return a bucket (list) of sstables to compact.
      */
     @VisibleForTesting
-    static List<SSTableReader> newestBucket(HashMultimap<Long, SSTableReader> buckets, int minThreshold, int maxThreshold, TimeUnit sstableWindowUnit, int sstableWindowSize, SizeTieredCompactionStrategyOptions stcsOptions, long now)
+    static List<SSTableReader> newestBucket(HashMultimap<Long, SSTableReader> buckets, int minThreshold, int maxThreshold, SizeTieredCompactionStrategyOptions stcsOptions, long now)
     {
         // If the current bucket has at least minThreshold SSTables, choose that one.
         // For any other bucket, at least 2 SSTables is enough.
http://git-wip-us.apache.org/repos/asf/cassandra/blob/27395e78/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java b/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java
index 3238170..5041b31 100644
--- a/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java
+++ b/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java
@@ -26,9 +26,6 @@ import java.util.concurrent.TimeUnit;
 
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.Iterables;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-
 
 import org.junit.BeforeClass;
 import org.junit.Test;
@@ -179,10 +176,10 @@ public class TimeWindowCompactionStrategyTest extends SchemaLoader
             Pair<Long, Long> bounds = getWindowBoundsInMillis(TimeUnit.HOURS, 1, tstamp);
             buckets.put(bounds.left, sstrs.get(i));
         }
-        List<SSTableReader> newBucket = newestBucket(buckets, 4, 32, TimeUnit.HOURS, 1, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMillis()).left);
+        List<SSTableReader> newBucket = newestBucket(buckets, 4, 32, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMillis()).left);
         assertTrue("incoming bucket should not be accepted when it has below the min threshold SSTables", newBucket.isEmpty());
 
-        newBucket = newestBucket(buckets, 2, 32, TimeUnit.HOURS, 1, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMillis()).left);

cassandra git commit: fix Exception message generation by adding String.format markers needed

2016-06-16 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk 04afa2bf5 -> ff9673920


fix Exception message generation by adding String.format markers needed


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ff967392
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ff967392
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ff967392

Branch: refs/heads/trunk
Commit: ff96739207d94a3e18339566c6abf8108196f95f
Parents: 04afa2b
Author: Dave Brosius 
Authored: Thu Jun 16 22:17:14 2016 -0400
Committer: Dave Brosius 
Committed: Thu Jun 16 22:17:14 2016 -0400

--
 src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ff967392/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
index 73d70f3..6c4bb60 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java
@@ -315,7 +315,7 @@ public class CommitLogReader
             catch (EOFException eof)
             {
                 if (handler.shouldSkipSegmentOnError(new CommitLogReadException(
-                                                     String.format("Unexpected end of segment", mutationStart, statusTracker.errorContext),
+                                                     String.format("Unexpected end of segment at %d in %s", mutationStart, statusTracker.errorContext),
                                                      CommitLogReadErrorReason.EOF,
                                                      statusTracker.tolerateErrorsInSection)))
                 {
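
For illustration, {{String.format}} silently ignores arguments that have no 
matching format specifier, which is why the pre-fix message dropped its 
context. A self-contained example (the context string below is made up):

{code}
// String.format ignores arguments with no matching %-specifier, so the
// original message silently lost mutationStart and errorContext.
public class FormatMarkerDemo
{
    public static void main(String[] args)
    {
        long mutationStart = 12345L;
        String errorContext = "CommitLog-6-123.log"; // hypothetical context
        System.out.println(String.format("Unexpected end of segment", mutationStart, errorContext));
        // prints: Unexpected end of segment
        System.out.println(String.format("Unexpected end of segment at %d in %s", mutationStart, errorContext));
        // prints: Unexpected end of segment at 12345 in CommitLog-6-123.log
    }
}
{code}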



[jira] [Commented] (CASSANDRA-11868) unused imports and generic types

2016-06-16 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335213#comment-15335213
 ] 

Edward Capriolo commented on CASSANDRA-11868:
-

I do not think 8385 is a blocker. I did not clean up abstract types in this 
ticket. I only cleaned imports and a few unused constants. 

> unused imports and generic types
> 
>
> Key: CASSANDRA-11868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11868
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Edward Capriolo
>Assignee: Edward Capriolo
> Fix For: 3.8
>
>
> I was going through the Cassandra source and, for busy work, I started looking 
> at all the .java files Eclipse flags as warnings. They break down roughly into 
> a few cases: 
> 1) unused imports 
> 2) raw types missing <> 
> 3) case statements without defaults 
> 4) @resource annotation 
> My IDE claims item 4 is not needed (it looks like we have done this to 
> signify methods that return objects that need to be closed). I can guess 4 was 
> done intentionally, and short of making our own annotation I will ignore these 
> for now. 
> I would like to tackle this busy work before I get started. I have some 
> questions: 
> 1) Do this only on trunk? Or on multiple branches? 
> 2) Should I tackle 1, 2, 3 in separate branches/patches?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334937#comment-15334937
 ] 

Paulo Motta commented on CASSANDRA-12015:
-

bq.  However, beware of different RF in different DCs. You may have RF=3 in the 
source DC and RF=5 in the target DC; what will be the paired replica of the 4th 
replica of the target DC? Maybe use some modulo function. Same kind of issue if 
target DC RF > source DC RF. 

Hmm, good point. It seems this might be a bit harder than initially thought...

I suggest we restrict this ticket to no longer using dynamic snitch proximity to 
pick replicas to stream from, which would already prevent hotspots and help in 
the reported case, and tackle the more general problem of load-balancing replica 
selection in another ticket.

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in the case of RF=3 and a 3-node cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing the load more evenly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11519) Add support for IBM POWER

2016-06-16 Thread Rei Odaira (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334750#comment-15334750
 ] 

Rei Odaira commented on CASSANDRA-11519:


Thanks for the suggestion. Let me investigate how we can do that.

> Add support for IBM POWER
> -
>
> Key: CASSANDRA-11519
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11519
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: POWER architecture
>Reporter: Rei Odaira
>Assignee: Rei Odaira
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 11519-2.1.txt, 11519-3.0.txt
>
>
> Add support for the IBM POWER architecture (ppc, ppc64, and ppc64le) in 
> org.apache.cassandra.utils.FastByteOperations, 
> org.apache.cassandra.utils.memory.MemoryUtil, and 
> org.apache.cassandra.io.util.Memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER

2016-06-16 Thread Rei Odaira (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rei Odaira updated CASSANDRA-11576:
---
Attachment: 11576-2.1.txt

> Add support for JNA mlockall(2) on POWER
> 
>
> Key: CASSANDRA-11576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11576
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: POWER architecture
>Reporter: Rei Odaira
>Assignee: Rei Odaira
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 11576-2.1.txt
>
>
> org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be 
> passed to system calls through JNA. These values are system-dependent, and as 
> far as I investigated, Linux and AIX on the IBM POWER architecture define 
> {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than 
> the current hard-coded values.  As a result, mlockall(2) fails on these 
> platforms.
> {code}
> WARN  18:51:51 Unknown mlockall error 22
> {code}
> I am going to provide a patch to support JNA mlockall(2) on POWER.
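
To illustrate the problem (not the patch itself), a single hard-coded constant 
cannot be right on every platform; the values below are assumptions drawn from 
commonly cited Linux header values for generic vs. PowerPC builds:

{code}
// Illustration only: MCL_CURRENT differs by platform, so one hard-coded value
// cannot work everywhere. Values are assumed from common Linux headers
// (generic vs. powerpc), not taken from the attached patch.
public class MlockallFlagsDemo
{
    static int mclCurrent(String platform)
    {
        switch (platform)
        {
            case "linux-x86":   return 1;      // generic Linux header value
            case "linux-ppc64": return 0x2000; // PowerPC Linux header value (assumed)
            default: throw new IllegalArgumentException("unknown platform: " + platform);
        }
    }

    public static void main(String[] args)
    {
        System.out.printf("MCL_CURRENT on linux-x86:   0x%X%n", mclCurrent("linux-x86"));
        System.out.printf("MCL_CURRENT on linux-ppc64: 0x%X%n", mclCurrent("linux-ppc64"));
    }
}
{code}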



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER

2016-06-16 Thread Rei Odaira (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334746#comment-15334746
 ] 

Rei Odaira commented on CASSANDRA-11576:


I have updated the patch.

> Add support for JNA mlockall(2) on POWER
> 
>
> Key: CASSANDRA-11576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11576
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: POWER architecture
>Reporter: Rei Odaira
>Assignee: Rei Odaira
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 11576-2.1.txt
>
>
> org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be 
> passed to system calls through JNA. These values are system-dependent, and as 
> far as I investigated, Linux and AIX on the IBM POWER architecture define 
> {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than 
> the current hard-coded values.  As a result, mlockall(2) fails on these 
> platforms.
> {code}
> WARN  18:51:51 Unknown mlockall error 22
> {code}
> I am going to provide a patch to support JNA mlockall(2) on POWER.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER

2016-06-16 Thread Rei Odaira (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rei Odaira updated CASSANDRA-11576:
---
Attachment: (was: 11576-2.1.txt)

> Add support for JNA mlockall(2) on POWER
> 
>
> Key: CASSANDRA-11576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11576
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: POWER architecture
>Reporter: Rei Odaira
>Assignee: Rei Odaira
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: 11576-2.1.txt
>
>
> org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be 
> passed to system calls through JNA. These values are system-dependent, and as 
> far as I investigated, Linux and AIX on the IBM POWER architecture define 
> {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than 
> the current hard-coded values.  As a result, mlockall(2) fails on these 
> platforms.
> {code}
> WARN  18:51:51 Unknown mlockall error 22
> {code}
> I am going to provide a patch to support JNA mlockall(2) on POWER.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11988) NullPointerException when reading/compacting table

2016-06-16 Thread Bartlomiej (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334641#comment-15334641
 ] 

Bartlomiej commented on CASSANDRA-11988:


[~carlyeks] Regarding "I'm worried that other places might have the same 
issue": can deletion of those static columns protect us from such data 
corruption in the future?

> NullPointerException when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes an NPE.
> On compaction, it returns the following error:
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11988) NullPointerException when reading/compacting table

2016-06-16 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334626#comment-15334626
 ] 

Carl Yeksigian commented on CASSANDRA-11988:


This issue has been happening all the way back to 3.0; it was caused by 
CASSANDRA-9975.

We hit this condition when we have tombstoned a static column, and are ready to 
delete it. I was able to reproduce by setting {{gc_grace_seconds}} to 0, and 
then doing some tombstones of static columns.

Looking at the usages of {{BaseRows.staticRow()}}, I'm worried that other 
places might have the same issue; they take the static row, expecting it to be 
non-null, and call {{isEmpty()}} on it.
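
A minimal sketch of the failure pattern described above (plain Java with 
stand-in types, not Cassandra code): the caller assumes the static row is 
never null and dereferences it unconditionally.

{code}
// Minimal sketch of the NPE pattern: a purged static row comes back null,
// but the caller calls isEmpty() on it without a null check.
public class StaticRowNpeDemo
{
    static class Row
    {
        boolean isEmpty() { return true; }
    }

    static Row staticRow() { return null; } // tombstoned static column purged away

    public static void main(String[] args)
    {
        Row staticRow = staticRow();
        System.out.println(staticRow.isEmpty()); // throws NullPointerException
    }
}
{code}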

> NullPointerException when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes an NPE.
> On compaction, it returns the following error:
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334620#comment-15334620
 ] 

Paulo Motta commented on CASSANDRA-11516:
-

[~giampaolo] my bad, I actually just noticed that the executor is only used to 
establish stream connections.

I'm not sure if the idea here is to actually bound the number of active 
streams, in which case it will be a bit more involved, since right now they're 
pretty much unbounded, or to only bound the number of [post-processing streaming 
threads|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L54], 
which finalize received sstables and add them to the data tracker, and which 
actually caused the reported problem. Maybe [~sebastian.este...@datastax.com] 
will be able to clarify best.

> Make max number of streams configurable
> ---
>
> Key: CASSANDRA-11516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11516
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>  Labels: lhf
>
> Today we default to the number of cores. On large boxes (many cores), this is 
> suboptimal, as it can generate huge amounts of garbage that GC can't keep up 
> with.
> Usually we tackle issues like this with the streaming-throughput levers, but 
> in this case the problem is CPU consumption by StreamReceiverTasks, 
> specifically in the IntervalTree build -- 
> https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257
> We need a max-number-of-parallel-streams lever to handle this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334609#comment-15334609
 ] 

DOAN DuyHai commented on CASSANDRA-12015:
-

+1 on the paired replica approach used for MVs. 

However, beware of different RF in different DCs. You may have RF=3 in the 
source DC and RF=5 in the target DC; what will be the paired replica of the 
4th replica of the target DC? Maybe use some modulo function. Same kind of 
issue if target DC RF > source DC RF. 
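
A small sketch of the modulo idea (my reading of the suggestion, with made-up 
node names): target replica i streams from source replica i % sourceRF, so a 
target DC with a higher RF still maps every replica onto some source.

{code}
// Sketch of the suggested modulo pairing: target replica i streams from
// source replica (i % sourceRf). Node names are made up for illustration.
import java.util.Arrays;
import java.util.List;

public class PairedReplicaDemo
{
    static int pairedSource(int targetIndex, int sourceRf)
    {
        return targetIndex % sourceRf;
    }

    public static void main(String[] args)
    {
        List<String> sourceDc = Arrays.asList("dc1-node1", "dc1-node2", "dc1-node3"); // RF=3
        int targetRf = 5;
        for (int i = 0; i < targetRf; i++)
            System.out.println("target replica " + (i + 1) + " streams from "
                               + sourceDc.get(pairedSource(i, sourceDc.size())));
    }
}
{code}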

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in the case of RF=3 and a 3-node cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing the load more evenly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11363) Blocked NTR When Connecting Causing Excessive Load

2016-06-16 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334604#comment-15334604
 ] 

T Jake Luciani commented on CASSANDRA-11363:


The Native Transport Request pool is the only thread pool that has a bounded 
limit (128).

The NTR uses the SEPExecutor, which effectively blocks till the queue has room.

However, if I'm reading it correctly, the SEPWorker goes into a spin loop in 
some scenarios when there is no work, so perhaps we are hitting some edge case 
when the tasks are blocked.  [~pauloricardomg] perhaps try setting this queue 
to something small, like 4, to force blocking?
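
For illustration of the suggested experiment, a self-contained sketch of a 
small bounded queue that makes submitters block once it is full (a plain 
ThreadPoolExecutor stands in for SEPExecutor here):

{code}
// Sketch: a small bounded queue forces submitters to block when it fills up.
import java.util.concurrent.*;

public class BoundedPoolDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(4); // small bound, as suggested
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS, queue,
                // block the caller instead of rejecting, approximating blocking submission
                (r, executor) -> {
                    try { executor.getQueue().put(r); }
                    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                });
        for (int i = 0; i < 16; i++)
        {
            int id = i;
            pool.execute(() -> {
                try { Thread.sleep(100); } catch (InterruptedException ignored) {}
                System.out.println("task " + id + " done");
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
{code}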

> Blocked NTR When Connecting Causing Excessive Load
> --
>
> Key: CASSANDRA-11363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11363
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Russell Bradberry
>Assignee: Paulo Motta
> Attachments: cassandra-102-cms.stack, cassandra-102-g1gc.stack
>
>
> When upgrading from 2.1.9 to 2.1.13, we are witnessing an issue where the 
> machine load increases to very high levels (> 120 on an 8 core machine) and 
> native transport requests get blocked in tpstats.
> I was able to reproduce this in both CMS and G1GC as well as on JVM 7 and 8.
> The issue does not seem to affect the nodes running 2.1.9.
> The issue seems to coincide with the number of connections OR the number of 
> total requests being processed at a given time (as the latter increases with 
> the former in our system)
> Currently there is between 600 and 800 client connections on each machine and 
> each machine is handling roughly 2000-3000 client requests per second.
> Disabling the binary protocol fixes the issue for this node but isn't a 
> viable option cluster-wide.
> Here is the output from tpstats:
> {code}
> Pool Name                    Active   Pending  Completed   Blocked  All time blocked
> MutationStage               0 88387821 0 0
> ReadStage                   0 0 355860 0 0
> RequestResponseStage        0 72532457 0 0
> ReadRepairStage             0 0150 0 0
> CounterMutationStage        32 104 897560 0 0
> MiscStage                   0 0 0 0 0
> HintedHandoff               0 0 65 0 0
> GossipStage                 0 0 2338 0 0
> CacheCleanupExecutor        0 0 0 0 0
> InternalResponseStage       0 0 0 0 0
> CommitLogArchiver           0 0 0 0 0
> CompactionExecutor          2 190474 0 0
> ValidationExecutor          0 0 0 0 0
> MigrationStage              0 0 10 0 0
> AntiEntropyStage            0 0 0 0 0
> PendingRangeCalculator      0 0310 0 0
> Sampler                     0 0 0 0 0
> MemtableFlushWriter         110 94 0 0
> MemtablePostFlush           134257 0 0
> MemtableReclaimMemory       0 0 94 0 0
> Native-Transport-Requests   128 156 38795716 278451
> Message type   Dropped
> READ 0
> RANGE_SLICE  0
> _TRACE   0
> MUTATION 0
> COUNTER_MUTATION 0
> BINARY   0
> REQUEST_RESPONSE 0
> PAGED_RANGE  0
> READ_REPAIR  0
> {code}
> Attached is the jstack output for both CMS and G1GC.
> Flight recordings are here:
> https://s3.amazonaws.com/simple-logs/cassandra-102-cms.jfr
> https://s3.amazonaws.com/simple-logs/cassandra-102-g1gc.jfr
> It is interesting to note that while the flight recording was taking place, 
> the load on the machine went back to healthy, and when the flight recording 
> finished the load went back to > 100.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12010:
--
Fix Version/s: 3.x

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo

2016-06-16 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334573#comment-15334573
 ] 

Tyler Hobbs commented on CASSANDRA-8700:


bq. On the cqlsh doc though, I wonder if it's a good idea to include the 
description of the command-line options, and even of the special commands? 
Feels like we'll easily forget to update it, and it doesn't seem to add a lot 
of value over getting the help from cqlsh directly.

I think the main advantage is that this documentation will show up in search 
results.  This is particularly useful if you don't know what you're looking 
for, or you don't know if it's a command-line thing or a special command.  I 
also considered including docs for {{cqlshrc}} here for the same reason, but 
those are technically already online and searchable (although not nicely 
formatted).

However, it would be nice to generate the docs directly from the code, as you 
mention.  Sphinx works well with Python, so this is probably reasonable to do.  
I'll look into it.

> replace the wiki with docs in the git repo
> --
>
> Key: CASSANDRA-8700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8700
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 3.8
>
> Attachments: TombstonesAndGcGrace.md, bloom_filters.md, 
> compression.md, contributing.zip, getting_started.zip, hardware.md
>
>
> The wiki as it stands is pretty terrible.  It takes several minutes to apply 
> a single update, and as a result, it's almost never updated.  The information 
> there has very little context as to what version it applies to.  Most people 
> I've talked to that try to use the information they find there find it is 
> more confusing than helpful.
> I'd like to propose that instead of using the wiki, the doc directory in the 
> cassandra repo be used for docs (already used for the CQL3 spec) in a format that 
> can be built to a variety of output formats like HTML / epub / etc.  I won't 
> start the bikeshedding on which markup format is preferable - but there are 
> several options that can work perfectly fine.  I've personally used Sphinx with 
> reStructuredText, and Markdown.  Both can build easily and, as an added bonus, 
> be pushed to readthedocs (or something similar) automatically.  For an 
> example, see cqlengine's documentation, which I think is already 
> significantly better than the wiki: 
> http://cqlengine.readthedocs.org/en/latest/
> In addition to being overall easier to maintain, putting the documentation in 
> the git repo adds context, since it evolves with the versions of Cassandra.
> If the wiki were kept even remotely up to date, I wouldn't bother with this, 
> but not having at least some basic documentation in the repo, or anywhere 
> associated with the project, is frustrating.
> For reference, the last 3 updates were:
> 1/15/15 - updating committers list
> 1/08/15 - updating contributors and how to contribute
> 12/16/14 - added a link to CQL docs from wiki frontpage (by me)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334563#comment-15334563
 ] 

Joel Knighton commented on CASSANDRA-12010:
---

+1, lgtm.

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12010:
--
Status: Ready to Commit  (was: Patch Available)

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-12010:
--
Component/s: Testing

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
> Fix For: 3.x
>
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334555#comment-15334555
 ] 

Paulo Motta commented on CASSANDRA-12015:
-

bq. However, it does not help the original concern of this JIRA, which is to 
have some sort of randomization/round-robin selection for the source replica to 
stream data from.

I think there are two concerns here:
a) Improve source diversity for single-node rebuilds.
b) For simultaneous rebuilds, divide the load more evenly across replicas.

From my understanding, a) is easily solvable by using token order instead of 
proximity to pick replicas to stream from, but this does not solve b), because 
primary replicas from simultaneous rebuilds might become overloaded.

Maybe b) can be solved without keeping state by using a paired-replica approach 
similar to MVs?

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in the case of RF=3 and a 3-node cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing the load more evenly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12018) CDC follow-ups

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12018:

Description: 
h6. Platform independent implementation of DirectorySizeCalculator
On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

h6. Refactor DirectorySizeCalculator
bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
the listFiles step? Either list the files and just loop through them, or do the 
walkFileTree operation – you are now doing the same work twice. Use a plain 
long instead of the atomic as the class is still thread-unsafe.

h6. TolerateErrorsInSection should not depend on previous SyncSegment status in 
CommitLogReader
bq. tolerateErrorsInSection &=: I don't think it was intended for the value to 
depend on previous iterations.

h6. Refactor interface of SimpleCachedBufferPool
bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int 
size) which should automatically reallocate if the available size is less, and 
not expose a setter at all.

h6. Change CDC exception to WriteFailureException instead of 
WriteTimeoutException

h6. Remove unused CommitLogTest.testRecovery(byte[] logData)
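
A runnable sketch of the flat computation named in the DirectorySizeCalculator 
item above, with the null check {{File.listFiles()}} requires; it assumes a 
non-recursive directory with no concurrent deletions:

{code}
// Self-contained sketch of the flat directory-size computation; assumes a
// non-recursive directory and no files vanishing mid-scan.
import java.io.File;
import java.util.Arrays;

public class DirectorySizeDemo
{
    public static void main(String[] args)
    {
        File path = new File(args.length > 0 ? args[0] : ".");
        File[] files = path.listFiles(); // null if path is not a directory
        long bytes = files == null ? 0L : Arrays.stream(files).mapToLong(File::length).sum();
        System.out.println(path + " holds " + bytes + " bytes of files");
    }
}
{code}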

> CDC follow-ups
> --
>
> Key: CASSANDRA-12018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12018
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>
> h6. Platform independent implementation of DirectorySizeCalculator
> On linux, simplify to 
> {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}
> h6. Refactor DirectorySizeCalculator
> bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
> the listFiles step? Either list the files and just loop through them, or do 
> the walkFileTree operation – you are now doing the same work twice. Use a 
> plain long instead of the atomic as the class is still thread-unsafe.
> h6. TolerateErrorsInSection should not depend on previous SyncSegment status 
> in CommitLogReader
> bq. tolerateErrorsInSection &=: I don't think it was intended for the value 
> to depend on previous iterations.
> h6. Refactor interface of SimpleCachedBufferPool
> bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int 
> size) which should automatically reallocate if the available size is less, 
> and not expose a setter at all.
> h6. Change CDC exception to WriteFailureException instead of 
> WriteTimeoutException
> h6. Remove unused CommitLogTest.testRecovery(byte[] logData)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12018) CDC follow-ups

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12018:

Description: (was: Parent ticket to hold subtasks for things that came 
up during CDC discussion)

> CDC follow-ups
> --
>
> Key: CASSANDRA-12018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12018
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator and refactor

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie deleted CASSANDRA-12019:



> Platform independent implementation of DirectorySizeCalculator and refactor
> ---
>
> Key: CASSANDRA-12019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12019
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>
> On linux, simplify to 
> {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}
> It's simpler and performs better, however it is much slower on Windows.
> See discussion on CASSANDRA-8844.
> Also:
> bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
> the listFiles step? Either list the files and just loop through them, or do 
> the walkFileTree operation – you are now doing the same work twice. Use a 
> plain long instead of the atomic as the class is still thread-unsafe.
> So the existing class could use a refactor as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator and refactor

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12019:

Summary: Platform independent implementation of DirectorySizeCalculator and 
refactor  (was: Platform independent implementation of DirectorySizeCalculator)

> Platform independent implementation of DirectorySizeCalculator and refactor
> ---
>
> Key: CASSANDRA-12019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12019
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>
> On linux, simplify to 
> {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}
> It's simpler and performs better, however it is much slower on Windows.
> See discussion on CASSANDRA-8844.
> Also:
> bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
> the listFiles step? Either list the files and just loop through them, or do 
> the walkFileTree operation – you are now doing the same work twice. Use a 
> plain long instead of the atomic as the class is still thread-unsafe.
> So the existing class could use a refactor as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12019:

Description: 
On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

It's simpler and performs better, however it is much slower on Windows.

See discussion on CASSANDRA-8844.

Also:
bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
the listFiles step? Either list the files and just loop through them, or do the 
walkFileTree operation – you are now doing the same work twice. Use a plain 
long instead of the atomic as the class is still thread-unsafe.

So the existing class could use a refactor as well.


  was:
On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

It's simpler and performs better, however it is much slower on Windows.

See discussion on CASSANDRA-8844.


> Platform independent implementation of DirectorySizeCalculator
> --
>
> Key: CASSANDRA-12019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12019
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>
> On linux, simplify to 
> {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}
> It's simpler and performs better, however it is much slower on Windows.
> See discussion on CASSANDRA-8844.
> Also:
> bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
> the listFiles step? Either list the files and just loop through them, or do 
> the walkFileTree operation – you are now doing the same work twice. Use a 
> plain long instead of the atomic as the class is still thread-unsafe.
> So the existing class could use a refactor as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator

2016-06-16 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-12019:
---

 Summary: Platform independent implementation of 
DirectorySizeCalculator
 Key: CASSANDRA-12019
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12019
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
Priority: Minor


On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

It's simpler and performs better, however it is much slower on Windows.

See discussion on CASSANDRA-8844.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12018) CDC follow-ups

2016-06-16 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-12018:
---

 Summary: CDC follow-ups
 Key: CASSANDRA-12018
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12018
 Project: Cassandra
  Issue Type: Improvement
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
Priority: Minor


Parent ticket to hold subtasks for things that came up during CDC discussion



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334412#comment-15334412
 ] 

DOAN DuyHai commented on CASSANDRA-12015:
-

I have checked the Git history, and it seems that the 
{{snitch.getSortedListByProximity(address, rangeAddresses.get(range))}} line 
has always been there since 2012. Looks like nobody has noticed that 
DynamicSnitch can create this kind of hotspot since then, so yes, we may update 
this.

However, it does not help the original concern of this JIRA, which is to have 
some sort of randomization/round-robin selection for the source replica to 
stream data from.

If we replace the dynamic snitch by {{AbstractEndpointSnitch.sortByProximity}}, 
it will also always pick the *same* replica for a given token range, whereas 
the idea is to pick randomly or to round-robin the replicas. But it also means 
that we need to keep *state* to know which replica has been selected previously 
for a given token range, so that we can move to the next one. And I'm not sure 
whether having *state* is desirable or technically feasible.


> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in the case of RF=3 and a 3-node cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing the load more evenly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334373#comment-15334373
 ] 

Alex Petrov commented on CASSANDRA-12010:
-

Great catch! I had completely overlooked {{beforeAndAfterFlush}}. I've compared 
and ported all the changes from 
[here|https://github.com/ifesdjeen/cassandra/commit/0034d8b60acd52ff517cf8c7ab1ac86277c3dbc3].

Updated tree:

|[trunk|https://github.com/ifesdjeen/cassandra/tree/12010-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12010-trunk-testall/]|
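
For reference, the helper wraps the assertions so they run twice, once against the memtable and once after a flush; roughly like this in a CQLTester-style test (signatures approximated, not the exact trunk code):

{code}
// Sketch of how the beforeAndAfterFlush helper is typically used:
// the same assertions run against the memtable contents, then once
// more after flushing to sstables.
@Test
public void testAlteringUserTypeNestedWithinMap() throws Throwable
{
    createTable("CREATE TABLE %s (k int PRIMARY KEY, v map<text, int>)");
    execute("INSERT INTO %s (k, v) VALUES (0, {'a': 1})");

    beforeAndAfterFlush(() ->
        assertRows(execute("SELECT v FROM %s WHERE k = 0"),
                   row(map("a", 1))));
}
{code}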

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-16 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-10202:

Status: Patch Available  (was: Open)

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-16 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334230#comment-15334230
 ] 

Branimir Lambov commented on CASSANDRA-10202:
-

Branch is updated to remove the custom concurrent list implementation.

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11866) nodetool repair does not obey the column family parameter when -st and -et are provided (subrange repair)

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334184#comment-15334184
 ] 

Paulo Motta commented on CASSANDRA-11866:
-

[~mahdix] Interested in taking this? It should be quite easy.

> nodetool repair does not obey the column family parameter when -st and -et 
> are provided (subrange repair)
> -
>
> Key: CASSANDRA-11866
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11866
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Red Hat Enterprise Linux Server release 6.7 (Santiago) 
> x86_64
>Reporter: Shiva Venkateswaran
>  Labels: newbie
> Fix For: 2.1.x
>
>
> Command 1: Repairs all the CFs in ADL_GLOBAL keyspace and ignores the 
> parameter AssetModifyTimes_data used to restrict the CFs
> Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h 
> localhost -p 7199 -u user-pw ** repair ADL_GLOBAL AssetModifyTimes_data 
> -st 205279477618143669 -et 230991685737746901 -par
> [2016-05-20 17:31:39,116] Starting repair command #9, repairing 1 ranges for 
> keyspace ADL_GLOBAL (parallelism=PARALLEL, full=true)
> [2016-05-20 17:32:21,568] Repair session 3cae2530-1ed2-11e6-b490-d9df6932c7cf 
> for range (205279477618143669,230991685737746901] finished
> Command 2: Repairs all the CFs in ADL_GLOBAL keyspace and ignores the 
> parameter AssetModifyTimes_data used to restrict the CFs
> Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h 
> localhost -p 7199 -u controlRole -pw ** repair -st 205279477618143669 -et 
> 230991685737746901 -par -- ADL_GLOBAL AssetModifyTimes_data
> [2016-05-20 17:36:34,473] Starting repair command #10, repairing 1 ranges for 
> keyspace ADL_GLOBAL (parallelism=PARALLEL, full=true)
> [2016-05-20 17:37:15,365] Repair session ecb996d0-1ed2-11e6-b490-d9df6932c7cf 
> for range (205279477618143669,230991685737746901] finished
> [2016-05-20 17:37:15,365] Repair command #10 finished
> Command 3: Repairs only the CF ADL3Test1_data in keyspace ADL_GLOBAL
> Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h 
> localhost -p 7199 -u controlRole -pw ** repair -- ADL_GLOBAL 
> ADL3Test1_data
> [2016-05-20 17:38:35,781] Starting repair command #11, repairing 1043 ranges 
> for keyspace ADL_GLOBAL (parallelism=SEQUENTIAL, full=true)
> [2016-05-20 17:42:32,682] Repair session 3c8af050-1ed3-11e6-b490-d9df6932c7cf 
> for range (6241639152751626129,6241693909092643958] finished
> [2016-05-20 17:42:32,683] Repair session 3caf1a20-1ed3-11e6-b490-d9df6932c7cf 
> for range (-7096993048358106082,-7095000706885780850] finished
> [2016-05-20 17:42:32,683] Repair session 3ccfc180-1ed3-11e6-b490-d9df6932c7cf 
> for range (-7218939248114487080,-7218289345961492809] finished
> [2016-05-20 17:42:32,683] Repair session 3cf21690-1ed3-11e6-b490-d9df6932c7cf 
> for range (-5244794756638190874,-5190307341355030282] finished
> [2016-05-20 17:42:32,683] Repair session 3d126fd0-1ed3-11e6-b490-d9df6932c7cf 
> for range (3551629701277971766,321736534916502] finished
> [2016-05-20 17:42:32,683] Repair session 3d32f020-1ed3-11e6-b490-d9df6932c7cf 
> for range (-8139355591560661944,-8127928369093576603] finished
> [2016-05-20 17:42:32,683] Repair session 3d537070-1ed3-11e6-b490-d9df6932c7cf 
> for range (7098010153980465751,7100863011896759020] finished
> [2016-05-20 17:42:32,683] Repair session 3d73f0c0-1ed3-11e6-b490-d9df6932c7cf 
> for range (1004538726866173536,1008586133746764703] finished
> [2016-05-20 17:42:32,683] Repair session 3d947110-1ed3-11e6-b490-d9df6932c7cf 
> for range (5770817093573726645,5771418910784831587] finished
> .
> .
> .
> [2016-05-20 17:42:32,732] Repair command #11 finished



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11986) Repair using subranges (-st / -et) ignore Keyspace / Table name arguments

2016-06-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta resolved CASSANDRA-11986.
-
Resolution: Duplicate

Closing as duplicate of CASSANDRA-11866. The fix is quite trivial, we only need 
to pass the CF argument to 
[probe.forceRepairRangeAsync|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/tools/NodeTool.java#L1950].
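
Roughly (a sketch against the 2.1 code; variable names approximate):

{code}
// Sketch of the fix in NodeTool's repair command (2.1 branch, variable
// names approximate): the subrange path needs to forward the column
// family names just like the non-subrange path already does.
if (!startToken.isEmpty() || !endToken.isEmpty())
    probe.forceRepairRangeAsync(System.out, keyspace, parallelismDegree,
                                dataCenters, hosts, startToken, endToken,
                                fullRepair, cfnames); // cfnames was being dropped
{code}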

> Repair using subranges (-st / -et) ignore Keyspace / Table name arguments
> -
>
> Key: CASSANDRA-11986
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11986
> Project: Cassandra
>  Issue Type: Bug
> Environment: Reproduced using ccm and Cassandra 2.1.12
>Reporter: Alain RODRIGUEZ
>
> When repairing, it is impossible to repair using subranges and a specific 
> table at the same time.
> When running this:
> {noformat}
> date && echo "Repairing standard1 on 127.0.0.1" && time nodetool -h localhost 
> -p 7100 repair -dc datacenter1 -local -par -- keyspace1 standard1
> {noformat}
> *Without -st / -et* options, I have the following output:
> {noformat}
> MacBook-Pro:~ alain$ tail -100f ~/.ccm/test-2.1.12/node1/logs/system.log
> INFO  [Thread-33] 2016-06-09 14:18:52,193 StorageService.java:2939 - Starting 
> repair command #8, repairing 3 ranges for keyspace keyspace1 
> (parallelism=PARALLEL, full=true)
> INFO  [AntiEntropySessions:12] 2016-06-09 14:18:52,194 RepairSession.java:260 
> - [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync 
> /127.0.0.1, /127.0.0.2, /127.0.0.3 on range 
> (3074457345618258602,-9223372036854775808] for keyspace1.[standard1]
> INFO  [AntiEntropySessions:12] 2016-06-09 14:18:52,195 RepairJob.java:163 - 
> [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] requesting merkle trees for 
> standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1])
> INFO  [AntiEntropyStage:1] 2016-06-09 14:18:57,433 RepairSession.java:171 - 
> [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.2
> INFO  [AntiEntropyStage:1] 2016-06-09 14:18:57,436 RepairSession.java:171 - 
> [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.3
> INFO  [AntiEntropyStage:1] 2016-06-09 14:18:57,439 RepairSession.java:171 - 
> [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.1
> INFO  [AntiEntropySessions:13] 2016-06-09 14:18:57,439 RepairSession.java:260 
> - [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync 
> /127.0.0.1, /127.0.0.2, /127.0.0.3 on range 
> (-9223372036854775808,-3074457345618258603] for keyspace1.[standard1]
> INFO  [RepairJobTask:1] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair 
> #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.3 
> are consistent for standard1
> INFO  [RepairJobTask:3] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair 
> #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.3 and /127.0.0.1 
> are consistent for standard1
> INFO  [RepairJobTask:2] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair 
> #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.1 
> are consistent for standard1
> INFO  [AntiEntropySessions:13] 2016-06-09 14:18:57,440 RepairJob.java:163 - 
> [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] requesting merkle trees for 
> standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1])
> INFO  [AntiEntropyStage:1] 2016-06-09 14:18:57,440 RepairSession.java:237 - 
> [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] standard1 is fully synced
> INFO  [AntiEntropySessions:12] 2016-06-09 14:18:57,440 RepairSession.java:299 
> - [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] session completed 
> successfully
> INFO  [AntiEntropyStage:1] 2016-06-09 14:19:03,676 RepairSession.java:171 - 
> [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.2
> INFO  [AntiEntropyStage:1] 2016-06-09 14:19:03,684 RepairSession.java:171 - 
> [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.3
> INFO  [AntiEntropyStage:1] 2016-06-09 14:19:03,758 RepairSession.java:171 - 
> [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for 
> standard1 from /127.0.0.1
> INFO  [AntiEntropySessions:14] 2016-06-09 14:19:03,759 RepairSession.java:260 
> - [repair #5acba5f0-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync 
> /127.0.0.1, /127.0.0.2, /127.0.0.3 on range 
> (-3074457345618258603,3074457345618258602] for keyspace1.[standard1]
> INFO  [RepairJobTask:1] 2016-06-09 14:19:03,759 Differencer.java:67 - [repair 
> #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.3 
> are consistent for standard1
> INFO  [AntiEntropySessions:1

[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable

2016-06-16 Thread Giampaolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334171#comment-15334171
 ] 

Giampaolo commented on CASSANDRA-11516:
---

Thanks [~pauloricardomg] for the pointer. That was my first choice at the 
beginning, but the issue refers to {{StreamReceiveTask}}. I will go with the one 
you suggested.


> Make max number of streams configurable
> ---
>
> Key: CASSANDRA-11516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11516
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>  Labels: lhf
>
> Today we default to num cores. In large boxes (many cores), this is 
> suboptimal as it can generate huge amounts of garbage that GC can't keep up 
> with.
> Usually we tackle issues like this with the streaming throughput levers but 
> in this case the problem is CPU consumption by StreamReceiverTasks 
> specifically in the IntervalTree build -- 
> https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257
> We need a max number of parallel streams lever to handle this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12008) Make decommission operations resumable

2016-06-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-12008:

   Priority: Minor  (was: Major)
Component/s: (was: Lifecycle)
 Streaming and Messaging
 Issue Type: Improvement  (was: Bug)
Summary: Make decommission operations resumable  (was: Allow retrying 
failed streams (or stop them from failing))

> Make decommission operations resumable
> --
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Tom van der Woerdt
>Priority: Minor
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
> at java.lang.Thread.run(Thread.java:745)
> 08:04 PM   ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to 
> become normal or restart
> See 'nodetool help' or 'nodetool help '.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - 
> [Stream #] Streaming error occurred
> java.net.SocketTimeoutException: null
> at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) 
> ~[na:1.8.0_77]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
> ~[na:1.8.0_77]
> at 
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) 
> ~[na:1.8.0_77]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> {code}
> If implementing retries is not possible, can we have a 'nodetool decommission 
> resume'?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12008) Allow retrying failed streams (or stop them from failing)

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334153#comment-15334153
 ] 

Paulo Motta commented on CASSANDRA-12008:
-

bq. As for "nodetool decommission resume", can we have that?

It's definitely possible, so I will update the ticket to reflect that.

> Allow retrying failed streams (or stop them from failing)
> -
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
>Reporter: Tom van der Woerdt
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
> at java.lang.Thread.run(Thread.java:745)
> 08:04 PM   ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to 
> become normal or restart
> See 'nodetool help' or 'nodetool help '.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - 
> [Stream #] Streaming error occurred
> java.net.SocketTimeoutException: null
> at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) 
> ~[na:1.8.0_77]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
> ~[na:1.8.0_77]
> at 
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) 
> ~[na:1.8.0_77]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> {code}
> If implementing retries is not possible, can we have a 'nodetool decommission 
> resume'?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-12008) Allow retrying failed streams (or stop them from failing)

2016-06-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-12008:

Comment: was deleted

(was: In this specific case it seems the streaming failed due to low 
{{streaming_socket_timeout}} value. We just found out our previous default of 1 
hour was too low, and raised that to 24 hours on CASSANDRA-11840, on 3.0.7. 
Could you try increasing that and see if it helps with failed decommissions?)

> Allow retrying failed streams (or stop them from failing)
> -
>
> Key: CASSANDRA-12008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12008
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
>Reporter: Tom van der Woerdt
>
> We're dealing with large data sets (multiple terabytes per node) and 
> sometimes we need to add or remove nodes. These operations are very dependent 
> on the entire cluster being up, so while we're joining a new node (which 
> sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases 
> something does.
> It would be great if the ability to retry streams was implemented.
> Example to illustrate the problem :
> {code}
> 03:18 PM   ~ $ nodetool decommission
> error: Stream failed
> -- StackTrace --
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274)
> at java.lang.Thread.run(Thread.java:745)
> 08:04 PM   ~ $ nodetool decommission
> nodetool: Unsupported operation: Node in LEAVING state; wait for status to 
> become normal or restart
> See 'nodetool help' or 'nodetool help '.
> {code}
> Streaming failed, probably due to load :
> {code}
> ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - 
> [Stream #] Streaming error occurred
> java.net.SocketTimeoutException: null
> at 
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) 
> ~[na:1.8.0_77]
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) 
> ~[na:1.8.0_77]
> at 
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) 
> ~[na:1.8.0_77]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> {code}
> If implementing retries is not possible, can we have a 'nodetool decommission 
> resume'?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334142#comment-15334142
 ] 

Paulo Motta commented on CASSANDRA-11516:
-

[~giampaolo] You should probably take a look at making 
[this|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/streaming/StreamCoordinator.java#L42]
 configurable.
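
For example, the hard-coded default could become an overridable system property along these lines (the property name here is hypothetical):

{code}
// Rough sketch (property name hypothetical): let a -D system property
// override the maximum, falling back to the number of cores, which is
// today's effective default.
private static final int DEFAULT_MAX_PARALLEL_STREAMS =
    Integer.getInteger("cassandra.max_parallel_streams",
                       Runtime.getRuntime().availableProcessors());
{code}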

> Make max number of streams configurable
> ---
>
> Key: CASSANDRA-11516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11516
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>  Labels: lhf
>
> Today we default to num cores. In large boxes (many cores), this is 
> suboptimal as it can generate huge amounts of garbage that GC can't keep up 
> with.
> Usually we tackle issues like this with the streaming throughput levers but 
> in this case the problem is CPU consumption by StreamReceiverTasks 
> specifically in the IntervalTree build -- 
> https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257
> We need a max number of parallel streams lever to handle this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10862) LCS repair: compact tables before making available in L0

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334128#comment-15334128
 ] 

Paulo Motta commented on CASSANDRA-10862:
-

Thanks for the patch and sorry for the delay [~scv...@gmail.com]. Overall I 
like your approach, because it mitigates the impact without so many changes. 
See some suggestions for improvement below:
* I'm a bit uncomfortable with the unbounded busy wait, so we should probably 
add a time bound to the loop in order to avoid hanging indefinitely if there is 
a problem with compactions catching up.
* While the synchronization block on {{CFS}} would guarantee that only one 
producer adds new sstables at a time, I think this might be a premature 
optimization that could be a source of problems later, so I'd prefer to take a 
best-effort approach initially: we're trying to protect against an abysmal 
number of sstables, and we don't allow concurrent repairs on the same tables 
anyway, so there shouldn't be many concurrent {{OnCompletionRunnable}} running 
for the same {{CFS}}.
* With that said, I think we could have something like 
{{CompactionManager.waitForL0Leveling(ColumnFamilyStore cfs, int maxSSTableCount, long maxWaitTime)}}, 
similar to the {{waitForCessation}} method but waiting for L0 leveling instead, 
without taking a {{Callable}} as an argument (see the sketch after this list).
* I think waiting for leveling on validation will probably cause overstreaming 
during repair, since different replicas will flush at different times, causing 
digest mismatches, so we should probably avoid that.
* {{compaction_max_l0_sstable_count}} is quite an advanced lever, so I don't 
think it should be exposed as a {{cassandra.yaml}} attribute, but instead as a 
system property (similar to {{cassandra.disable_stcs_in_l0}}). We could also 
maybe add it as a dynamic JMX attribute to facilitate tuning.
* We could also add another property for the {{max_wait_time}} of the L0 
leveling, and maybe even provide conservative defaults for both properties on 
trunk that could already bring some benefit to the average user and still allow 
more advanced users to tune them according to usage, something like 
{{streaming.max_L0_count=1000}} and {{streaming.max_L0_wait_time=1min}}. I'm 
not really sure about these values, so it would be nice if you have any 
suggestions based on your tests so far.
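
To make the first and third points concrete, the bounded wait could look something like this (a minimal sketch; names, defaults and the L0-count helper are placeholders, not the actual {{CompactionManager}} API):

{code}
// Minimal sketch of a bounded wait for L0 leveling; levelZeroCount()
// is a hypothetical helper standing in for however the L0 sstable
// count is obtained from the CFS.
public static void waitForL0Leveling(ColumnFamilyStore cfs, int maxSSTableCount, long maxWaitMillis)
{
    long deadline = System.currentTimeMillis() + maxWaitMillis;
    while (levelZeroCount(cfs) > maxSSTableCount && System.currentTimeMillis() < deadline)
    {
        try
        {
            Thread.sleep(1000); // back off instead of busy-spinning
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return; // give up rather than hang indefinitely
        }
    }
}
{code}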

Anything else to add here or any caveat I might be missing [~krummas]?

> LCS repair: compact tables before making available in L0
> 
>
> Key: CASSANDRA-10862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Streaming and Messaging
>Reporter: Jeff Ferland
>Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of 
> tables in L0 goes up dramatically, as correspondingly goes the number of 
> tables referenced for a query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into 
> L1 (which may be a very large copy), finally reducing the number of SSTables 
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then 
> mark tables available rather than marking available when the file itself is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334109#comment-15334109
 ] 

Paulo Motta commented on CASSANDRA-12015:
-

{{getAllRangesWithSourcesFor}} is only used for bootstrap when 
{{cassandra.consistent.rangemovement=false}}, otherwise 
{{getAllRangesWithStrictSourcesFor}} is used (which tries to stream from 
sources which will lose ranges to the bootstrapping node).

When {{cassandra.consistent.rangemovement=false}} it doesn't really matter which 
replica you pick from, so I guess we're safe to move away from latency-based 
proximity. This is also used for replace, so I think it can also distribute 
replace/non-consistent-bootstrap load more evenly in that case, because right 
now we are prioritizing replicas with a better dynamic snitch score, which will 
probably overload them with streaming originating from 
rebuild/replace/non-consistent-bootstrap.

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing more evenly the load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12017) Allow configuration of inter DC compression

2016-06-16 Thread Thom Valley (JIRA)
Thom Valley created CASSANDRA-12017:
---

 Summary: Allow configuration of inter DC compression 
 Key: CASSANDRA-12017
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12017
 Project: Cassandra
  Issue Type: Improvement
Reporter: Thom Valley


With larger and more extensively geographically distributed clusters, users are 
beginning to need the ability to reduce bandwidth consumption as much as 
possible.

With larger workloads, the limits of even large intercontinental data links 
(55MBps is pretty typical) are beginning to be stretched.

InterDC SSL is currently hard-coded to use the fastest (not highest) 
compression settings.  LZ4 is a great option, but being able to raise the 
compression level at the cost of some additional CPU may save as much as 10% 
(perhaps slightly more, depending on the data).  10% of a 55MBps link, if 
running at or near capacity, is substantial.

This also has a large impact on the overhead and rate possible for 
instantiating new DCs as well as rebuilding a DC after a failure.
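
For illustration, lz4-java already exposes both codecs, so the selection could be as small as this (the configuration knob itself is hypothetical):

{code}
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

// Sketch only: lz4-java ships a fast and a high-compression codec, so
// the inter-DC path could pick one based on a (hypothetical) yaml knob
// instead of always using the fastest settings.
final class InterDCCompression
{
    static LZ4Compressor forLevel(String level)
    {
        LZ4Factory factory = LZ4Factory.fastestInstance();
        return "high".equals(level) ? factory.highCompressor()
                                    : factory.fastCompressor();
    }
}
{code}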



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334073#comment-15334073
 ] 

Tyler Hobbs commented on CASSANDRA-11873:
-

This is a good point to take a step back and plan out the semantics of our date 
and time types more thoroughly.  I don't think we need to implement everything 
up front, but we should think about how we want the various date, time, and 
interval types to work together.

We do not currently support a datetime type with timezones.  However, it's 
certainly possible that this may be added in the future, especially if we focus 
on time series (where you may want rollups by conceptual day instead of 24-hour 
periods).  So, I think we should consider how the types might interact with a 
timezone-aware datetime.

The current {{duration}} type is similar to Java's {{Duration}} and Python's 
{{timedelta}}.  It adds a number of nanoseconds to a datetime, ignoring effects 
like daylight savings time.  On the other hand, we may also want something like 
Java's {{Period}} class, which works in terms of "conceptual" days, months and 
years.  For example, if you add a conceptual day to a datetime and it happens 
to cross the daylight savings time boundary, it would end up at the same time 
of day on the next day (instead of being off by one hour, like the equivalent 
{{duration}} addition would be).
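
A concrete java.time illustration of that distinction, using a DST boundary (US clocks sprang forward on 2016-03-13; the example is mine, not from the ticket):

{code}
import java.time.Duration;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class DstDemo
{
    public static void main(String[] args)
    {
        ZonedDateTime start = ZonedDateTime.of(2016, 3, 12, 10, 0, 0, 0,
                                               ZoneId.of("America/New_York"));
        // Duration adds exactly 24 hours: 2016-03-13T11:00-04:00
        System.out.println(start.plus(Duration.ofDays(1)));
        // Period adds one conceptual day: 2016-03-13T10:00-04:00
        System.out.println(start.plus(Period.ofDays(1)));
    }
}
{code}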

Or, we might combine these into an interval type like Postgres's {{interval}} 
that stores conceptual months and days, but also stores seconds and 
nanoseconds. This could work in a pretty straightforward way with our current 
timestamps (effectively UTC datetimes), but also work well with timezone-aware 
datetimes when those are added.  This type is certainly more complex than the 
current {{duration}} type, but I think we'll eventually need something like 
this anyway, and it's good to ask whether we also want to have a naive 
{{duration}} alongside that type.  If we introduce special syntax for 
{{duration}}, that may force future {{interval}} literals to have a more 
cumbersome syntax.  At the very least, the differences between the two may 
confuse users.

To summarize, if we want to plan for the future, it may be best to go ahead and 
implement a full {{interval}} type now that handles conceptual time units as 
well as raw seconds/nanoseconds.

bq. By consequence, it can be difficult for the driver to handle such a type.

I don't think that this should weigh heavily on how we design Cassandra's 
types.  We are already forced to implement custom types in several of the 
drivers.  For example, the python driver has custom classes for {{OrderedMap}}, 
{{SortedSet}}, {{Time}}, and {{Date}} to handle things like nested collections 
and nanosecond resolution.  These are slightly less friendly for users than 
types in the standard library, but it's fairly normal for a database driver to 
need to do this.

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11845) Hanging repair in cassandra 2.2.4

2016-06-16 Thread vin01 (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334083#comment-15334083
 ] 

vin01 commented on CASSANDRA-11845:
---

It never succeeded.

I just keep going with "nodetool repair -full -local" to minimize the 
inconsistency issues.

> Hanging repair in cassandra 2.2.4
> -
>
> Key: CASSANDRA-11845
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11845
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Centos 6
>Reporter: vin01
>Priority: Minor
> Attachments: cassandra-2.2.4.error.log
>
>
> So after increasing the streaming_timeout_in_ms value to 3 hours, I was able 
> to avoid the socketTimeout errors I was getting earlier 
> (https://issues.apache.org/jira/browse/CASSANDRA-11826), but now the issue 
> is that the repair just stays stuck.
> current status :-
> [2016-05-19 05:52:50,835] Repair session a0e590e1-1d99-11e6-9d63-b717b380ffdd 
> for range (-3309358208555432808,-3279958773585646585] finished (progress: 54%)
> [2016-05-19 05:53:09,446] Repair session a0e590e3-1d99-11e6-9d63-b717b380ffdd 
> for range (8149151263857514385,8181801084802729407] finished (progress: 55%)
> [2016-05-19 05:53:13,808] Repair session a0e5b7f1-1d99-11e6-9d63-b717b380ffdd 
> for range (3372779397996730299,3381236471688156773] finished (progress: 55%)
> [2016-05-19 05:53:27,543] Repair session a0e5b7f3-1d99-11e6-9d63-b717b380ffdd 
> for range (-4182952858113330342,-4157904914928848809] finished (progress: 55%)
> [2016-05-19 05:53:41,128] Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd 
> for range (6499366179019889198,6523760493740195344] finished (progress: 55%)
> And it's 10:46:25 now, almost 5 hours since it has been stuck right there.
> Earlier I could see repair sessions going on in system.log but there are no 
> logs coming in right now; all I get in the logs is regular index summary 
> redistribution logs.
> The last repair logs I saw :-
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,125 RepairJob.java:152 - [repair 
> #a0e5df00-1d99-11e6-9d63-b717b380ffdd] TABLE_NAME is fully synced
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairSession.java:279 - 
> [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] Session completed successfully
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairRunnable.java:232 - 
> Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range 
> (6499366179019889198,6523760493740195344] finished
> It's an incremental repair, and in the "nodetool netstats" output I can see 
> logs like :-
> Repair e3055fb0-1d9d-11e6-9d63-b717b380ffdd
> /Node-2
> Receiving 8 files, 1093461 bytes total. Already received 8 files, 
> 1093461 bytes total
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80872-big-Data.db
>  399475/399475 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80879-big-Data.db
>  53809/53809 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80878-big-Data.db
>  89955/89955 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80881-big-Data.db
>  168790/168790 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80886-big-Data.db
>  107785/107785 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80880-big-Data.db
>  52889/52889 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80884-big-Data.db
>  148882/148882 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80883-big-Data.db
>  71876/71876 bytes(100%) received from idx:0/Node-2
> Sending 5 files, 863321 bytes total. Already sent 5 files, 863321 
> bytes total
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db
>  161895/161895 bytes(100%) sent to idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db
>  399865/399865 bytes(100%) sent to idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db
>  149066/149066 bytes(100%) sent to idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe

[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334081#comment-15334081
 ] 

DOAN DuyHai commented on CASSANDRA-12015:
-

[~pauloricardomg] The problem is that this call to 
{{snitch.getSortedListByProximity(address, rangeAddresses.get(range))}} is 
inside the {{RangeStreamer}} class, which is also used for bootstrap (and maybe 
for other operations too; I did not do a comprehensive check).

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing more evenly the load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11919) Failure in nodetool decommission

2016-06-16 Thread vin01 (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334064#comment-15334064
 ] 

vin01 commented on CASSANDRA-11919:
---

Replication Factor : 2 for DC1 and 1 for DC2.

CREATE KEYSPACE KEYSPACE_NAME WITH replication = {'class': 
'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '1'}  AND durable_writes = true;

I was able to remove the node with 'removenode', but it left the cluster 
inconsistent and I had to perform a full repair on all keyspaces to fix that.

> Failure in nodetool decommission
> 
>
> Key: CASSANDRA-11919
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11919
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Centos 6.6 x86_64, Cassandra 2.2.4
>Reporter: vin01
>Priority: Minor
> Fix For: 2.2.x
>
>
> I keep getting an exception while attempting "nodetool decommission".
> {code}
> ERROR [STREAM-IN-/[NODE_ON_WHICH_DECOMMISSION_RUNNING]] 2016-05-29 
> 13:08:39,040 StreamSession.java:524 - [Stream 
> #b2039080-25c2-11e6-bd92-d71331aaf180] Streaming error occurred
> java.lang.IllegalArgumentException: Unknown type 0
> at 
> org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:96)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:57)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> {code}
> Because of these, decommission process is not succeeding.
> Is interrupting the decommission process safe? It seems like I will have to 
> retry to make it work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk

2016-06-16 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334059#comment-15334059
 ] 

Joel Knighton commented on CASSANDRA-12010:
---

It looks like the problem here is that on the [CASSANDRA-11604] commit, the 3.0 
version of the test was simply merged into trunk instead of keeping the trunk 
version of the test.

Your fix works - on your original [CASSANDRA-11604] trunk branch, you also used 
the {{beforeAndAfterFlush}} helper in the style of the rest of the tests. Let's 
update the test in this ticket to use that helper. I don't think we need to 
rerun CI; this run looked good and an updated test in that style should be 
identical to the one on [CASSANDRA-11604] for which we have CI results.

> UserTypesTest# is failing on trunk
> --
>
> Key: CASSANDRA-12010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12010
> Project: Cassandra
>  Issue Type: Test
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>
> Test failure: 
> http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/
> This was caused by the merge after 
> [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably 
> coincided with some other change, as this failure did not happen during the 
> [test run on the 
> branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11991) On clock skew, paxos may "corrupt" the node clock

2016-06-16 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-11991:
-
Status: Patch Available  (was: Open)

For context, the problem is basically the one I described in [my 
comment|https://issues.apache.org/jira/browse/CASSANDRA-9649?focusedCommentId=14601016&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601016]
 on CASSANDRA-9649 and for which I suggested reverting CASSANDRA-7801.

Now, I was kind of wrong about reverting CASSANDRA-7801: since CASSANDRA-9649 
we were relying on {{ClientState.getTimestamp()}} to give us timestamps that 
were unique for the running VM, which meant we couldn't blindly revert 
CASSANDRA-7801.

What I think is the simplest solution, however, is to stop relying on that 
property (of {{ClientState.getTimestamp()}}) for the uniqueness of our ballots, 
and instead randomize the non-timestamp parts of the ballot for every new 
ballot. With that, we don't have to revert CASSANDRA-7801; we just have to 
ensure that if we use the last known proposal timestamp (i.e. if the clock that 
generated that timestamp is "in the future"), we don't persist it in the local 
clock (this in turn means the timestamp might not be unique in the VM for 2 
concurrent paxos operations, hence the need to randomize the rest of the 
UUID).
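
Schematically, the ballot generation then looks like this (a plain-Java sketch, not the actual {{UUIDGen}} code):

{code}
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Schematic sketch: keep the time-based most-significant bits derived
// from the chosen timestamp, but draw the clock-seq-and-node half at
// random, so two ballots built from the same timestamp still differ.
final class BallotSketch
{
    static UUID randomizedTimeUUID(long timeBasedMsb)
    {
        long lsb = ThreadLocalRandom.current().nextLong();
        lsb &= 0x3FFFFFFFFFFFFFFFL; // clear the two variant bits...
        lsb |= 0x8000000000000000L; // ...then set the IETF variant (10xx)
        return new UUID(timeBasedMsb, lsb);
    }
}
{code}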

I've pushed a patch for this for 2.1. I'll attach branches for 2.2+ with tests 
tomorrow (but was waiting on the 2.1 results before doing that) but I don't 
think the modified code has changed since 2.1 so marking ready for review in 
the meantime.

| [2.1|https://github.com/pcmanus/cassandra/commits/11991-2.1] | 
[utests|http://cassci.datastax.com/job/pcmanus-11991-2.1-testall/] | 
[dtests|http://cassci.datastax.com/job/pcmanus-11991-2.1-dtest/] |


> On clock skew, paxos may "corrupt" the node clock
> -
>
> Key: CASSANDRA-11991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11991
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 2.1.x, 2.2.x, 3.0.x
>
>
> We made a mistake in CASSANDRA-9649, so that a temporal clock skew on one node 
> can "corrupt" other nodes' clocks through Paxos. That wasn't intended and we 
> should fix it. I'll attach a patch later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334052#comment-15334052
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

I will save the discussion/debate on the relationship between CQL and SQL for 
another venue. The reason to bring it up is in the context of user/developer 
experience and usability. If SQL has an approach then we should consider it, 
but if we can do better then by all means we should do that instead (which I 
think nobody is debating). 

A few comments:
1. We should certainly consider the month and year durations. These are common 
uses and we should at least sketch out how we would support that (if not also 
implement it in this ticket - which I think we should do). 
2. How would we abbreviate the example that Sylvain proposes "1 year 2 months 3 
days 4 hours 5 minutes 6 seconds"? Specifically, what is the abbreviation for 
months and minutes? ISO 8601 has M for both, but the P/T format allows for 
disambiguation. 
3. With respect to ISO 8601 that Postgres does also support, if someone bothers 
to read the CQL documentation on Date formats for Timestamp types he will find 
that it states "A timestamp type can be entered as an integer for CQL input, or 
as a string literal in any of the following ISO 8601 formats" 
(https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timestamp_type_r.html). 
So, C* already chose ISO 8601 for Date formats. For consistency with CQL itself 
we should seriously consider making the same choice for durations. 
4. According to the C* documentation, the TIMESTAMP data type, which is what is 
returned from the Now() call, is the "number of milliseconds since the standard 
base time known as the epoch". How are we going to support microseconds and 
nanoseconds? Even Version 1 UUIDs (UUID/TimeUUID format for C*) don't support 
nanosecond resolution. 
5. If we choose to stick with the current bespoke syntax, I suggest moving at 
least to the Influx format. That leaves 2 items:
a) change microseconds from "us" to "u", which is what Influx uses 
b) support weeks with the "w" abbreviation. 
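
On point 2, the ISO 8601 P/T form resolves the two Ms by position: months come before the T, minutes after it, so "1 year 2 months 3 days 4 hours 5 minutes 6 seconds" is {{P1Y2M3DT4H5M6S}}. For instance, java.time already parses the two halves separately:

{code}
import java.time.Duration;
import java.time.Period;

public class Iso8601Demo
{
    public static void main(String[] args)
    {
        // Months sit before the T, minutes after it, so the two Ms
        // never collide even though both abbreviate to M.
        System.out.println(Period.parse("P1Y2M3D"));     // date part
        System.out.println(Duration.parse("PT4H5M6S"));  // time part
    }
}
{code}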

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334041#comment-15334041
 ] 

Paulo Motta commented on CASSANDRA-12015:
-

While picking replicas from the same DC/rack is definitely useful, I'm not sure 
sorting replicas by dynamic snitch within the same rack/DC will buy us many 
benefits here for a bulk operation like streaming. A simple fix would be to use 
the current {{AbstractEndpointSnitch.sortByProximity}} instead, which only 
sorts replicas by rack/DC; that should pick the primary replica for each range 
and already yield a reasonable load distribution.

> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing more evenly the load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner

2016-06-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334010#comment-15334010
 ] 

Chris Lohfink commented on CASSANDRA-12002:
---

This was reported on the user list when someone tried to look at the sstables 
of system.batches.

+1 to the changes. I can add some unit tests.
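
For reference, the shape of the fix is roughly this (a sketch, not the actual 
patch; it assumes the tools can detect the LocalPartitioner class name from 
the validation metadata and rebuild one from the table's key type):

{code}
import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.dht.IPartitioner;
import org.apache.cassandra.dht.LocalPartitioner;
import org.apache.cassandra.utils.FBUtilities;

public class PartitionerLookup
{
    // FBUtilities.newPartitioner resolves a partitioner by class name, which
    // cannot work for LocalPartitioner (it has no shared instance and needs a
    // comparator), so special-case it using the sstable's key validator.
    static IPartitioner forSSTable(String partitionerClass, AbstractType<?> keyValidator) throws Exception
    {
        if (partitionerClass.endsWith("LocalPartitioner"))
            return new LocalPartitioner(keyValidator);
        return FBUtilities.newPartitioner(partitionerClass);
    }
}
{code}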

> SSTable tools mishandling LocalPartitioner
> --
>
> Key: CASSANDRA-12002
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12002
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSADNRA-12002.txt
>
>
> The sstabledump and sstablemetadata tools use FBUtilities.newPartitioner 
> with the name of the partitioner from the validation component. This fails 
> on sstables that are created with anything that uses the LocalPartitioner 
> (secondary indexes, and the system.batches table). sstabledump had a check 
> for secondary indexes but still failed for the system table; the metadata 
> tool failed for all of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11960) Hints are not seekable

2016-06-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-11960:
---
Status: Awaiting Feedback  (was: In Progress)

> Hints are not seekable
> --
>
> Key: CASSANDRA-11960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11960
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Stefan Podkowinski
>
> Got the following error message on trunk. No idea how to reproduce. But the 
> only thing the (not overridden) seek method does is throwing this exception.
> {code}
> ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - 
> Exception in thread Thread[HintsDispatcher:2,1,main]
> java.lang.UnsupportedOperationException: Hints are not seekable.
>   at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11960) Hints are not seekable

2016-06-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333995#comment-15333995
 ] 

Stefan Podkowinski commented on CASSANDRA-11960:


I've now created a patch that moves away from file-offset-based retries and 
instead replays the whole page. As described above, the 
{{RebufferingInputStream}} data input doesn't provide a way to seek to an 
offset. Although this should be possible to implement, I think those changes 
should be considered more carefully, as they would have to be made in the 
common io.utils code. Maybe we should open a separate ticket for that?

Although replaying a complete page isn't optimal, as we'll deliver duplicate 
hints, we don't guarantee at-most-once semantics for hints anyway. This is not 
great for non-idempotent operations such as list appends (counters are not 
hinted), but the current implementation is clearly broken, so we have to do 
something about it. I'm open to ideas on how to further optimize this.
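
Conceptually, the retry bookkeeping changes roughly as follows (an 
illustrative sketch only, not the patch itself; the page-index naming is an 
assumption):

{code}
// Sketch: retry at page granularity instead of remembering a byte offset.
public class HintsReplayBookmark
{
    private long lastDeliveredPage = -1; // index of the last fully delivered page

    // On failure we simply do not advance the bookmark, so the next dispatch
    // attempt re-reads the whole current page. Hints inside that page may
    // therefore be delivered more than once (the duplicates noted above).
    public void onPageDelivered(long pageIndex)
    {
        lastDeliveredPage = pageIndex;
    }

    public long nextPageToRead()
    {
        return lastDeliveredPage + 1;
    }
}
{code}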



> Hints are not seekable
> --
>
> Key: CASSANDRA-11960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11960
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Stefan Podkowinski
>
> Got the following error message on trunk. No idea how to reproduce. But the 
> only thing the (not overridden) seek method does is throwing this exception.
> {code}
> ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - 
> Exception in thread Thread[HintsDispatcher:2,1,main]
> java.lang.UnsupportedOperationException: Hints are not seekable.
>   at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12016) Create MessagingService mocking classes

2016-06-16 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-12016:
---
Status: Awaiting Feedback  (was: In Progress)

> Create MessagingService mocking classes
> ---
>
> Key: CASSANDRA-12016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12016
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Testing
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Interactions between clients and nodes in the cluster take place by 
> exchanging messages through the {{MessagingService}}. Black-box testing for 
> message-based systems is usually pretty easy, as we're just dealing with 
> messages in/out. My suggestion is to add tests that exploit this fact by 
> mocking message exchanges via MessagingService. Given the right use case, 
> this could be a much simpler and more efficient alternative to dtests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12016) Create MessagingService mocking classes

2016-06-16 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333973#comment-15333973
 ] 

Stefan Podkowinski commented on CASSANDRA-12016:


Please find the suggested implementation in the linked WIP branch. An example 
of what a unit test using those classes looks like can be found 
[here|https://github.com/spodkowinski/cassandra/blob/3cd4ef203cd147713a6f8c4b1466703436124e0b/test/unit/org/apache/cassandra/hints/HintsServiceTest.java].
I'm looking forward to any feedback.
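
The intended test style looks something like this (a sketch; the 
MockMessagingService/MockMessagingSpy helper names and matchers below are 
assumptions modeled on the linked WIP branch, not a committed API):

{code}
import org.junit.Test;

import org.apache.cassandra.net.MessagingService;

// MockMessagingService, MockMessagingSpy, verb(...), respond(...) and the
// hintResponse()/triggerHintDelivery() helpers are all assumptions here.
public class HintDispatchMockTest
{
    @Test
    public void hintIsDispatchedExactlyOnce() throws Exception
    {
        // Intercept outgoing HINT-verb messages instead of opening real
        // connections, and fake the peer's acknowledgement.
        MockMessagingSpy spy = MockMessagingService
                .when(verb(MessagingService.Verb.HINT))
                .respond(hintResponse());

        triggerHintDelivery(); // hypothetical driver for the code under test

        // Black-box assertion: exactly one hint message left the node.
        spy.expectMessageCount(1);
    }
}
{code}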


> Create MessagingService mocking classes
> ---
>
> Key: CASSANDRA-12016
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12016
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Testing
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Interactions between clients and nodes in the cluster take place by 
> exchanging messages through the {{MessagingService}}. Black-box testing for 
> message-based systems is usually pretty easy, as we're just dealing with 
> messages in/out. My suggestion is to add tests that exploit this fact by 
> mocking message exchanges via MessagingService. Given the right use case, 
> this could be a much simpler and more efficient alternative to dtests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Add cross-DC latency metrics

2016-06-16 Thread carl
Repository: cassandra
Updated Branches:
  refs/heads/trunk e31e21623 -> 04afa2bf5


Add cross-DC latency metrics

Patch by Chris Lohfink, reviewed by Carl Yeksigian for CASSANDRA-11596


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/04afa2bf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/04afa2bf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/04afa2bf

Branch: refs/heads/trunk
Commit: 04afa2bf52ce6f5a534323678defd625dca67336
Parents: e31e216
Author: Chris Lohfink 
Authored: Wed May 18 16:00:04 2016 -0500
Committer: Carl Yeksigian 
Committed: Thu Jun 16 10:49:24 2016 -0400

--
 CHANGES.txt |  1 +
 .../cassandra/metrics/MessagingMetrics.java | 59 +++
 .../cassandra/net/IncomingTcpConnection.java|  2 +-
 .../org/apache/cassandra/net/MessageIn.java |  9 ++-
 .../apache/cassandra/net/MessagingService.java  |  3 +
 .../cassandra/net/MessagingServiceTest.java | 62 +++-
 6 files changed, 131 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 9c44a63..08b5e4a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.8
+ * Add cross-DC latency metrics (CASSANDRA-11596)
  * Allow terms in selection clause (CASSANDRA-10783)
  * Add bind variables to trace (CASSANDRA-11719)
  * Switch counter shards' clock to timestamps (CASSANDRA-9811)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/src/java/org/apache/cassandra/metrics/MessagingMetrics.java
--
diff --git a/src/java/org/apache/cassandra/metrics/MessagingMetrics.java 
b/src/java/org/apache/cassandra/metrics/MessagingMetrics.java
new file mode 100644
index 000..e126c93
--- /dev/null
+++ b/src/java/org/apache/cassandra/metrics/MessagingMetrics.java
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.metrics;
+
+import java.net.InetAddress;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.codahale.metrics.Timer;
+
+import static org.apache.cassandra.metrics.CassandraMetricsRegistry.Metrics;
+
+/**
+ * Metrics for messages
+ */
+public class MessagingMetrics
+{
+private static Logger logger = 
LoggerFactory.getLogger(MessagingMetrics.class);
+private static final MetricNameFactory factory = new 
DefaultNameFactory("Messaging");
+public final Timer crossNodeLatency;
+public final ConcurrentHashMap<String, Timer> dcLatency;
+
+public MessagingMetrics()
+{
+crossNodeLatency = 
Metrics.timer(factory.createMetricName("CrossNodeLatency"));
+dcLatency = new ConcurrentHashMap<>();
+}
+
+public void addTimeTaken(InetAddress from, long timeTaken)
+{
+String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(from);
+Timer timer = dcLatency.get(dc);
+if (timer == null)
+{
+timer = dcLatency.computeIfAbsent(dc, k -> 
Metrics.timer(factory.createMetricName(dc + "-Latency")));
+}
+timer.update(timeTaken, TimeUnit.MILLISECONDS);
+crossNodeLatency.update(timeTaken, TimeUnit.MILLISECONDS);
+}
+}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
--
diff --git a/src/java/org/apache/cassandra/net/IncomingTcpConnection.java 
b/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
index 2a09bf4..9e8e2e1 100644
--- a/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
+++ b/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
@@ -187,7 +187,7 @@ public class IncomingTc

[jira] [Updated] (CASSANDRA-11569) Track message latency across DCs

2016-06-16 Thread Carl Yeksigian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Yeksigian updated CASSANDRA-11569:
---
   Resolution: Fixed
Fix Version/s: 3.8
   Status: Resolved  (was: Patch Available)

+1. Thanks, [~cnlwsu]!

Committed as 
[04afa2b|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=04afa2bf52ce6f5a534323678defd625dca67336].

> Track message latency across DCs
> 
>
> Key: CASSANDRA-11569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11569
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 3.8
>
> Attachments: CASSANDRA-11569.patch, CASSANDRA-11569v2.txt, 
> nodeLatency.PNG
>
>
> Since we have the timestamp at which a message is created and the time at 
> which it arrives, we can get an approximate transit time relatively easily, 
> removing the need for more complex hacks to determine latency between DCs.
> Although this is not going to be very meaningful when NTP is not set up, it 
> is pretty common to have NTP configured, and even with clock drift nothing 
> is really hurt except the metric becoming whacky.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels

2016-06-16 Thread Randy Fradin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333886#comment-15333886
 ] 

Randy Fradin commented on CASSANDRA-8119:
-

I think taking a map of DC -> level pushes too much complexity to the client, 
and also lacks some of the flexibility we're looking for. For example, one use 
case is to do a QUORUM operation across nodes in a subset of data centers 
(where the data centers involved depend on where the coordinator is located). 
Another use case is to do "uneven" quorums, e.g. hit somewhere between (n/2)+1 
and n-1 replicas on write and between (n+1)/2 and 2 on read, or vice versa 
(which is useful when the distance between replicas is not uniform and the 
number of replicas may not be an odd number).

The interface Tyler describes allows for that level of flexibility. Putting it 
in CQL makes it simple for operators to define, deploy, and view the custom 
CLs. A custom "strategy" class approach provides a similar level of 
flexibility but would be more cumbersome for operators.
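
To make the first use case concrete: a cross-DC quorum cannot be decomposed 
into per-DC levels, because the acks are counted across the union of replicas. 
A purely illustrative data shape (all names invented):

{code}
import java.util.Set;

// "At least minAcks acks, counted across all replicas in these DCs,
// regardless of how the acks are distributed among the DCs."
public class CrossDcQuorum
{
    final Set<String> datacenters;
    final int minAcks;

    CrossDcQuorum(Set<String> datacenters, int replicasPerDc)
    {
        this.datacenters = datacenters;
        int n = datacenters.size() * replicasPerDc;
        this.minAcks = n / 2 + 1;
    }
}
{code}

A {DC: CL} map can only express "quorum in dc1 AND quorum in dc2", which 
constrains where the acks must come from; a single quorum over the union 
accepts any majority, however it is distributed.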

> More Expressive Consistency Levels
> --
>
> Key: CASSANDRA-8119
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8119
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Tyler Hobbs
> Fix For: 3.x
>
>
> For some multi-datacenter environments, the current set of consistency levels 
> are too restrictive.  For example, the following consistency requirements 
> cannot be expressed:
> * LOCAL_QUORUM in two specific DCs
> * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC
> * LOCAL_QUORUM in the local DC plus N remote replicas in any DC
> I propose that we add a new consistency level: CUSTOM.  In the v4 (or v5) 
> protocol, this would be accompanied by an additional map argument.  A map of 
> {DC: CL} or a map of {DC: int} is sufficient to cover the first example.  If 
> we accept a special keys to represent "any datacenter", the second case can 
> be handled.  A similar technique could be used for "any other nodes".
> I'm not in love with the special keys, so if anybody has ideas for something 
> more elegant, feel free to propose them.  The main idea is that we want to be 
> flexible enough to cover any reasonable consistency or durability 
> requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12016) Create MessagingService mocking classes

2016-06-16 Thread Stefan Podkowinski (JIRA)
Stefan Podkowinski created CASSANDRA-12016:
--

 Summary: Create MessagingService mocking classes
 Key: CASSANDRA-12016
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12016
 Project: Cassandra
  Issue Type: New Feature
  Components: Testing
Reporter: Stefan Podkowinski
Assignee: Stefan Podkowinski


Interactions between clients and nodes in the cluster take place by exchanging 
messages through the {{MessagingService}}. Black-box testing for message-based 
systems is usually pretty easy, as we're just dealing with messages in/out. My 
suggestion is to add tests that exploit this fact by mocking message exchanges 
via MessagingService. Given the right use case, this could be a much simpler 
and more efficient alternative to dtests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-16 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333832#comment-15333832
 ] 

Joshua McKenzie edited comment on CASSANDRA-8844 at 6/16/16 2:03 PM:
-

Switching between C# and Java everyday has its costs.

Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and 
Deprecation), and 
[committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=e31e216234c6b57a531cae607e0355666007deb2].

Thanks for the assist [~carlyeks] and [~blambov]!

I'll be creating a follow-up meta ticket w/subtasks from all the stuff that 
came up here that we deferred and link that to this ticket, as well as moving 
the link to CASSANDRA-11957 over there.


was (Author: joshuamckenzie):
Switching between C# and Java everyday has its costs.

Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and 
Deprecation), and 
[committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=5dcab286ca0fcd9a71e28dad805f028362572e21].

Thanks for the assist [~carlyeks] and [~blambov]!

I'll be creating a follow-up meta ticket w/subtasks from all the stuff that 
came up here that we deferred and link that to this ticket, as well as moving 
the link to CASSANDRA-11957 over there.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be bui

[2/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java 
b/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
deleted file mode 100644
index 17980de..000
--- a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
+++ /dev/null
@@ -1,355 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.cassandra.db.commitlog;
-
-import java.io.IOException;
-import java.nio.ByteBuffer;
-import java.util.Iterator;
-import java.util.zip.CRC32;
-import javax.crypto.Cipher;
-
-import com.google.common.annotations.VisibleForTesting;
-import com.google.common.collect.AbstractIterator;
-
-import 
org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream.ChunkProvider;
-import org.apache.cassandra.io.FSReadError;
-import org.apache.cassandra.io.compress.ICompressor;
-import org.apache.cassandra.io.util.FileDataInput;
-import org.apache.cassandra.io.util.FileSegmentInputStream;
-import org.apache.cassandra.io.util.RandomAccessReader;
-import org.apache.cassandra.schema.CompressionParams;
-import org.apache.cassandra.security.EncryptionUtils;
-import org.apache.cassandra.security.EncryptionContext;
-import org.apache.cassandra.utils.ByteBufferUtil;
-
-import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.SYNC_MARKER_SIZE;
-import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
-
-/**
- * Read each sync section of a commit log, iteratively.
- */
-public class SegmentReader implements Iterable<SyncSegment>
-{
-private final CommitLogDescriptor descriptor;
-private final RandomAccessReader reader;
-private final Segmenter segmenter;
-private final boolean tolerateTruncation;
-
-/**
- * ending position of the current sync section.
- */
-protected int end;
-
-protected SegmentReader(CommitLogDescriptor descriptor, RandomAccessReader 
reader, boolean tolerateTruncation)
-{
-this.descriptor = descriptor;
-this.reader = reader;
-this.tolerateTruncation = tolerateTruncation;
-
-end = (int) reader.getFilePointer();
-if (descriptor.getEncryptionContext().isEnabled())
-segmenter = new EncryptedSegmenter(reader, descriptor);
-else if (descriptor.compression != null)
-segmenter = new CompressedSegmenter(descriptor, reader);
-else
-segmenter = new NoOpSegmenter(reader);
-}
-
-public Iterator<SyncSegment> iterator()
-{
-return new SegmentIterator();
-}
-
-protected class SegmentIterator extends AbstractIterator<SyncSegment>
-{
-protected SyncSegment computeNext()
-{
-while (true)
-{
-try
-{
-final int currentStart = end;
-end = readSyncMarker(descriptor, currentStart, reader);
-if (end == -1)
-{
-return endOfData();
-}
-if (end > reader.length())
-{
-// the CRC was good (meaning it was good when it was 
written and still looks legit), but the file is truncated now.
-// try to grab and use as much of the file as 
possible, which might be nothing if the end of the file truly is corrupt
-end = (int) reader.length();
-}
-
-return segmenter.nextSegment(currentStart + 
SYNC_MARKER_SIZE, end);
-}
-catch(SegmentReader.SegmentReadException e)
-{
-try
-{
-CommitLogReplayer.handleReplayError(!e.invalidCrc && 
tolerateTruncation, e.getMessage());
-}
-catch (IOException ioe)
-{
-throw new RuntimeException(ioe);
-}
-}
-catch (IOException e)
-{
-try
-  

[4/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java 
b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
index 4a660ca..b1f48b2 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
@@ -22,34 +22,35 @@ import java.lang.management.ManagementFactory;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.zip.CRC32;
-
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
 
 import com.google.common.annotations.VisibleForTesting;
-
+import org.apache.commons.lang3.StringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.commons.lang3.StringUtils;
-
 import org.apache.cassandra.config.Config;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.ParameterizedClass;
 import org.apache.cassandra.db.*;
+import org.apache.cassandra.exceptions.WriteTimeoutException;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.io.compress.ICompressor;
 import org.apache.cassandra.io.util.BufferedDataOutputStreamPlus;
 import org.apache.cassandra.io.util.DataOutputBufferFixed;
+import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.metrics.CommitLogMetrics;
 import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.security.EncryptionContext;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.JVMStabilityInspector;
 
-import static org.apache.cassandra.db.commitlog.CommitLogSegment.*;
+import static org.apache.cassandra.db.commitlog.CommitLogSegment.Allocation;
+import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.CommitLogSegmentFileComparator;
+import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.ENTRY_OVERHEAD_SIZE;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksum;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
 
@@ -65,19 +66,19 @@ public class CommitLog implements CommitLogMBean
 
 // we only permit records HALF the size of a commit log, to ensure we 
don't spin allocating many mostly
 // empty segments when writing large records
-private final long MAX_MUTATION_SIZE = 
DatabaseDescriptor.getMaxMutationSize();
+final long MAX_MUTATION_SIZE = DatabaseDescriptor.getMaxMutationSize();
+
+final public AbstractCommitLogSegmentManager segmentManager;
 
-public final CommitLogSegmentManager allocator;
 public final CommitLogArchiver archiver;
 final CommitLogMetrics metrics;
 final AbstractCommitLogService executor;
 
 volatile Configuration configuration;
-final public String location;
 
 private static CommitLog construct()
 {
-CommitLog log = new 
CommitLog(DatabaseDescriptor.getCommitLogLocation(), 
CommitLogArchiver.construct());
+CommitLog log = new CommitLog(CommitLogArchiver.construct());
 
 MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
 try
@@ -92,9 +93,8 @@ public class CommitLog implements CommitLogMBean
 }
 
 @VisibleForTesting
-CommitLog(String location, CommitLogArchiver archiver)
+CommitLog(CommitLogArchiver archiver)
 {
-this.location = location;
 this.configuration = new 
Configuration(DatabaseDescriptor.getCommitLogCompression(),

DatabaseDescriptor.getEncryptionContext());
 DatabaseDescriptor.createAllDirectories();
@@ -106,16 +106,17 @@ public class CommitLog implements CommitLogMBean
 ? new BatchCommitLogService(this)
 : new PeriodicCommitLogService(this);
 
-allocator = new CommitLogSegmentManager(this);
-
+segmentManager = DatabaseDescriptor.isCDCEnabled()
+ ? new CommitLogSegmentManagerCDC(this, 
DatabaseDescriptor.getCommitLogLocation())
+ : new CommitLogSegmentManagerStandard(this, 
DatabaseDescriptor.getCommitLogLocation());
 // register metrics
-metrics.attach(executor, allocator);
+metrics.attach(executor, segmentManager);
 }
 
 CommitLog start()
 {
 executor.start();
-allocator.start();
+segmentManager.start();
 return this;
 }
 
@@ -123,11 +124,12 @@ public class CommitLog implements CommitLogMBean
  * Perform recovery on commit logs located in the directory specified by 
the config file.
  *
  * @return the number of mutations replayed
+ * @throws IOException
  */
-public int recover() throws IO

[5/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
Add Change Data Capture

Patch by jmckenzie; reviewed by cyeksigian and blambov for CASSANDRA-8844


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e31e2162
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e31e2162
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e31e2162

Branch: refs/heads/trunk
Commit: e31e216234c6b57a531cae607e0355666007deb2
Parents: ed538f9
Author: Josh McKenzie 
Authored: Sun Mar 27 09:20:47 2016 -0400
Committer: Josh McKenzie 
Committed: Thu Jun 16 10:01:39 2016 -0400

--
 CHANGES.txt |   1 +
 NEWS.txt|  32 +-
 build.xml   |  51 +-
 conf/cassandra.yaml |  26 +
 pylib/cqlshlib/cql3handling.py  |   5 +-
 src/antlr/Parser.g  |   3 +-
 .../org/apache/cassandra/config/Config.java |   6 +
 .../cassandra/config/DatabaseDescriptor.java|  86 ++-
 .../statements/CreateKeyspaceStatement.java |   1 +
 .../cql3/statements/DropKeyspaceStatement.java  |   2 +-
 .../cql3/statements/TableAttributes.java|   3 +
 .../apache/cassandra/db/ColumnFamilyStore.java  |  67 +--
 .../org/apache/cassandra/db/Directories.java|  51 +-
 src/java/org/apache/cassandra/db/Keyspace.java  |  17 +-
 src/java/org/apache/cassandra/db/Memtable.java  |  47 +-
 src/java/org/apache/cassandra/db/Mutation.java  |  18 +
 .../org/apache/cassandra/db/SystemKeyspace.java |  26 +-
 src/java/org/apache/cassandra/db/WriteType.java |   3 +-
 .../AbstractCommitLogSegmentManager.java| 584 +++
 .../db/commitlog/AbstractCommitLogService.java  |   3 +-
 .../cassandra/db/commitlog/CommitLog.java   | 157 +++--
 .../db/commitlog/CommitLogPosition.java | 121 
 .../db/commitlog/CommitLogReadHandler.java  |  76 +++
 .../cassandra/db/commitlog/CommitLogReader.java | 501 
 .../db/commitlog/CommitLogReplayer.java | 582 +++---
 .../db/commitlog/CommitLogSegment.java  | 110 ++--
 .../db/commitlog/CommitLogSegmentManager.java   | 567 --
 .../commitlog/CommitLogSegmentManagerCDC.java   | 302 ++
 .../CommitLogSegmentManagerStandard.java|  89 +++
 .../db/commitlog/CommitLogSegmentReader.java| 366 
 .../db/commitlog/CompressedSegment.java |  12 +-
 .../db/commitlog/EncryptedSegment.java  |  18 +-
 .../db/commitlog/FileDirectSegment.java |  73 +--
 .../db/commitlog/MemoryMappedSegment.java   |   6 +-
 .../cassandra/db/commitlog/ReplayPosition.java  | 178 --
 .../cassandra/db/commitlog/SegmentReader.java   | 355 ---
 .../db/commitlog/SimpleCachedBufferPool.java| 118 
 .../apache/cassandra/db/lifecycle/Tracker.java  |   8 +-
 .../apache/cassandra/db/view/TableViews.java|   4 +-
 .../apache/cassandra/db/view/ViewManager.java   |   2 -
 .../io/sstable/format/SSTableReader.java|   1 -
 .../metadata/LegacyMetadataSerializer.java  |  12 +-
 .../io/sstable/metadata/MetadataCollector.java  |  16 +-
 .../io/sstable/metadata/StatsMetadata.java  |  24 +-
 .../cassandra/metrics/CommitLogMetrics.java |   9 +-
 .../apache/cassandra/schema/SchemaKeyspace.java |   6 +-
 .../apache/cassandra/schema/TableParams.java|  23 +-
 .../cassandra/service/CassandraDaemon.java  |   4 +-
 .../cassandra/streaming/StreamReceiveTask.java  |  36 +-
 .../utils/DirectorySizeCalculator.java  |  98 
 .../cassandra/utils/JVMStabilityInspector.java  |   3 +-
 .../cassandra/utils/memory/BufferPool.java  |   2 +-
 test/conf/cassandra-murmur.yaml |   2 +
 test/conf/cassandra.yaml|   2 +
 test/conf/cdc.yaml  |   1 +
 test/data/bloom-filter/ka/foo.cql   |   2 +-
 .../db/commitlog/CommitLogStressTest.java   | 123 ++--
 .../test/microbench/DirectorySizerBench.java| 105 
 .../OffsetAwareConfigurationLoader.java |  13 +-
 .../cassandra/batchlog/BatchlogManagerTest.java |   4 +-
 .../apache/cassandra/cql3/CDCStatementTest.java |  50 ++
 .../org/apache/cassandra/cql3/CQLTester.java|   4 +
 .../apache/cassandra/cql3/OutOfSpaceTest.java   |   2 +-
 .../cql3/validation/operations/CreateTest.java  |   5 +-
 .../apache/cassandra/db/ReadMessageTest.java|  10 +-
 .../db/commitlog/CommitLogReaderTest.java   | 267 +
 .../CommitLogSegmentManagerCDCTest.java | 220 +++
 .../commitlog/CommitLogSegmentManagerTest.java  |  23 +-
 .../cassandra/db/commitlog/CommitLogTest.java   | 130 +++--
 .../db/commitlog/CommitLogTestReplayer.java |  59 +-
 .../db/commitlog/CommitLogUpgradeTest.java  |  18 +-
 .../db/commitlog/CommitLogUpgradeTestMaker.java |   4 +-
 .../db/commitlog/SegmentReaderTest

[2/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java 
b/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
deleted file mode 100644
index 17980de..000
--- a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
+++ /dev/null
@@ -1,355 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.cassandra.db.commitlog;
-
-import java.io.IOException;
-import java.nio.ByteBuffer;
-import java.util.Iterator;
-import java.util.zip.CRC32;
-import javax.crypto.Cipher;
-
-import com.google.common.annotations.VisibleForTesting;
-import com.google.common.collect.AbstractIterator;
-
-import 
org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream.ChunkProvider;
-import org.apache.cassandra.io.FSReadError;
-import org.apache.cassandra.io.compress.ICompressor;
-import org.apache.cassandra.io.util.FileDataInput;
-import org.apache.cassandra.io.util.FileSegmentInputStream;
-import org.apache.cassandra.io.util.RandomAccessReader;
-import org.apache.cassandra.schema.CompressionParams;
-import org.apache.cassandra.security.EncryptionUtils;
-import org.apache.cassandra.security.EncryptionContext;
-import org.apache.cassandra.utils.ByteBufferUtil;
-
-import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.SYNC_MARKER_SIZE;
-import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
-
-/**
- * Read each sync section of a commit log, iteratively.
- */
-public class SegmentReader implements Iterable<SyncSegment>
-{
-private final CommitLogDescriptor descriptor;
-private final RandomAccessReader reader;
-private final Segmenter segmenter;
-private final boolean tolerateTruncation;
-
-/**
- * ending position of the current sync section.
- */
-protected int end;
-
-protected SegmentReader(CommitLogDescriptor descriptor, RandomAccessReader 
reader, boolean tolerateTruncation)
-{
-this.descriptor = descriptor;
-this.reader = reader;
-this.tolerateTruncation = tolerateTruncation;
-
-end = (int) reader.getFilePointer();
-if (descriptor.getEncryptionContext().isEnabled())
-segmenter = new EncryptedSegmenter(reader, descriptor);
-else if (descriptor.compression != null)
-segmenter = new CompressedSegmenter(descriptor, reader);
-else
-segmenter = new NoOpSegmenter(reader);
-}
-
-public Iterator<SyncSegment> iterator()
-{
-return new SegmentIterator();
-}
-
-protected class SegmentIterator extends AbstractIterator<SyncSegment>
-{
-protected SyncSegment computeNext()
-{
-while (true)
-{
-try
-{
-final int currentStart = end;
-end = readSyncMarker(descriptor, currentStart, reader);
-if (end == -1)
-{
-return endOfData();
-}
-if (end > reader.length())
-{
-// the CRC was good (meaning it was good when it was 
written and still looks legit), but the file is truncated now.
-// try to grab and use as much of the file as 
possible, which might be nothing if the end of the file truly is corrupt
-end = (int) reader.length();
-}
-
-return segmenter.nextSegment(currentStart + 
SYNC_MARKER_SIZE, end);
-}
-catch(SegmentReader.SegmentReadException e)
-{
-try
-{
-CommitLogReplayer.handleReplayError(!e.invalidCrc && 
tolerateTruncation, e.getMessage());
-}
-catch (IOException ioe)
-{
-throw new RuntimeException(ioe);
-}
-}
-catch (IOException e)
-{
-try
-  

[4/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java 
b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
index 4a660ca..b1f48b2 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
@@ -22,34 +22,35 @@ import java.lang.management.ManagementFactory;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.zip.CRC32;
-
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
 
 import com.google.common.annotations.VisibleForTesting;
-
+import org.apache.commons.lang3.StringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.commons.lang3.StringUtils;
-
 import org.apache.cassandra.config.Config;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.ParameterizedClass;
 import org.apache.cassandra.db.*;
+import org.apache.cassandra.exceptions.WriteTimeoutException;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.io.compress.ICompressor;
 import org.apache.cassandra.io.util.BufferedDataOutputStreamPlus;
 import org.apache.cassandra.io.util.DataOutputBufferFixed;
+import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.metrics.CommitLogMetrics;
 import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.security.EncryptionContext;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.JVMStabilityInspector;
 
-import static org.apache.cassandra.db.commitlog.CommitLogSegment.*;
+import static org.apache.cassandra.db.commitlog.CommitLogSegment.Allocation;
+import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.CommitLogSegmentFileComparator;
+import static 
org.apache.cassandra.db.commitlog.CommitLogSegment.ENTRY_OVERHEAD_SIZE;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksum;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
 
@@ -65,19 +66,19 @@ public class CommitLog implements CommitLogMBean
 
 // we only permit records HALF the size of a commit log, to ensure we 
don't spin allocating many mostly
 // empty segments when writing large records
-private final long MAX_MUTATION_SIZE = 
DatabaseDescriptor.getMaxMutationSize();
+final long MAX_MUTATION_SIZE = DatabaseDescriptor.getMaxMutationSize();
+
+final public AbstractCommitLogSegmentManager segmentManager;
 
-public final CommitLogSegmentManager allocator;
 public final CommitLogArchiver archiver;
 final CommitLogMetrics metrics;
 final AbstractCommitLogService executor;
 
 volatile Configuration configuration;
-final public String location;
 
 private static CommitLog construct()
 {
-CommitLog log = new 
CommitLog(DatabaseDescriptor.getCommitLogLocation(), 
CommitLogArchiver.construct());
+CommitLog log = new CommitLog(CommitLogArchiver.construct());
 
 MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
 try
@@ -92,9 +93,8 @@ public class CommitLog implements CommitLogMBean
 }
 
 @VisibleForTesting
-CommitLog(String location, CommitLogArchiver archiver)
+CommitLog(CommitLogArchiver archiver)
 {
-this.location = location;
 this.configuration = new 
Configuration(DatabaseDescriptor.getCommitLogCompression(),

DatabaseDescriptor.getEncryptionContext());
 DatabaseDescriptor.createAllDirectories();
@@ -106,16 +106,17 @@ public class CommitLog implements CommitLogMBean
 ? new BatchCommitLogService(this)
 : new PeriodicCommitLogService(this);
 
-allocator = new CommitLogSegmentManager(this);
-
+segmentManager = DatabaseDescriptor.isCDCEnabled()
+ ? new CommitLogSegmentManagerCDC(this, 
DatabaseDescriptor.getCommitLogLocation())
+ : new CommitLogSegmentManagerStandard(this, 
DatabaseDescriptor.getCommitLogLocation());
 // register metrics
-metrics.attach(executor, allocator);
+metrics.attach(executor, segmentManager);
 }
 
 CommitLog start()
 {
 executor.start();
-allocator.start();
+segmentManager.start();
 return this;
 }
 
@@ -123,11 +124,12 @@ public class CommitLog implements CommitLogMBean
  * Perform recovery on commit logs located in the directory specified by 
the config file.
  *
  * @return the number of mutations replayed
+ * @throws IOException
  */
-public int recover() throws IO

[1/5] cassandra git commit: Add Change Data Capture [Forced Update!]

2016-06-16 Thread jmckenzie
Repository: cassandra
Updated Branches:
  refs/heads/trunk 5dcab286c -> e31e21623 (forced update)


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java 
b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
new file mode 100644
index 000..edff3b7
--- /dev/null
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.commitlog;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.ColumnDefinition;
+import org.apache.cassandra.config.Config;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.cql3.CQLTester;
+import org.apache.cassandra.cql3.ColumnIdentifier;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.Mutation;
+import org.apache.cassandra.db.partitions.PartitionUpdate;
+import org.apache.cassandra.db.rows.Row;
+import org.apache.cassandra.utils.JVMStabilityInspector;
+import org.apache.cassandra.utils.KillerForTests;
+
+public class CommitLogReaderTest extends CQLTester
+{
+@BeforeClass
+public static void beforeClass()
+{
+
DatabaseDescriptor.setCommitFailurePolicy(Config.CommitFailurePolicy.ignore);
+JVMStabilityInspector.replaceKiller(new KillerForTests(false));
+}
+
+@Before
+public void before() throws IOException
+{
+CommitLog.instance.resetUnsafe(true);
+}
+
+@Test
+public void testReadAll() throws Throwable
+{
+int samples = 1000;
+populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+
+TestCLRHandler testHandler = new 
TestCLRHandler(currentTableMetadata());
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, 
CommitLogReader.ALL_MUTATIONS, false);
+
+Assert.assertEquals("Expected 1000 seen mutations, got: " + 
testHandler.seenMutationCount(),
+1000, testHandler.seenMutationCount());
+
+confirmReadOrder(testHandler, 0);
+}
+
+@Test
+public void testReadCount() throws Throwable
+{
+int samples = 50;
+int readCount = 10;
+populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+TestCLRHandler testHandler = new TestCLRHandler();
+
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, readCount - 
testHandler.seenMutationCount(), false);
+
+Assert.assertEquals("Expected " + readCount + " seen mutations, got: " 
+ testHandler.seenMutations.size(),
+readCount, testHandler.seenMutationCount());
+}
+
+@Test
+public void testReadFromMidpoint() throws Throwable
+{
+int samples = 1000;
+int readCount = 500;
+CommitLogPosition midpoint = populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+TestCLRHandler testHandler = new TestCLRHandler();
+
+// Will skip on incorrect segments due to id mismatch on midpoint
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, midpoint, readCount, 
false);
+
+// Confirm correct count on replay
+Assert.assertEquals("Expected " + readCount + " seen mutations, got: " 
+ testHandler.seenMutations.size(),
+readCount, testHandler.seenMutationCount());
+
+confirmReadOrder(testHandler, samples / 2);
+}
+
+@Test
+public void testReadFromMidpointTooMany() throws Throwable
+{
+int samples

[3/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java 
b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index 2045c35..2e97fd5 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -22,34 +22,22 @@ import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.channels.FileChannel;
 import java.nio.file.StandardOpenOption;
-import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.UUID;
+import java.util.*;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.zip.CRC32;
 
-import com.codahale.metrics.Timer;
-
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.cassandra.config.CFMetaData;
-import org.apache.cassandra.config.DatabaseDescriptor;
-import org.apache.cassandra.config.Schema;
+import com.codahale.metrics.Timer;
+import org.apache.cassandra.config.*;
 import org.apache.cassandra.db.Mutation;
 import org.apache.cassandra.db.commitlog.CommitLog.Configuration;
 import org.apache.cassandra.db.partitions.PartitionUpdate;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.utils.CLibrary;
 import org.apache.cassandra.utils.concurrent.OpOrder;
 import org.apache.cassandra.utils.concurrent.WaitQueue;
@@ -66,6 +54,14 @@ public abstract class CommitLogSegment
 private static final Logger logger = 
LoggerFactory.getLogger(CommitLogSegment.class);
 
 private final static long idBase;
+
+private CDCState cdcState = CDCState.PERMITTED;
+public enum CDCState {
+PERMITTED,
+FORBIDDEN,
+CONTAINS
+}
+
 private final static AtomicInteger nextId = new AtomicInteger(1);
 private static long replayLimitId;
 static
@@ -115,18 +111,20 @@ public abstract class CommitLogSegment
 final FileChannel channel;
 final int fd;
 
+protected final AbstractCommitLogSegmentManager manager;
+
 ByteBuffer buffer;
 private volatile boolean headerWritten;
 
 final CommitLog commitLog;
 public final CommitLogDescriptor descriptor;
 
-static CommitLogSegment createSegment(CommitLog commitLog, Runnable 
onClose)
+static CommitLogSegment createSegment(CommitLog commitLog, 
AbstractCommitLogSegmentManager manager, Runnable onClose)
 {
 Configuration config = commitLog.configuration;
-CommitLogSegment segment = config.useEncryption() ? new 
EncryptedSegment(commitLog, onClose)
-  : 
config.useCompression() ? new CompressedSegment(commitLog, onClose)
-   
 : new MemoryMappedSegment(commitLog);
+CommitLogSegment segment = config.useEncryption() ? new 
EncryptedSegment(commitLog, manager, onClose)
+  : 
config.useCompression() ? new CompressedSegment(commitLog, manager, onClose)
+   
 : new MemoryMappedSegment(commitLog, manager);
 segment.writeLogHeader();
 return segment;
 }
@@ -151,14 +149,16 @@ public abstract class CommitLogSegment
 /**
  * Constructs a new segment file.
  */
-CommitLogSegment(CommitLog commitLog)
+CommitLogSegment(CommitLog commitLog, AbstractCommitLogSegmentManager 
manager)
 {
 this.commitLog = commitLog;
+this.manager = manager;
+
 id = getNextId();
 descriptor = new CommitLogDescriptor(id,
  
commitLog.configuration.getCompressorClass(),
  
commitLog.configuration.getEncryptionContext());
-logFile = new File(commitLog.location, descriptor.fileName());
+logFile = new File(manager.storageDirectory, descriptor.fileName());
 
 try
 {
@@ -369,22 +369,11 @@ public abstract class CommitLogSegment
 }
 
 /**
- * Completely discards a segment file by deleting it. (Potentially 
blocking operation)
- */
-void discard(boolean deleteFile)
-{
-close();
-if (deleteFile)
-FileUtils.deleteWithConfirm(logFile);
-commitLog.allocator.addSize(-onDiskSize());
-}
-
-/**
- * @return the current ReplayPosition for this log segment
+ * @retur

[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8844:
---
   Resolution: Fixed
Fix Version/s: (was: 3.x)
   3.8
   Status: Resolved  (was: Ready to Commit)

Switching between C# and Java everyday has its costs.

Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and 
Deprecation), and 
[committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=5dcab286ca0fcd9a71e28dad805f028362572e21].

Thanks for the assist [~carlyeks] and [~blambov]!

I'll be creating a follow-up meta ticket w/subtasks from all the stuff that 
came up here that we deferred and link that to this ticket, as well as moving 
the link to CASSANDRA-11957 over there.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
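
As a rough sketch of the kind of consumer daemon the description envisions (the 
{{cdc_raw}} directory, segment suffix and checkpoint file name below are made 
up for the example):

{code}
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public final class CdcTailer
{
    public static void main(String[] args) throws IOException
    {
        Path cdcDir = Paths.get(args.length > 0 ? args[0] : "cdc_raw");
        Path checkpoint = cdcDir.resolve("consumer.checkpoint");
        String lastDone = Files.exists(checkpoint)
                        ? new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8).trim()
                        : "";

        File[] segments = cdcDir.toFile().listFiles((dir, name) -> name.endsWith(".log"));
        if (segments == null)
            return;
        Arrays.sort(segments); // predictable names => written order

        for (File segment : segments)
        {
            if (segment.getName().compareTo(lastDone) <= 0)
                continue; // consumed before a restart; skip
            process(segment);
            // Checkpoint after each segment so a crash resumes here.
            Files.write(checkpoint, segment.getName().getBytes(StandardCharsets.UTF_8));
        }
    }

    private static void process(File segment)
    {
        // A real daemon would decode mutations; the sketch just reports.
        System.out.println("consumed " + segment.getName());
    }
}
{code}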

[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-16 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8844:
---
Status: Ready to Commit  (was: Patch Available)

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - Instead of writing to a logfile, by default, Cassandra could expose a 
> socket for a daemon to connect to, and

[1/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
Repository: cassandra
Updated Branches:
  refs/heads/trunk ed538f90e -> 5dcab286c


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java 
b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
new file mode 100644
index 000..edff3b7
--- /dev/null
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.commitlog;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.ColumnDefinition;
+import org.apache.cassandra.config.Config;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.cql3.CQLTester;
+import org.apache.cassandra.cql3.ColumnIdentifier;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.Mutation;
+import org.apache.cassandra.db.partitions.PartitionUpdate;
+import org.apache.cassandra.db.rows.Row;
+import org.apache.cassandra.utils.JVMStabilityInspector;
+import org.apache.cassandra.utils.KillerForTests;
+
+public class CommitLogReaderTest extends CQLTester
+{
+@BeforeClass
+public static void beforeClass()
+{
+
DatabaseDescriptor.setCommitFailurePolicy(Config.CommitFailurePolicy.ignore);
+JVMStabilityInspector.replaceKiller(new KillerForTests(false));
+}
+
+@Before
+public void before() throws IOException
+{
+CommitLog.instance.resetUnsafe(true);
+}
+
+@Test
+public void testReadAll() throws Throwable
+{
+int samples = 1000;
+populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+
+TestCLRHandler testHandler = new 
TestCLRHandler(currentTableMetadata());
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, 
CommitLogReader.ALL_MUTATIONS, false);
+
+Assert.assertEquals("Expected 1000 seen mutations, got: " + 
testHandler.seenMutationCount(),
+1000, testHandler.seenMutationCount());
+
+confirmReadOrder(testHandler, 0);
+}
+
+@Test
+public void testReadCount() throws Throwable
+{
+int samples = 50;
+int readCount = 10;
+populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+TestCLRHandler testHandler = new TestCLRHandler();
+
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, readCount - 
testHandler.seenMutationCount(), false);
+
+Assert.assertEquals("Expected " + readCount + " seen mutations, got: " 
+ testHandler.seenMutations.size(),
+readCount, testHandler.seenMutationCount());
+}
+
+@Test
+public void testReadFromMidpoint() throws Throwable
+{
+int samples = 1000;
+int readCount = 500;
+CommitLogPosition midpoint = populateData(samples);
+ArrayList<File> toCheck = getCommitLogs();
+
+CommitLogReader reader = new CommitLogReader();
+TestCLRHandler testHandler = new TestCLRHandler();
+
+// Will skip on incorrect segments due to id mismatch on midpoint
+for (File f : toCheck)
+reader.readCommitLogSegment(testHandler, f, midpoint, readCount, 
false);
+
+// Confirm correct count on replay
+Assert.assertEquals("Expected " + readCount + " seen mutations, got: " 
+ testHandler.seenMutations.size(),
+readCount, testHandler.seenMutationCount());
+
+confirmReadOrder(testHandler, samples / 2);
+}
+
+@Test
+public void testReadFromMidpointTooMany() throws Throwable
+{
+int samples = 1000;
+  

[5/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
Add Change Data Capture

Patch by jmckenzie; reviewed by cyeksigian and blambov for CASSANDRA-8844


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5dcab286
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5dcab286
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5dcab286

Branch: refs/heads/trunk
Commit: 5dcab286ca0fcd9a71e28dad805f028362572e21
Parents: ed538f9
Author: Josh McKenzie 
Authored: Sun Mar 27 09:20:47 2016 -0400
Committer: Josh McKenzie 
Committed: Thu Jun 16 09:53:49 2016 -0400

--
 CHANGES.txt |   1 +
 NEWS.txt|  32 +-
 build.xml   |  51 +-
 conf/cassandra.yaml |  26 +
 pylib/cqlshlib/cql3handling.py  |   5 +-
 src/antlr/Parser.g  |   3 +-
 .../org/apache/cassandra/config/Config.java |   6 +
 .../cassandra/config/DatabaseDescriptor.java|  86 ++-
 .../statements/CreateKeyspaceStatement.java |   1 +
 .../cql3/statements/DropKeyspaceStatement.java  |   2 +-
 .../cql3/statements/TableAttributes.java|   3 +
 .../apache/cassandra/db/ColumnFamilyStore.java  |  67 +--
 .../org/apache/cassandra/db/Directories.java|  51 +-
 src/java/org/apache/cassandra/db/Keyspace.java  |  17 +-
 src/java/org/apache/cassandra/db/Memtable.java  |  47 +-
 src/java/org/apache/cassandra/db/Mutation.java  |  18 +
 .../org/apache/cassandra/db/SystemKeyspace.java |  26 +-
 src/java/org/apache/cassandra/db/WriteType.java |   3 +-
 .../AbstractCommitLogSegmentManager.java| 584 +++
 .../db/commitlog/AbstractCommitLogService.java  |   3 +-
 .../cassandra/db/commitlog/CommitLog.java   | 157 +++--
 .../db/commitlog/CommitLogPosition.java | 121 
 .../db/commitlog/CommitLogReadHandler.java  |  76 +++
 .../cassandra/db/commitlog/CommitLogReader.java | 501 
 .../db/commitlog/CommitLogReplayer.java | 582 +++---
 .../db/commitlog/CommitLogSegment.java  | 110 ++--
 .../db/commitlog/CommitLogSegmentManager.java   | 567 --
 .../commitlog/CommitLogSegmentManagerCDC.java   | 302 ++
 .../CommitLogSegmentManagerStandard.java|  89 +++
 .../db/commitlog/CommitLogSegmentReader.java| 366 
 .../db/commitlog/CompressedSegment.java |  12 +-
 .../db/commitlog/EncryptedSegment.java  |  18 +-
 .../db/commitlog/FileDirectSegment.java |  73 +--
 .../db/commitlog/MemoryMappedSegment.java   |   6 +-
 .../cassandra/db/commitlog/ReplayPosition.java  | 178 --
 .../cassandra/db/commitlog/SegmentReader.java   | 355 ---
 .../db/commitlog/SimpleCachedBufferPool.java| 118 
 .../apache/cassandra/db/lifecycle/Tracker.java  |   8 +-
 .../apache/cassandra/db/view/TableViews.java|   4 +-
 .../apache/cassandra/db/view/ViewManager.java   |   2 -
 .../io/sstable/format/SSTableReader.java|   1 -
 .../metadata/LegacyMetadataSerializer.java  |  12 +-
 .../io/sstable/metadata/MetadataCollector.java  |  16 +-
 .../io/sstable/metadata/StatsMetadata.java  |  24 +-
 .../cassandra/metrics/CommitLogMetrics.java |   9 +-
 .../apache/cassandra/schema/SchemaKeyspace.java |   6 +-
 .../apache/cassandra/schema/TableParams.java|  23 +-
 .../cassandra/service/CassandraDaemon.java  |   4 +-
 .../cassandra/streaming/StreamReceiveTask.java  |  36 +-
 .../utils/DirectorySizeCalculator.java  |  98 
 .../cassandra/utils/JVMStabilityInspector.java  |   3 +-
 .../cassandra/utils/memory/BufferPool.java  |   2 +-
 test/conf/cassandra-murmur.yaml |   2 +
 test/conf/cassandra.yaml|   2 +
 test/conf/cdc.yaml  |   1 +
 test/data/bloom-filter/ka/foo.cql   |   2 +-
 .../db/commitlog/CommitLogStressTest.java   | 123 ++--
 .../test/microbench/DirectorySizerBench.java| 105 
 .../OffsetAwareConfigurationLoader.java |  13 +-
 .../cassandra/batchlog/BatchlogManagerTest.java |   4 +-
 .../apache/cassandra/cql3/CDCStatementTest.java |  50 ++
 .../org/apache/cassandra/cql3/CQLTester.java|   4 +
 .../apache/cassandra/cql3/OutOfSpaceTest.java   |   2 +-
 .../cql3/validation/operations/CreateTest.java  |   5 +-
 .../apache/cassandra/db/ReadMessageTest.java|  10 +-
 .../db/commitlog/CommitLogReaderTest.java   | 267 +
 .../CommitLogSegmentManagerCDCTest.java | 220 +++
 .../commitlog/CommitLogSegmentManagerTest.java  |  23 +-
 .../cassandra/db/commitlog/CommitLogTest.java   | 130 +++--
 .../db/commitlog/CommitLogTestReplayer.java |  59 +-
 .../db/commitlog/CommitLogUpgradeTest.java  |  18 +-
 .../db/commitlog/CommitLogUpgradeTestMaker.java |   4 +-
 .../db/commitlog/SegmentReaderTest

[3/5] cassandra git commit: Add Change Data Capture

2016-06-16 Thread jmckenzie
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java 
b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index 2045c35..2e97fd5 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -22,34 +22,22 @@ import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.channels.FileChannel;
 import java.nio.file.StandardOpenOption;
-import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.UUID;
+import java.util.*;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.zip.CRC32;
 
-import com.codahale.metrics.Timer;
-
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.cassandra.config.CFMetaData;
-import org.apache.cassandra.config.DatabaseDescriptor;
-import org.apache.cassandra.config.Schema;
+import com.codahale.metrics.Timer;
+import org.apache.cassandra.config.*;
 import org.apache.cassandra.db.Mutation;
 import org.apache.cassandra.db.commitlog.CommitLog.Configuration;
 import org.apache.cassandra.db.partitions.PartitionUpdate;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.utils.CLibrary;
 import org.apache.cassandra.utils.concurrent.OpOrder;
 import org.apache.cassandra.utils.concurrent.WaitQueue;
@@ -66,6 +54,14 @@ public abstract class CommitLogSegment
 private static final Logger logger = 
LoggerFactory.getLogger(CommitLogSegment.class);
 
 private final static long idBase;
+
+private CDCState cdcState = CDCState.PERMITTED;
+public enum CDCState {
+PERMITTED,
+FORBIDDEN,
+CONTAINS
+}
+
 private final static AtomicInteger nextId = new AtomicInteger(1);
 private static long replayLimitId;
 static
@@ -115,18 +111,20 @@ public abstract class CommitLogSegment
 final FileChannel channel;
 final int fd;
 
+protected final AbstractCommitLogSegmentManager manager;
+
 ByteBuffer buffer;
 private volatile boolean headerWritten;
 
 final CommitLog commitLog;
 public final CommitLogDescriptor descriptor;
 
-static CommitLogSegment createSegment(CommitLog commitLog, Runnable 
onClose)
+static CommitLogSegment createSegment(CommitLog commitLog, 
AbstractCommitLogSegmentManager manager, Runnable onClose)
 {
 Configuration config = commitLog.configuration;
-CommitLogSegment segment = config.useEncryption() ? new 
EncryptedSegment(commitLog, onClose)
-  : 
config.useCompression() ? new CompressedSegment(commitLog, onClose)
-   
 : new MemoryMappedSegment(commitLog);
+CommitLogSegment segment = config.useEncryption() ? new 
EncryptedSegment(commitLog, manager, onClose)
+  : 
config.useCompression() ? new CompressedSegment(commitLog, manager, onClose)
+   
 : new MemoryMappedSegment(commitLog, manager);
 segment.writeLogHeader();
 return segment;
 }
@@ -151,14 +149,16 @@ public abstract class CommitLogSegment
 /**
  * Constructs a new segment file.
  */
-CommitLogSegment(CommitLog commitLog)
+CommitLogSegment(CommitLog commitLog, AbstractCommitLogSegmentManager 
manager)
 {
 this.commitLog = commitLog;
+this.manager = manager;
+
 id = getNextId();
 descriptor = new CommitLogDescriptor(id,
  
commitLog.configuration.getCompressorClass(),
  
commitLog.configuration.getEncryptionContext());
-logFile = new File(commitLog.location, descriptor.fileName());
+logFile = new File(manager.storageDirectory, descriptor.fileName());
 
 try
 {
@@ -369,22 +369,11 @@ public abstract class CommitLogSegment
 }
 
 /**
- * Completely discards a segment file by deleting it. (Potentially 
blocking operation)
- */
-void discard(boolean deleteFile)
-{
-close();
-if (deleteFile)
-FileUtils.deleteWithConfirm(logFile);
-commitLog.allocator.addSize(-onDiskSize());
-}
-
-/**
- * @return the current ReplayPosition for this log segment
+ * @retur

[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-16 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333774#comment-15333774
 ] 

Branimir Lambov commented on CASSANDRA-8844:


+1, with a final rename nit: 
[{{Allocation.GetCommitLogPosition}}|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-7720d4b5123a354876e0b3139222f34eR669]
 is in PascalCase.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above

[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable

2016-06-16 Thread Giampaolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333742#comment-15333742
 ] 

Giampaolo commented on CASSANDRA-11516:
---

I'm studying how to solve this issue. A quick question: do you mean to add a 
configuration option for [this 
line|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L62]
 using {{newFixedThreadPool}} and defaulting to 
{{FBUtilities#getAvailableProcessors}}?
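
To make the idea concrete, here is a minimal sketch of what such a lever could 
look like. The {{cassandra.stream_receive_threads}} property name is 
hypothetical; an actual patch would presumably wire the setting through 
{{DatabaseDescriptor}} and cassandra.yaml instead:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class StreamReceivePool
{
    private StreamReceivePool() {}

    // Hypothetical knob; defaults to the number of cores, which matches
    // today's implicit behavior.
    private static int maxStreamThreads()
    {
        String prop = System.getProperty("cassandra.stream_receive_threads");
        return prop != null ? Integer.parseInt(prop)
                            : Runtime.getRuntime().availableProcessors();
    }

    // A fixed-size pool bounds the CPU spent on stream-receive work (e.g.
    // IntervalTree builds), unlike an unbounded pool that grows with the
    // number of incoming streams.
    public static final ExecutorService EXECUTOR =
        Executors.newFixedThreadPool(maxStreamThreads());
}
{code}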

> Make max number of streams configurable
> ---
>
> Key: CASSANDRA-11516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11516
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>  Labels: lhf
>
> Today we default to num cores. In large boxes (many cores), this is 
> suboptimal as it can generate huge amounts of garbage that GC can't keep up 
> with.
> Usually we tackle issues like this with the streaming throughput levers but 
> in this case the problem is CPU consumption by StreamReceiverTasks 
> specifically in the IntervalTree build -- 
> https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257
> We need a max number of parallel streams lever to handle this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333706#comment-15333706
 ] 

Benjamin Lerer commented on CASSANDRA-11873:


bq. What about leap year?

I think that point is worth some discussion.
The current patch stores the duration as a number of nanoseconds, which means 
that some information will be lost. If a user provides {{3y}} or {{3 year}}, it 
will be converted to nanoseconds, and {{now() - 3y}} will not result in the 
correct date. We can try to guess what the user intended, but it is a risky 
business.
If we want to handle things like that properly, we have to use a more complex 
serialization format. Basically, we need to store at least {{year}} and 
{{month}} separately from the remaining time in nanoseconds (which is, I guess, 
the main reason why InfluxDB does not support the month and year units).

Even if it allows better handling of some use cases, I think that this 
solution will bring some problems. {{Java}}, for example, does not have a type 
that can be directly mapped to that (if I am not mistaken): it has 2 different 
classes, {{Period}} for the date part and {{Duration}} for the time part. As a 
consequence, it can be difficult for the driver to handle such a type.
I also believe (even if I do not have concrete proof right now) that it will 
make some computations, like the one needed for CASSANDRA-11871, more 
expensive.

Overall, I am in favor of keeping things as simple as possible, which for me 
means: storing the duration as nanoseconds, supporting as literals only a 
number followed by a symbol (in this first version at least), and not 
supporting {{month}} or {{year}} units (the current patch does not support 
{{week}}, but it can easily be added).

Having said that, I am fully open to discussion.
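
For illustration only (this is not the format of the current patch), a 
(months, days, nanoseconds) triple is one possible shape for such a richer 
representation; the calendar-dependent part stays symbolic and is only 
resolved against a date at evaluation time:

{code}
import java.time.LocalDateTime;

public final class CqlDuration
{
    public final int months;       // years folded into months (1y = 12mo)
    public final int days;
    public final long nanoseconds; // exact sub-day part

    public CqlDuration(int months, int days, long nanoseconds)
    {
        this.months = months;
        this.days = days;
        this.nanoseconds = nanoseconds;
    }

    // Subtracting needs a calendar: the length of a month or a year
    // depends on the starting date (leap years, 28-31 day months).
    public LocalDateTime subtractFrom(LocalDateTime t)
    {
        return t.minusMonths(months).minusDays(days).minusNanos(nanoseconds);
    }
}
{code}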

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333698#comment-15333698
 ] 

DOAN DuyHai commented on CASSANDRA-12015:
-

Here is the important code path:

1) org.apache.cassandra.tools.nodetool.Rebuild::execute()
2) StorageService::rebuild(String sourceDc, String keyspace, String tokens)
3) RangeStreamer::getAllRangesWithSourcesFor(String keyspaceName, 
Collection<Range<Token>> desiredRanges)

Inside the last method, we call the snitch to sort the replicas:

  List<InetAddress> preferred = snitch.getSortedListByProximity(address, 
rangeAddresses.get(range));

If you're rebuilding the nodes in the new DC with the "nodetool rebuild" 
command in quick succession, it may happen that one replica has better latency 
than the others, so it will always be the one picked by the DynamicSnitch.
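
A minimal sketch of the alternative: assign each range to whichever of its 
live replicas has been given the fewest ranges so far, instead of always 
taking the snitch's first pick ({{Range}} is simplified to a plain String key 
here, so this only illustrates the selection policy):

{code}
import java.net.InetAddress;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class RebuildSources
{
    // replicasPerRange stands in for RangeStreamer's range -> replicas map.
    public static Map<String, InetAddress> pickSources(Map<String, List<InetAddress>> replicasPerRange)
    {
        Map<String, InetAddress> chosen = new HashMap<>();
        Map<InetAddress, Integer> assigned = new HashMap<>();
        for (Map.Entry<String, List<InetAddress>> e : replicasPerRange.entrySet())
        {
            InetAddress best = null;
            for (InetAddress candidate : e.getValue())
                if (best == null || assigned.getOrDefault(candidate, 0) < assigned.getOrDefault(best, 0))
                    best = candidate;
            if (best == null)
                continue; // no live replica for this range
            chosen.put(e.getKey(), best);
            assigned.merge(best, 1, Integer::sum); // balance the streaming load
        }
        return chosen;
    }
}
{code}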



> Rebuilding from another DC should use different sources
> ---
>
> Key: CASSANDRA-12015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing 
> DC (ex: DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, 
> only one node in DC1 is streaming the data to DC2. 
> To build the new DC in a reasonable time, it would be better, in that case, 
> to stream from multiple sources, thus distributing more evenly the load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo

2016-06-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333615#comment-15333615
 ] 

Sylvain Lebresne commented on CASSANDRA-8700:
-

I included the pull requests made so far on [my 
branch|https://github.com/pcmanus/cassandra/commits/doc_in_tree] (cherry-picked 
because I like my branches like my complexities: linear). Note that I'm in the 
middle of fixing the CQL doc, so that's why it looks bad currently (I'm taking 
the time to add missing parts and reorganize things a bit so it's taking a bit 
of time).

On the cqlsh doc though, I wonder if it's a good idea to include the 
description of the command line options, and even of the special commands? 
Feels like we'll easily forget to update it, and it doesn't seem to add a lot 
of value over getting the help from cqlsh directly. Maybe we could just point 
to how to get said help (just mentioning that you should use {{cqlsh -h}} for 
command line options and that there is a HELP command within cqlsh)? Or maybe 
we can have all that generated from cqlsh code directly so things stay in sync 
automatically?

> replace the wiki with docs in the git repo
> --
>
> Key: CASSANDRA-8700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8700
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 3.8
>
> Attachments: TombstonesAndGcGrace.md, bloom_filters.md, 
> compression.md, contributing.zip, getting_started.zip, hardware.md
>
>
> The wiki as it stands is pretty terrible.  It takes several minutes to apply 
> a single update, and as a result, it's almost never updated.  The information 
> there has very little context as to what version it applies to.  Most people 
> I've talked to that try to use the information they find there find it is 
> more confusing than helpful.
> I'd like to propose that instead of using the wiki, the doc directory in the 
> cassandra repo be used for docs (already used for CQL3 spec) in a format that 
> can be built to a variety of output formats like HTML / epub / etc.  I won't 
> start the bikeshedding on which markup format is preferable - but there are 
> several options that can work perfectly fine.  I've personally use sphinx w/ 
> restructured text, and markdown.  Both can build easily and as an added bonus 
> be pushed to readthedocs (or something similar) automatically.  For an 
> example, see cqlengine's documentation, which I think is already 
> significantly better than the wiki: 
> http://cqlengine.readthedocs.org/en/latest/
> In addition to being overall easier to maintain, putting the documentation in 
> the git repo adds context, since it evolves with the versions of Cassandra.
> If the wiki were kept even remotely up to date, I wouldn't bother with this, 
> but not having at least some basic documentation in the repo, or anywhere 
> associated with the project, is frustrating.
> For reference, the last 3 updates were:
> 1/15/15 - updating committers list
> 1/08/15 - updating contributers and how to contribute
> 12/16/14 - added a link to CQL docs from wiki frontpage (by me)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333585#comment-15333585
 ] 

Sylvain Lebresne commented on CASSANDRA-11873:
--

For the record, CQL is not SQL and it's not even close. Artificially forcing 
ourselves to reuse something existing in SQL *every single time* we need new 
syntax is largely pointless. Anyone trying to use CQL as if it were SQL is 
going to have a bad surprise, and small syntax differences are going to be the 
least of their problems.

Don't get me wrong, CQL has the same _general_ structure as SQL, and so 
informing our choices with what SQL (and popular SQL databases) is doing and 
borrowing good ideas is certainly desirable. But that's only the beginning of 
the conversation, not the end (even more so when said existing SQL databases 
don't even agree between themselves). If we think an existing syntax is not 
particularly good and we can do better, for instance, why would we pick a 
lesser solution?

And in that particular case, I'm _convinced_ that the syntax currently 
implemented is better than what Postgres or Oracle do (I reckon that such a 
statement is partly subjective, but I still stand by it). Certainly not a lot 
better, granted, but better because it is as intuitive as any of the options 
while being more concise. For that reason, count me as a PMC-binding -1 on 
*not* supporting it. 
That said, I'm not against compromises, so please read below before answering.

bq. Of the formats I've seen here, Postgres native format is the most user 
friendly

And by "Postgres native format", you mean {{1 year 2 months 3 days 4 hours 5 
minutes 6 seconds}} right? If so (and as mentioned previously), I don't really 
mind supporting that (I guess for the sake of making the lives of Postgres 
developers easier, or pleasing those that want to show off their touch-typing 
skills). I don't mind it as long as we also support the shorter version 
(because, if I don't care about Postgres, why wouldn't I be allowed to 
abbreviate the units? It surely is pretty natural).
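
To make the comparison concrete, here is a rough sketch of parsing the compact 
form; the unit symbols and their values below are illustrative, not 
necessarily the exact set the patch accepts:

{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DurationLiterals
{
    private static final Map<String, Long> NANOS_PER_UNIT = new LinkedHashMap<>();
    static
    {
        NANOS_PER_UNIT.put("d", 86_400_000_000_000L);
        NANOS_PER_UNIT.put("h",  3_600_000_000_000L);
        NANOS_PER_UNIT.put("m",     60_000_000_000L);
        NANOS_PER_UNIT.put("s",      1_000_000_000L);
        NANOS_PER_UNIT.put("ms",         1_000_000L);
        NANOS_PER_UNIT.put("us",             1_000L);
        NANOS_PER_UNIT.put("ns",                 1L);
    }

    // Multi-character symbols first so "ms" is not read as "m" plus junk.
    private static final Pattern COMPONENT = Pattern.compile("(\\d+)(ms|us|ns|d|h|m|s)");

    // Parses e.g. "1d12h30m" into nanoseconds; rejects unknown units.
    public static long parse(String literal)
    {
        Matcher m = COMPONENT.matcher(literal);
        long nanos = 0;
        int consumed = 0;
        while (m.find())
        {
            nanos += Long.parseLong(m.group(1)) * NANOS_PER_UNIT.get(m.group(2));
            consumed += m.group().length();
        }
        if (consumed != literal.length())
            throw new IllegalArgumentException("Unable to parse duration: " + literal);
        return nanos;
    }
}
{code}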

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333529#comment-15333529
 ] 

 Brian Hess edited comment on CASSANDRA-11873 at 6/16/16 10:27 AM:
---

Being subtly different on syntax is in some cases worse than being very 
different. So, if we are thinking we will go with ISO 8601 format (an option 
that could make sense - it is a widely recognized format and present in more 
than a few systems (not just databases, I mean)) then we should make sure we 
include the "P" and the "T". 

While Postgres does support ISO 8601 formats (of course I bothered to read it), 
in that format the highest resolution is seconds. There is a good reason to 
want milliseconds and microseconds (and maybe nanoseconds). The standard 
Postgres format supports all of these (with the exception of nanoseconds, though 
that addition to their format would be straightforward to understand). If you 
want to shorten the Postgres format to save typing, what abbreviation do you 
propose for "minute" and "month"?

I will certainly agree that the Oracle syntax is not user-friendly. I think 
arguing it is desirable is a stretch. 

I have a reservation on the Influx syntax here, though. Influx does not support 
month or year. They only have up to week 
(https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time).
 So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 
year, what would you do? "now() - 365d"? What about leap year? What about going 
back one month?
In fact, if one bothers to read 
https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#time-syntax-in-queries,
 he would find that this patch only had a subset of Influx's supported format. 
I don't see a week unit. Moreover, Influx doesn't use "us", it uses just "u". 
So, our proposed syntax isn't even consistent (in subtle ways) with Influx's 
format. Let alone that Influx's format is incomplete (specifically, no support 
for months and years). 

Of the formats I've seen here, Postgres native format is the most user 
friendly, and accomplishes the goals of durations for us. I'm (non-PMC, 
non-binding) -1 on the currently proposed format from a 
usability/product/developer POV. 


was (Author: brianmhess):
Being subtly different on syntax is in some cases worse than being very 
different. So, if we are thinking we will go with ISO 8601 format (an option 
that could make sense - it is a widely recognized format and present in more 
than a few systems (not just databases, I mean)) then we should make sure we 
include the "P" and the "T". 

While Postgres does support ISO 8601 formats (of course I bothered to read it), 
in that format the highest resolution is seconds. There is a good reason to 
want milliseconds and microseconds (and maybe nanoseconds). The standard 
Postgres format supports all of these (with the exception of nanoseconds, though 
that addition to their format would be straightforward to understand). If you 
want to shorten the Postgres format to save typing, what abbreviation do you 
propose for "minute" and "month"?

I will certainly agree that the Oracle syntax is not user-friendly. I think 
arguing it is desirable is a stretch. 

I have a reservation on the Influx syntax here, though. Influx does not support 
month or year. They only have up to week 
(https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time).
 So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 
year, what would you do? "now() - 365d"? What about leap year? What about going 
back one month?
In fact, this patch only had a subset of Influx's supported format. I don't see 
a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our 
proposed syntax isn't even consistent (in subtle ways) with Influx's format. 
Let alone that Influx's format is incomplete (specifically, no support for 
months and years). 

Of the formats I've seen here, Postgres native format is the most user 
friendly, and accomplishes the goals of durations for us. I'm (non-PMC, 
non-binding) -1 on the currently proposed format from a 
usability/product/developer POV. 

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This 

[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333529#comment-15333529
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

Being subtly different on syntax is in some cases worse than being very 
different. So, if we are thinking we will go with ISO 8601 format (an option 
that could make sense - it is a widely recognized format and present in more 
than a few systems (not just databases, I mean)) then we should make sure we 
include the "P" and the "T". 

While Postgres does support ISO 8601 formats (of course I bothered to read it), 
in that format the highest resolution is seconds. There is a good reason to 
want milliseconds and microseconds (and maybe nanoseconds). The standard 
Postgres format supports all of these (with the exception of nanoseconds, though 
that addition to their format would be straightforward to understand). If you 
want to shorten the Postgres format to save typing, what abbreviation do you 
propose for "minute" and "month"?

I will certainly agree that the Oracle syntax is not user-friendly. I think 
arguing it is desirable is a stretch. 

I have a reservation on the Influx syntax here, though. Influx does not support 
month or year. They only have up to week 
(https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time).
 So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 
year, what would you do? "now() - 365d"? What about leap year? What about going 
back one month?
In fact, this patch only had a subset of Influx's supported format. I don't see 
a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our 
proposed syntax isn't even consistent (in subtle ways) with Influx's format. 
Let alone that Influx's format is incomplete (specifically, no support for 
months and years). 

Of the formats I've seen here, Postgres native format is the most user 
friendly, and accomplishes the goals of durations for us. I'm (non-PMC, 
non-binding) -1 on the currently proposed format from a 
usability/product/developer POV. 

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11349) MerkleTree mismatch when multiple range tombstones exists for the same partition and interval

2016-06-16 Thread Fabien Rousseau (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333457#comment-15333457
 ] 

Fabien Rousseau commented on CASSANDRA-11349:
-

Just to let you know that we packaged the patch done by Branimir (as it is the 
one that has the best chance of being included upstream).

We restored one cluster (3 nodes, 100GB of data per node, the affected table 
is 25GB) from a snapshot on new hardware, and did a full repair. So far, so 
good: not many differences were found for the affected table, but this was 
expected because repairs had not been run for a few months (around a hundred 
vs a few hundred thousand before).

We will continue testing by recreating all of our clusters, and then deploy it 
to our production (and I'll let you know once this is done).
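
For anyone following along, a toy illustration (plain JDK code, not Cassandra 
internals) of why the duplicates break validation: hashing two identical range 
tombstones one after the other yields a different digest than hashing the 
single tombstone they compact into, even though both represent the same 
deletion:

{code}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public final class MerkleMismatchDemo
{
    public static void main(String[] args) throws Exception
    {
        byte[] rt = "RT[b..c]@ts=42".getBytes(StandardCharsets.UTF_8);

        MessageDigest notCompacted = MessageDigest.getInstance("MD5");
        notCompacted.update(rt); // first range tombstone
        notCompacted.update(rt); // identical duplicate, same interval

        MessageDigest compacted = MessageDigest.getInstance("MD5");
        compacted.update(rt);    // a node that already merged them sees one

        // Prints false: the trees disagree although the data is equivalent.
        System.out.println(Arrays.equals(notCompacted.digest(), compacted.digest()));
    }
}
{code}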

> MerkleTree mismatch when multiple range tombstones exists for the same 
> partition and interval
> -
>
> Key: CASSANDRA-11349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11349
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Fabien Rousseau
>Assignee: Stefan Podkowinski
>  Labels: repair
> Fix For: 2.1.x, 2.2.x
>
> Attachments: 11349-2.1-v2.patch, 11349-2.1-v3.patch, 
> 11349-2.1-v4.patch, 11349-2.1.patch, 11349-2.2-v4.patch
>
>
> We observed that repair, for some of our clusters, streamed a lot of data and 
> many partitions were "out of sync".
> Moreover, the read repair mismatch ratio is around 3% on those clusters, 
> which is really high.
> After investigation, it appears that, if two range tombstones exist for a 
> partition for the same range/interval, they're both included in the merkle 
> tree computation.
> But, if for some reason, on another node, the two range tombstones were 
> already compacted into a single range tombstone, this will result in a merkle 
> tree difference.
> Currently, this is clearly bad because MerkleTree differences are dependent 
> on compactions (and if a partition is deleted and created multiple times, the 
> only way to ensure that repair "works correctly"/"don't overstream data" is 
> to major compact before each repair... which is not really feasible).
> Below is a list of steps to easily reproduce this case:
> {noformat}
> ccm create test -v 2.1.13 -n 2 -s
> ccm node1 cqlsh
> CREATE KEYSPACE test_rt WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> USE test_rt;
> CREATE TABLE IF NOT EXISTS table1 (
> c1 text,
> c2 text,
> c3 float,
> c4 float,
> PRIMARY KEY ((c1), c2)
> );
> INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 2);
> DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b';
> ctrl ^d
> # now flush only one of the two nodes
> ccm node1 flush 
> ccm node1 cqlsh
> USE test_rt;
> INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 3);
> DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b';
> ctrl ^d
> ccm node1 repair
> # now grep the log and observe that some inconsistencies were detected 
> between nodes (while it shouldn't have detected any)
> ccm node1 showlog | grep "out of sync"
> {noformat}
> Consequences of this are a costly repair, accumulating many small SSTables 
> (up to thousands for a rather short period of time when using VNodes, the 
> time for compaction to absorb those small files), but also an increased size 
> on disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.

2016-06-16 Thread Jaroslav Kamenik (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaroslav Kamenik updated CASSANDRA-11726:
-
Reproduced In: 3.7, 3.5  (was: 3.5)

> IndexOutOfBoundsException when selecting (distinct) row ids from counter 
> table.
> ---
>
> Key: CASSANDRA-11726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11726
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 3.5, cluster of 4 nodes.
>Reporter: Jaroslav Kamenik
>
> I have a simple table containing counters:
> {code}
> CREATE TABLE tablename (
> object_id ascii,
> counter_id ascii,
> count counter,
> PRIMARY KEY (object_id, counter_id)
> ) WITH CLUSTERING ORDER BY (counter_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'enabled': 'false'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> Counters are often incremented/decremented; whole rows are queried and 
> sometimes deleted.
> After some time I tried to query all object_ids, but it failed with:
> {code}
> cqlsh:woc> consistency quorum;
> cqlsh:woc> select object_id from tablename;
> ServerError:  message="java.lang.IndexOutOfBoundsException">
> {code}
> select * from ..., select where ..., and updates work well.
> With consistency one it works sometimes, so it seems something is broken on 
> one server, but I tried to repair the table there and it did not help. 
> Whole exception from server log:
> {code}
> java.lang.IndexOutOfBoundsException: null
> at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73]
> at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) 
> ~[na:1.8.0_73]
> at 
> org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419

[jira] [Commented] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.

2016-06-16 Thread Jaroslav Kamenik (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333430#comment-15333430
 ] 

Jaroslav Kamenik commented on CASSANDRA-11726:
--

Hi, 

we have experienced the same problem now, on C* 3.7.

It seems there are others with the same problem; see 
https://issues.apache.org/jira/browse/CASSANDRA-11812 .

> IndexOutOfBoundsException when selecting (distinct) row ids from counter 
> table.
> ---
>
> Key: CASSANDRA-11726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11726
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: C* 3.5, cluster of 4 nodes.
>Reporter: Jaroslav Kamenik
>
> I have a simple table containing counters:
> {code}
> CREATE TABLE tablename (
> object_id ascii,
> counter_id ascii,
> count counter,
> PRIMARY KEY (object_id, counter_id)
> ) WITH CLUSTERING ORDER BY (counter_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'enabled': 'false'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> Counters are often incremented/decremented; whole rows are queried and 
> sometimes deleted.
> After some time I tried to query all object_ids, but it failed with:
> {code}
> cqlsh:woc> consistency quorum;
> cqlsh:woc> select object_id from tablename;
> ServerError:  message="java.lang.IndexOutOfBoundsException">
> {code}
> select * from ..., select where ..., and updates work well.
> With consistency one it works sometimes, so it seems something is broken on 
> one server, but I tried to repair the table there and it did not help. 
> Whole exception from server log:
> {code}
> java.lang.IndexOutOfBoundsException: null
> at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73]
> at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) 
> ~[na:1.8.0_73]
> at 
> org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5.jar:3.5]
> at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) 
> ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156)
>  ~[apache-cassandra-3.5.jar:3.5]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterato

[jira] [Commented] (CASSANDRA-11812) IndexOutOfBoundsException in CounterContext.headerLength

2016-06-16 Thread Jaroslav Kamenik (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333427#comment-15333427
 ] 

Jaroslav Kamenik commented on CASSANDRA-11812:
--

Hi, it seems we have the same problem; I have reported it here: 

https://issues.apache.org/jira/browse/CASSANDRA-11726

> IndexOutOfBoundsException in CounterContext.headerLength
> 
>
> Key: CASSANDRA-11812
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11812
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04 TLS
> Cassandra 3.4
>Reporter: Jeff Evans
> Fix For: 3.x
>
>
> My team is using 
> https://github.com/Contrast-Security-OSS/cassandra-migration for schema 
> migrations, and it creates a table with a counter to store the schema 
> version.  We're able to create the table fine in 3.4, but when we run the 
> tool on an existing keyspace, the client reports an error from the server.  
> The server logs show an IndexOutOfBoundsException related to the counter 
> column.
> The library creates a table with name and count columns:
> {code}
> CREATE TABLE silver.cassandra_migration_version_counts (
> name text PRIMARY KEY,
> count counter
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> Then, when the library performs a migration, it counts the rows in this table:
> {{SELECT count(\*) FROM silver.cassandra_migration_version_counts;}}
> This query throws the error:
> {code}
> cqlsh:silver> SELECT count(*) FROM silver.cassandra_migration_version_counts;
> ServerError: <Error from server: code=0000 [Server error] message="java.lang.IndexOutOfBoundsException">
> {code}
> The client driver debug logs show the query running with CONSISTENCY ALL:
> {code}
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] WARN  
> c.d.driver.core.RequestHandler - /172.24.131.52:9042 replied with server 
> error (java.lang.IndexOutOfBoundsException), defuncting connection.
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> com.datastax.driver.core.Host.STATES - Defuncting 
> Connection[/172.24.131.52:9042-1, inFlight=0, closed=false] because: An 
> unexpected error occurred server side on /172.24.131.52:9042: 
> java.lang.IndexOutOfBoundsException
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> com.datastax.driver.core.Connection - Connection[/172.24.131.52:9042-1, 
> inFlight=0, closed=true] closing connection
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> com.datastax.driver.core.Host.STATES - [/172.24.131.52:9042] preventing new 
> connections for the next 1000 ms
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> com.datastax.driver.core.Host.STATES - [/172.24.131.52:9042] 
> Connection[/172.24.131.52:9042-1, inFlight=0, closed=true] failed, remaining 
> = 0
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> c.d.driver.core.RequestHandler - [937744315-1] Doing retry 1 for query SELECT 
> count(*) FROM silver.cassandra_migration_version_counts; at consistency ALL
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG 
> c.d.driver.core.RequestHandler - [937744315-1] Error querying 
> /172.24.131.52:9042 : com.datastax.driver.core.exceptions.ServerError: An 
> unexpected error occurred server side on /172.24.131.52:9042: 
> java.lang.IndexOutOfBoundsException
> {code}
> I can repro the error with this table, but I haven't found a simple 
> standalone repro yet. Here's the call stack from the server log:
> {code}
> ERROR [SharedPool-Worker-2] 2016-05-16 17:00:39,313 ErrorMessage.java:338 - 
> Unexpected exception during request
> java.lang.IndexOutOfBoundsException: null
> at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_72]
> at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) 
> ~[na:1.8.0_72]
> at 
> org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141)
>  ~[apache-cassandra-3.4.jar:3.4]
> at 
> org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76)
>  ~[apache-cassandra-3.4.jar:3.4]
> at 
> org.apache.cassandra.db.context.CounterContext$ContextState.<init>(CounterContext.java:758)
>  ~[apache-cassandra-3.4.jar:3.4]
> at 
> org.apache.c

[jira] [Created] (CASSANDRA-12015) Rebuilding from another DC should use different sources

2016-06-16 Thread Fabien Rousseau (JIRA)
Fabien Rousseau created CASSANDRA-12015:
---

 Summary: Rebuilding from another DC should use different sources
 Key: CASSANDRA-12015
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
 Project: Cassandra
  Issue Type: Improvement
Reporter: Fabien Rousseau


Currently, when adding a new DC (e.g. DC2) and rebuilding it from an existing DC 
(e.g. DC1), only the closest replica is used as the "source of data".
This works but is not optimal: with RF=3 and a 3-node cluster, only one node in 
DC1 ends up streaming the data to DC2. 

To build the new DC in a reasonable time, it would be better, in that case, to 
stream from multiple sources, distributing the load more evenly.
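
As a hedged sketch of the idea (simplified placeholder types, not a patch 
against the actual streaming code), each range could be assigned to the 
least-loaded live replica among its candidates in the source DC, instead of 
always the closest one:
{code}
import java.net.InetAddress;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SpreadRebuildSources
{
    // For each range to fetch, pick the candidate replica that has been
    // assigned the fewest ranges so far, so streaming load is spread across
    // all replicas in the source DC rather than concentrated on one node.
    // Assumes every range has at least one candidate replica.
    public static <R> Map<R, InetAddress> assignSources(List<R> ranges,
                                                        Map<R, List<InetAddress>> candidates)
    {
        Map<InetAddress, Integer> load = new HashMap<>();
        Map<R, InetAddress> sources = new LinkedHashMap<>();
        for (R range : ranges)
        {
            InetAddress best = null;
            for (InetAddress replica : candidates.get(range))
                if (best == null || load.getOrDefault(replica, 0) < load.getOrDefault(best, 0))
                    best = replica;
            load.merge(best, 1, Integer::sum);
            sources.put(range, best);
        }
        return sources;
    }
}
{code}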




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-10389) Repair session exception Validation failed

2016-06-16 Thread Heiko Sommer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333412#comment-15333412
 ] 

Heiko Sommer edited comment on CASSANDRA-10389 at 6/16/16 9:09 AM:
---

I'm getting the same problem with Cassandra 2.2.5, on a cluster of 6 nodes with 
RF=2. As a workaround I must restart all nodes before running a repair. 

To be clear, I do not start multiple repairs simultaneously. Here is what happened 
the last time I tried it: the previous incremental repair ("{{nodetool 
repair --partitioner-range -- mykeyspace}}"), started on a single node after a 
rolling cluster restart, finished nicely, with the expected number of "Session 
completed successfully" logs. There were no more repair tasks or anticompaction 
tasks running; the cluster was stable. I restarted C* on 4 nodes, but left it 
running on 2 nodes. On one of the restarted nodes I ran an incremental repair 
again, this time also with the "{{--sequential}}" option. 
On the repairing node I get failure logs such as
{noformat}
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:79)
 ~[apache-cassandra-2.2.5.jar:2.2.5]
ERROR [Repair#1:16] 2016-06-16 07:10:29,239 CassandraDaemon.java:185 - 
Exception in thread Thread[Repair#1:16,5,RMI Runtime]
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387)
 ~[guava-16.0.jar:na]
{noformat}
while on the failing target nodes (those that were not restarted before the 
repair) I get logs such as
{noformat}
ERROR [AntiEntropyStage:1] 2016-06-16 07:10:29,237 
RepairMessageVerbHandler.java:108 - Cannot start multiple repair sessions over 
the same sstables
{noformat}

Before that, I also tried a full repair, and got the impression that it is 
the same problem for both full and incremental repairs. 
As I can reproduce the issue, I would be glad to provide more logs or do 
some experimenting if that would help resolve the issue. 


was (Author: hsommer):
I'm getting the same problem with Cassandra 2.2.5, cluster of 6 nodes, RF=2. 
As a workaround I must restart all nodes before running a repair. 

For sure I do not start multiple repairs simultaneously. Here is what happened 
the last time I tried it out: The previous incremental repair ("nodetool repair 
--partitioner-range -- mykeyspace") started on a single node after rolling 
cluster restart finished nicely, with the expected number of "Session completed 
successfully" logs. There were no more repair tasks or anticompaction tasks 
running, the cluster was stable. I restarted C* on 4 nodes, but left it running 
on 2 nodes. On one of the restarted nodes I ran an incremental repair again, 
this time also with the "--sequential" option. 
On the repairing node I get failure logs such as
{noformat}
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:79)
 ~[apache-cassandra-2.2.5.jar:2.2.5]
ERROR [Repair#1:16] 2016-06-16 07:10:29,239 CassandraDaemon.java:185 - 
Exception in thread Thread[Repair#1:16,5,RMI Runtime]
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387)
 ~[guava-16.0.jar:na]
{noformat}
while on the failing target nodes (those that were not restarted before the 
repair) I get logs such as
{noformat}
ERROR [AntiEntropyStage:1] 2016-06-16 07:10:29,237 
RepairMessageVerbHandler.java:108 - Cannot start multiple repair sessions over 
the same sstables
{noformat}

Before that, I also tried with full repair, and got the impression that it is 
the same problem for full or incremental repairs. 
As I can reproduce the issue, I would be glad to provide you with more logs or 
some experimenting if that would help resolve the issue. 

> Repair session exception Validation failed
> --
>
> Key: CASSANDRA-10389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10389
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian 8, Java 1.8.0_60, Cassandra 2.2.1 (datastax 
> compilation)
>Reporter: Jędrzej Sieracki
> Fix For: 2.2.x
>
>
> I'm running a repair on a ring of nodes that was recently extended from 3 to 
> 13 nodes. The extension was done two days ago; the repair was attempted 
> yesterday.
> {quote}
> [2015-09-22 11:55:55,266] Starting repair command #9, repairing keyspace 
> perspectiv with repair options (parallelism: parallel

[jira] [Commented] (CASSANDRA-10389) Repair session exception Validation failed

2016-06-16 Thread Heiko Sommer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333412#comment-15333412
 ] 

Heiko Sommer commented on CASSANDRA-10389:
--

I'm getting the same problem with Cassandra 2.2.5, on a cluster of 6 nodes with 
RF=2. As a workaround I must restart all nodes before running a repair. 

To be clear, I do not start multiple repairs simultaneously. Here is what happened 
the last time I tried it: the previous incremental repair ("nodetool repair 
--partitioner-range -- mykeyspace"), started on a single node after a rolling 
cluster restart, finished nicely, with the expected number of "Session completed 
successfully" logs. There were no more repair tasks or anticompaction tasks 
running; the cluster was stable. I restarted C* on 4 nodes, but left it running 
on 2 nodes. On one of the restarted nodes I ran an incremental repair again, 
this time also with the "--sequential" option. 
On the repairing node I get failure logs such as
{noformat}
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:79)
 ~[apache-cassandra-2.2.5.jar:2.2.5]
ERROR [Repair#1:16] 2016-06-16 07:10:29,239 CassandraDaemon.java:185 - 
Exception in thread Thread[Repair#1:16,5,RMI Runtime]
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: Could not create snapshot at /10.195.62.171
at 
com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387)
 ~[guava-16.0.jar:na]
{noformat}
while on the failing target nodes (those that were not restarted before the 
repair) I get logs such as
{noformat}
ERROR [AntiEntropyStage:1] 2016-06-16 07:10:29,237 
RepairMessageVerbHandler.java:108 - Cannot start multiple repair sessions over 
the same sstables
{noformat}

Before that, I also tried a full repair, and got the impression that it is 
the same problem for both full and incremental repairs. 
As I can reproduce the issue, I would be glad to provide more logs or do 
some experimenting if that would help resolve the issue. 

> Repair session exception Validation failed
> --
>
> Key: CASSANDRA-10389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10389
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian 8, Java 1.8.0_60, Cassandra 2.2.1 (datastax 
> compilation)
>Reporter: Jędrzej Sieracki
> Fix For: 2.2.x
>
>
> I'm running a repair on a ring of nodes that was recently extended from 3 to 
> 13 nodes. The extension was done two days ago; the repair was attempted 
> yesterday.
> {quote}
> [2015-09-22 11:55:55,266] Starting repair command #9, repairing keyspace 
> perspectiv with repair options (parallelism: parallel, primary range: false, 
> incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], 
> hosts: [], # of ranges: 517)
> [2015-09-22 11:55:58,043] Repair session 1f7c50c0-6110-11e5-b992-9f13fa8664c8 
> for range (-5927186132136652665,-5917344746039874798] failed with error 
> [repair #1f7c50c0-6110-11e5-b992-9f13fa8664c8 on 
> perspectiv/stock_increment_agg, (-5927186132136652665,-5917344746039874798]] 
> Validation failed in cblade1.XXX/XXX (progress: 0%)
> {quote}
> BTW, I am ignoring the LEAK errors for now; they are outside the scope of 
> the main issue:
> {quote}
> ERROR [Reference-Reaper:1] 2015-09-22 11:58:27,843 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4d25ad8f) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@896826067:/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-73-big
>  was not released before the reference was garbage collected
> {quote}
> I scrubbed the sstable with failed validation on cblade1 with nodetool scrub 
> perspectiv stock_increment_agg:
> {quote}
> INFO  [CompactionExecutor:1704] 2015-09-22 12:05:31,615 OutputHandler.java:42 
> - Scrubbing 
> BigTableReader(path='/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-83-big-Data.db')
>  (345466609 bytes)
> INFO  [CompactionExecutor:1703] 2015-09-22 12:05:31,615 OutputHandler.java:42 
> - Scrubbing 
> BigTableReader(path='/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-82-big-Data.db')
>  (60496378 bytes)
> ERROR [Reference-Reaper:1] 2015-09-22 12:05:31,676 Ref.java:187 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ca8951e) to class 
> org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@114161559:/var/lib/cassandra/data/perspectiv/receipt_agg_total-76abb0625de711e59f6e0b7d98a25b6e/la-48-big
>  was not released before the reference was garbage collecte

[jira] [Commented] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner

2016-06-16 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1575#comment-1575
 ] 

Stefania commented on CASSANDRA-12002:
--

Thanks for the trunk patch. I agree on committing only to trunk, since this 
is a minor bug and sstablemetadata has only used the partitioner since 
CASSANDRA-7159, which was committed in 3.6, whilst sstabledump already handles 
secondary indexes as you mention.

This worried me a little bit: {{if 
(validationMetadata.partitioner.endsWith("LocalPartitioner"))}}. Although it is 
_extremely_ unlikely to be a problem, I prefer to use the fully qualified class 
name; see this commit 
[here|https://github.com/stef1927/cassandra/commit/5e5e3c818adea83b702e52ee6653c2a54c7dbef4], 
could you cross-review it?
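
A minimal sketch of the stricter check, as I read the intent (illustrative 
code, not the committed patch):
{code}
import org.apache.cassandra.dht.LocalPartitioner;

public final class PartitionerChecks
{
    // Comparing against the fully qualified class name means a user-defined
    // partitioner whose name merely ends in "LocalPartitioner" can no longer
    // match by accident, unlike endsWith("LocalPartitioner").
    public static boolean isLocalPartitioner(String partitionerClassName)
    {
        return LocalPartitioner.class.getName().equals(partitionerClassName);
    }
}
{code}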

I've started CI tests for trunk here:

|[patch|https://github.com/stef1927/cassandra/commits/12002]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12002-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12002-dtest/]|

It would be nice to have a dtest or unit test to exercise this code. How did 
you bump into this issue?

> SSTable tools mishandling LocalPartitioner
> --
>
> Key: CASSANDRA-12002
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12002
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSADNRA-12002.txt
>
>
> The sstabledump and sstablemetadata tools use FBUtilities.newPartitioner 
> with the name of the partitioner from the validation component. This fails on 
> sstables that are created with things that use the LocalPartitioner 
> (secondary indexes, and the system.batches table). sstabledump had a 
> check for secondary indexes but still failed for the system table, while 
> sstablemetadata was failing for all of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12012) CQLSSTableWriter and composite clustering keys trigger NPE

2016-06-16 Thread Pierre N. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1565#comment-1565
 ] 

Pierre N. commented on CASSANDRA-12012:
---

hasSupportingIndex() in 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet calls 
Keyspace.openAndGetStore(cfm), which triggers an error because the keyspace is 
uninitialized in client mode. 

I hotfixed it by adding this check:
{code}
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.cql3.QueryOptions;
 import org.apache.cassandra.cql3.functions.Function;
 import org.apache.cassandra.cql3.statements.Bound;
@@ -115,7 +116,7 @@ final class PrimaryKeyRestrictionSet extends AbstractPrimaryKeyRestrictions
         this.isPartitionKey = primaryKeyRestrictions.isPartitionKey;
         this.cfm = primaryKeyRestrictions.cfm;
 
-        if (!primaryKeyRestrictions.isEmpty() && !hasSupportingIndex(restriction))
+        if (!Config.isClientMode() && !primaryKeyRestrictions.isEmpty() && !hasSupportingIndex(restriction))
         {
             ColumnDefinition lastRestrictionStart = primaryKeyRestrictions.restrictions.lastRestriction().getFirstColumn();
             ColumnDefinition newRestrictionStart = restriction.getFirstColumn();
{code}

It works and generates a valid sstable; however, I am not sure this is the best 
way to fix it.

> CQLSSTableWriter and composite clustering keys trigger NPE
> --
>
> Key: CASSANDRA-12012
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12012
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Pierre N.
>Assignee: Mahdi Mohammadi
>
> It triggers when using multiple clustering keys in the primary key
> {code}
> package tests;
> import java.io.File;
> import java.nio.file.Files;
> import org.apache.cassandra.io.sstable.CQLSSTableWriter;
> import org.apache.cassandra.config.Config;
> public class DefaultWriter {
> 
> public static void main(String[] args) throws Exception {
> Config.setClientMode(true);
> 
> String createTableQuery = "CREATE TABLE ks_test.table_test ("
> + "pk1 int,"
> + "ck1 int,"
> + "ck2 int,"
> + "PRIMARY KEY ((pk1), ck1, ck2)"
> + ");";
> String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) 
> VALUES(?,?,?)";
> 
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
> .inDirectory(Files.createTempDirectory("sst").toFile())
> .forTable(createTableQuery)
> .using(insertQuery)
> .build();
> writer.close();
> }
> }
> {code}
> Exception : 
> {code}
> Exception in thread "main" java.lang.ExceptionInInitializerError
>   at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368)
>   at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:305)
>   at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129)
>   at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106)
>   at org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159)
>   at 
> org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156)
>   at 
> org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.<init>(PrimaryKeyRestrictionSet.java:118)
>   at 
> org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.java:213)
>   at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.addSingleColumnRestriction(StatementRestrictions.java:266)
>   at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.addRestriction(StatementRestrictions.java:250)
>   at 
> org.apache.cassandra.cql3.restrictions.StatementRestrictions.<init>(StatementRestrictions.java:159)
>   at 
> org.apache.cassandra.cql3.statements.UpdateStatement$ParsedInsert.prepareInternal(UpdateStatement.java:183)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:782)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:768)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:505)
>   at 
> org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:508)
>   at 
> org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.using(CQLSSTableWriter.java:439)
>   at tests.DefaultWriter.main(DefaultWriter.java:29)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1188)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:127)
>   ... 18 more

[jira] [Updated] (CASSANDRA-12012) CQLSSTableWriter and composite clustering keys trigger NPE

2016-06-16 Thread Pierre N. (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre N. updated CASSANDRA-12012:
--
Description: 
It triggers when using multiple clustering keys in the primary key

{code}
package tests;

import java.io.File;
import java.nio.file.Files;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;
import org.apache.cassandra.config.Config;

public class DefaultWriter {

public static void main(String[] args) throws Exception {
Config.setClientMode(true);

String createTableQuery = "CREATE TABLE ks_test.table_test ("
+ "pk1 int,"
+ "ck1 int,"
+ "ck2 int,"
+ "PRIMARY KEY ((pk1), ck1, ck2)"
+ ");";
String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) 
VALUES(?,?,?)";

CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(Files.createTempDirectory("sst").toFile())
.forTable(createTableQuery)
.using(insertQuery)
.build();
writer.close();
}
}
{code}

Exception : 

{code}
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368)
at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:305)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106)
at org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.<init>(PrimaryKeyRestrictionSet.java:118)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.java:213)
at 
org.apache.cassandra.cql3.restrictions.StatementRestrictions.addSingleColumnRestriction(StatementRestrictions.java:266)
at 
org.apache.cassandra.cql3.restrictions.StatementRestrictions.addRestriction(StatementRestrictions.java:250)
at 
org.apache.cassandra.cql3.restrictions.StatementRestrictions.<init>(StatementRestrictions.java:159)
at 
org.apache.cassandra.cql3.statements.UpdateStatement$ParsedInsert.prepareInternal(UpdateStatement.java:183)
at 
org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:782)
at 
org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:768)
at 
org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:505)
at 
org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:508)
at 
org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.using(CQLSSTableWriter.java:439)
at tests.DefaultWriter.main(DefaultWriter.java:29)
Caused by: java.lang.NullPointerException
at 
org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1188)
at 
org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:127)
... 18 more
{code}

  was:
It triggers when using multiple clustering keys in the primary keys

{code}
package tests;

import java.io.File;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;
import org.apache.cassandra.config.Config;

public class DefaultWriter {

public static void main(String[] args) throws Exception {
Config.setClientMode(true);

String createTableQuery = "CREATE TABLE ks_test.table_test ("
+ "pk1 int,"
+ "ck1 int,"
+ "ck2 int,"
+ "PRIMARY KEY ((pk1), ck1, ck2)"
+ ");";
String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) 
VALUES(?,?,?)";

CQLSSTableWriter writer = CQLSSTableWriter.builder()
.inDirectory(File.createTempFile("sstdir", "-tmp"))
.forTable(createTableQuery)
.using(insertQuery)
.build();
writer.close();
}
}
{code}

Exception : 

{code}
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368)
at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:305)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106)
at org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.<init>(PrimaryKeyRestrictionSet.java:118)
at 
org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.ja

[jira] [Issue Comment Deleted] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11873:
---
Comment: was deleted

(was: I am -1 on this. I think those syntaxes are more complex than needed. In 
my opinion {{now() - 3d}} will be easily understood by everybody  and I do not 
think that there is a need to have to write {{now() -3 day}}. In the case of 
CASSANDRA-11871 I found that the {{INTERVAL}} syntax is making the query much 
more verbose and less readable: {{GROUP BY floor(time, INTERVAL '1' HOUR)}}. )

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1528#comment-1528
 ] 

Benjamin Lerer commented on CASSANDRA-11873:


I am -1 on this. I think those syntaxes are more complex than needed. In my 
opinion {{now() - 3d}} will be easily understood by everybody, and I do not 
think that there is a need to have to write {{now() - 3 day}}. In the case of 
CASSANDRA-11871 I found that the {{INTERVAL}} syntax makes the query much 
more verbose and less readable: {{GROUP BY floor(time, INTERVAL '1' HOUR)}}. 

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1529#comment-1529
 ] 

Benjamin Lerer commented on CASSANDRA-11873:


I am -1 on this. I think those syntaxes are more complex than needed. In my 
opinion {{now() - 3d}} will be easily understood by everybody, and I do not 
think that there is a need to have to write {{now() - 3 day}}. In the case of 
CASSANDRA-11871 I found that the {{INTERVAL}} syntax makes the query much 
more verbose and less readable: {{GROUP BY floor(time, INTERVAL '1' HOUR)}}. 

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1527#comment-1527
 ] 

Sylvain Lebresne commented on CASSANDRA-11873:
--

Well,
# We're not reinventing the wheel, we're reusing [influxdb 
syntax|https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#time-syntax-in-queries].
 Even besides that, calling a syntax like {{2h3m}} "reinventing the wheel" 
feels to me a bit of a stretch.
# If one bothers reading the [linked Postgres 
page|https://www.postgresql.org/docs/current/static/datatype-datetime.html#DATATYPE-INTERVAL-INPUT-EXAMPLES],
 he'll note that Postgres supports {{P2h3m}}, which is pretty damn close (it 
also supports {{2 hours 3 minutes}}, which I don't think is necessary but 
wouldn't mind supporting as an alternative to the shorter version). Surely, 
Postgres veterans are smart enough not to be thrown off by us dropping the 
{{P}} at the beginning.
# Regarding the Oracle syntax, I think it's terrible. The goal of this ticket 
is to add a user-friendly syntax for inputting durations, but imo {{now() - 
(INTERVAL '4 5:12' DAY TO MINUTE)}} (to mean {{now() - 4d5h12m}}) is verbose, 
unintuitive and plain ugly. And as far as I can tell, it's nowhere near 
standard (Postgres doesn't seem to support it, for instance). So I'm basically 
a strong PMC binding -1 on it.

Overall, we're not "making up completely new syntax". {{3h2m5s}} is pretty 
standard (as in, in life in general) and concise, and it's even supported by 
other databases (influxdb and, up to a minor detail, Postgres). And I don't 
see any other syntax being a de facto standard in other databases. 
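
To make the discussed syntax concrete, here is a hedged sketch (not the 
eventual grammar or implementation) of parsing an influxdb-style literal such 
as {{4d5h12m}} into microseconds, the internal representation suggested in the 
ticket description; the accepted unit suffixes here are an assumption:
{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DurationLiteral
{
    // Longer suffixes first so "ms" is not parsed as minutes then seconds.
    private static final Pattern COMPONENT = Pattern.compile("(\\d+)(ms|us|d|h|m|s)");

    public static long toMicros(String literal)
    {
        Matcher m = COMPONENT.matcher(literal);
        long micros = 0;
        int consumed = 0;
        while (m.find())
        {
            if (m.start() != consumed) // reject junk between components
                throw new IllegalArgumentException("Unparseable duration: " + literal);
            long n = Long.parseLong(m.group(1));
            switch (m.group(2))
            {
                case "d":  micros += n * 86_400_000_000L; break;
                case "h":  micros += n * 3_600_000_000L;  break;
                case "m":  micros += n * 60_000_000L;     break;
                case "s":  micros += n * 1_000_000L;      break;
                case "ms": micros += n * 1_000L;          break;
                case "us": micros += n;                   break;
            }
            consumed = m.end();
        }
        if (consumed != literal.length())
            throw new IllegalArgumentException("Unparseable duration: " + literal);
        return micros;
    }

    public static void main(String[] args)
    {
        System.out.println(toMicros("4d5h12m")); // 364320000000
    }
}
{code}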


> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)