[jira] [Commented] (CASSANDRA-6431) Prevent same CF from being enqueued to flush more than once

2013-12-03 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837439#comment-13837439
 ] 

Sylvain Lebresne commented on CASSANDRA-6431:
-

I might be misunderstanding the suggestion, but I would say that blocking 
writes when they come in faster than we are able to flush is a feature (to 
avoid OOM), even if all writes go to the same sstable. That is, it could be 
that we're too aggressive in blocking writes in some cases because our 
heuristic for "writes are faster than we can flush" is not good enough, but 
it's not entirely clear to me what not queuing 2 memtables for the same CF 
achieves (outside of potentially having the memtable we don't queue grow 
unbounded and OOM us, that is).

 Prevent same CF from being enqueued to flush more than once
 ---

 Key: CASSANDRA-6431
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6431
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Priority: Minor

 As things stand we can, in certain circumstances, fill up the flush queue 
 with multiple requests to flush the same CF, which will lead to all writes 
 blocking until the CF is flushed. Ideally we would only enqueue each 
 CF/Memtable once and, if it is asked to flush whilst already enqueued, 
 mark it to be requeued once the outstanding flush completes.
 On a related note, a single table can already block writes if it has 
 flush-queue-size or more secondary indexes. While we're at it, it might be 
 worth deciding if this is also a problem and addressing it.
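The requeue-once behaviour described above amounts to a small per-CF state machine. A minimal sketch of the idea (class and method names are illustrative, not Cassandra's actual internals):
{noformat}
import java.util.concurrent.atomic.AtomicReference;

// Tracks whether a CF's memtable flush is already enqueued, and whether
// another flush request arrived while one was outstanding.
public class FlushGate
{
    private enum State { IDLE, ENQUEUED, ENQUEUED_AND_REQUEUE }
    private final AtomicReference<State> state = new AtomicReference<>(State.IDLE);

    // Returns true if the caller should actually submit a flush task.
    public boolean tryEnqueue()
    {
        while (true)
        {
            State s = state.get();
            if (s == State.IDLE)
            {
                if (state.compareAndSet(s, State.ENQUEUED))
                    return true;  // first request: submit the flush
            }
            else if (s == State.ENQUEUED)
            {
                if (state.compareAndSet(s, State.ENQUEUED_AND_REQUEUE))
                    return false; // remember the request instead of enqueueing twice
            }
            else
            {
                return false;     // already marked for requeue
            }
        }
    }

    // Called when the outstanding flush finishes; returns true if a new
    // flush must now be submitted on behalf of the remembered request.
    public boolean onFlushComplete()
    {
        if (state.compareAndSet(State.ENQUEUED, State.IDLE))
            return false;
        return state.compareAndSet(State.ENQUEUED_AND_REQUEUE, State.ENQUEUED);
    }
}
{noformat}
With this, the flush queue holds at most one entry per CF, and a CF asked to flush while already queued is flushed again exactly once after the outstanding flush completes.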



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6218) Reduce WAN traffic while doing repairs

2013-12-03 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837447#comment-13837447
 ] 

sankalp kohli commented on CASSANDRA-6218:
--

As per my first comment, we need a way to run repair among specified endpoints. 
I think with this change you can specify data centers, so it will help people 
who have 3 or more DCs, but not those with 2 DCs. 

 Reduce WAN traffic while doing repairs
 --

 Key: CASSANDRA-6218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: sankalp kohli
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: trunk-6218-v2.txt, trunk-6218-v3.patch, trunk-6218.txt


 The way we send out data that does not match over the WAN can be improved. 
 Example: say there are four nodes (A,B,C,D) which are replicas of a range we 
 are repairing. A,B are in DC1 and C,D are in DC2. If A does not have data 
 which the other replicas have, then we will have the following streams:
 1) A to B and back
 2) A to C and back (goes over the WAN)
 3) A to D and back (goes over the WAN)
 One way of doing it that reduces WAN traffic is this:
 1) Repair A and B only with each other, and C and D with each other, starting 
 at the same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C; the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C (A and C 
 behave like proxies for the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will be reduced by 50%, and 
 even more for higher replication factors. 
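 To make that figure concrete: in the example above, A's missing data crosses 
 the WAN twice (once to C, once to D); with the proxying scheme only the single 
 A-C exchange crosses the WAN and the data is forwarded to D locally, so 2 WAN 
 streams become 1, a 50% reduction. With DC1:3,DC2:3, roughly 3 WAN exchanges 
 would collapse to 1, about a 67% reduction.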
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6435) nodetool outputs xss and jamm errors in 1.2.12

2013-12-03 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837453#comment-13837453
 ] 

Sam Tunnicliffe commented on CASSANDRA-6435:


This is a partial duplicate of CASSANDRA-6404 (partial because that ticket only 
addresses the jamm ERRORs; the echo of the xss... string is a separate issue).

 nodetool outputs xss and jamm errors in 1.2.12
 --

 Key: CASSANDRA-6435
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6435
 Project: Cassandra
  Issue Type: Bug
Reporter: Karl Mueller
Assignee: Brandon Williams
Priority: Minor

 Since 1.2.12, just running nodetool produces this output. This is probably 
 related to CASSANDRA-6273.
 It's unclear to me whether jamm is actually not being loaded, but nodetool 
 clearly should not be producing this output, which likely comes from 
 cassandra-env.sh:
 [cassandra@dev-cass00 cassandra]$ /data2/cassandra/bin/nodetool ring
 xss =  -ea -javaagent:/data2/cassandra/bin/../lib/jamm-0.2.5.jar 
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms14G -Xmx14G -Xmn1G 
 -XX:+HeapDumpOnOutOfMemoryError -Xss256k
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: datacenter1
 ==
 Address      Rack    Status  State   Load        Owns    Token
                                                          170141183460469231731687303715884105727
 10.93.15.10  rack1   Up      Normal  123.82 GB   20.00%  34028236692093846346337460743176821145
 10.93.15.11  rack1   Up      Normal  124 GB      20.00%  68056473384187692692674921486353642290
 10.93.15.12  rack1   Up      Normal  123.97 GB   20.00%  102084710076281539039012382229530463436
 10.93.15.13  rack1   Up      Normal  124.03 GB   20.00%  136112946768375385385349842972707284581
 10.93.15.14  rack1   Up      Normal  123.93 GB   20.00%  170141183460469231731687303715884105727
 ERROR 16:20:01,408 Unable to initialize MemoryMeter (jamm not specified as 
 javaagent).  This means Cassandra will be unable to measure object sizes 
 accurately and may consequently OOM.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-5074) Add an official way to disable compaction

2013-12-03 Thread Ngoc Minh Vo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837004#comment-13837004
 ] 

Ngoc Minh Vo edited comment on CASSANDRA-5074 at 12/3/13 9:23 AM:
--

Thanks a lot for your quick answer.
It is indeed very weird. I downloaded the binary for Windows from this address 
last week:
http://archive.apache.org/dist/cassandra/2.0.3/

I will recheck it tomorrow ...
(edit)
The cqlsh shows this:
{quote}
[cqlsh 4.1.0 | Cassandra 2.0.3 | CQL spec 3.1.1 | Thrift protocol 19.38.0]
{quote}




was (Author: vongocminh):
Thanks a lot for your quick answer.
It is indeed very weird. I downloaded the binary for Windows from this address 
last week:
http://archive.apache.org/dist/cassandra/2.0.3/

I will recheck it tomorrow ...

 Add an official way to disable compaction
 -

 Key: CASSANDRA-5074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5074
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
 Fix For: 2.0 beta 1

 Attachments: 
 0001-CASSANDRA-5074-make-it-possible-to-disable-autocompa.patch, 
 0001-CASSANDRA-5074-v2.patch


 We've traditionally used min or max compaction threshold = 0 to disable 
 compaction, but this isn't exactly intuitive and it's inconsistently 
 implemented -- allowed from jmx, not allowed from cli.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6431) Prevent same CF from being enqueued to flush more than once

2013-12-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837531#comment-13837531
 ] 

Benedict commented on CASSANDRA-6431:
-

Yeah, I've been thinking about this and realise we probably need to split 
cfs.forceFlush() into cfs.forceFlushNow() and cfs.forceEnqueueFlush(), the 
former used only for flushes that are due to memory pressure, and the latter 
for any other reason.

I don't want to go too crazy on this, as I'll need to change it altogether 
very soon for CASSANDRA-5549 (perhaps split off into another ticket), but 
this should be reasonably manageable.

 Prevent same CF from being enqueued to flush more than once
 ---

 Key: CASSANDRA-6431
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6431
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Priority: Minor

 As things stand we can, in certain circumstances, fill up the flush queue 
 with multiple requests to flush the same CF, which will lead to all writes 
 blocking until the CF is flushed. Ideally we would only enqueue each 
 CF/Memtable once and, if it is asked to flush whilst already enqueued, 
 mark it to be requeued once the outstanding flush completes.
 On a related note, a single table can already block writes if it has 
 flush-queue-size or more secondary indexes. While we're at it, it might be 
 worth deciding if this is also a problem and addressing it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-4880) Endless loop flushing+compacting system/schema_keyspaces and system/schema_columnfamilies

2013-12-03 Thread Wei-dun Teng (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837624#comment-13837624
 ] 

Wei-dun Teng commented on CASSANDRA-4880:
-

I have encountered a similar bug after upgrading one of my 1.2.2 nodes to 1.2.12.
I was using FreeBSD 8.2 + diablo-jre-1.6.0.07.02_18.

Before upgrading the node to 1.2.12, I changed the JVM to openjdk-7.25.15_2 (in 
retrospect probably not a good idea ...), saw flipping Memtable flushes, 
changed the JVM back to diablo-jre-1.6.0.07.02_18, and still saw rapid flushes. 
Now I've taken the 1.2.12 node offline (but not decommissioned it).

After that I saw tens of Memtable flushes per second on the 1.2.12 node, while 
only once or twice a second on the nodes running 1.2.2.

Attached files show the flipping schema versions.

 Endless loop flushing+compacting system/schema_keyspaces and 
 system/schema_columnfamilies
 -

 Key: CASSANDRA-4880
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4880
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.6, 1.2.0 beta 1
 Environment: Linux x86_64 3.4.9, sun-jdk 1.6.0_33
Reporter: Mina Naguib
Assignee: Pavel Yaskevich
 Fix For: 1.1.7, 1.2.0 beta 3

 Attachments: 131203-schema-1.txt, 131203-schema-2.txt, 
 CASSANDRA-4880-fix.patch, CASSANDRA-4880.patch


 After upgrading a node from 1.1.2 to 1.1.6, the startup sequence entered a 
 loop as seen here:
 http://mina.naguib.ca/misc/cassandra_116_startup_loop.txt
 Stopping and starting the node entered the same loop.
 Reverting back to 1.1.2 started successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


git commit: Secondary index support for collections

2013-12-03 Thread slebresne
Updated Branches:
  refs/heads/trunk 57516e082 -> d12a0d7b0


Secondary index support for collections

patch by slebresne; reviewed by iamaleksey for CASSANDRA-4511
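
For context, the kind of query this commit enables looks roughly like this 
(illustrative schema, not taken from the patch):
{noformat}
CREATE TABLE users (id int PRIMARY KEY, emails set<text>);
CREATE INDEX ON users (emails);

-- the new CONTAINS relation added by this commit:
SELECT * FROM users WHERE emails CONTAINS 'foo@bar.com';
{noformat}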


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d12a0d7b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d12a0d7b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d12a0d7b

Branch: refs/heads/trunk
Commit: d12a0d7b0299786bf1d0484f3770bae6a94cb0c9
Parents: 57516e0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu Nov 14 09:17:51 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:49:02 2013 +0100

--
 CHANGES.txt |   1 +
 src/java/org/apache/cassandra/cql3/Cql.g|   4 +
 .../org/apache/cassandra/cql3/Relation.java |  15 ++-
 .../cql3/statements/CreateIndexStatement.java   |  21 +++-
 .../cassandra/cql3/statements/Restriction.java  | 111 +-
 .../cql3/statements/SelectStatement.java|  71 ++--
 .../apache/cassandra/db/ColumnFamilyStore.java  |   2 +-
 .../apache/cassandra/db/IndexExpression.java|  19 +++-
 .../cassandra/db/filter/ExtendedFilter.java |  47 ++--
 .../AbstractSimplePerColumnSecondaryIndex.java  |  13 ++-
 .../db/index/SecondaryIndexSearcher.java|   2 +-
 .../db/index/composites/CompositesIndex.java|  50 -
 .../CompositesIndexOnCollectionKey.java | 112 +++
 .../CompositesIndexOnCollectionValue.java   | 110 ++
 .../db/index/composites/CompositesSearcher.java |  21 +++-
 15 files changed, 566 insertions(+), 33 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d12a0d7b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 3bc50ac..08c3a67 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -14,6 +14,7 @@
  * User-defined types for CQL3 (CASSANDRA-5590)
  * Use of o.a.c.metrics in nodetool (CASSANDRA-5871, 6406)
  * Batch read from OTC's queue and cleanup (CASSANDRA-1632)
+ * Secondary index support for collections (CASSANDRA-4511)
 
 
 2.0.4

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d12a0d7b/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g 
b/src/java/org/apache/cassandra/cql3/Cql.g
index 325d6f6..fb0054d 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -947,6 +947,8 @@ relation[List<Relation> clauses]
         { $clauses.add(new Relation(name, Relation.Type.IN, marker)); }
     | name=cident K_IN { Relation rel = Relation.createInRelation($name.id); }
        '(' ( f1=term { rel.addInValue(f1); } (',' fN=term { rel.addInValue(fN); } )* )? ')' { $clauses.add(rel); }
+    | name=cident K_CONTAINS { Relation.Type rt = Relation.Type.CONTAINS; } /* (K_KEY { rt = Relation.Type.CONTAINS_KEY })? */
+      t=term { $clauses.add(new Relation(name, rt, t)); }
     | '(' relation[$clauses] ')'
     ;
 
@@ -1045,6 +1047,7 @@ basic_unreserved_keyword returns [String str]
 | K_CUSTOM
 | K_TRIGGER
 | K_DISTINCT
+| K_CONTAINS
 ) { $str = $k.text; }
 ;
 
@@ -1101,6 +1104,7 @@ K_DESC:D E S C;
 K_ALLOW:   A L L O W;
 K_FILTERING:   F I L T E R I N G;
 K_IF:  I F;
+K_CONTAINS:C O N T A I N S;
 
 K_GRANT:   G R A N T;
 K_ALL: A L L;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d12a0d7b/src/java/org/apache/cassandra/cql3/Relation.java
--
diff --git a/src/java/org/apache/cassandra/cql3/Relation.java 
b/src/java/org/apache/cassandra/cql3/Relation.java
index 15ed540..cfcdd54 100644
--- a/src/java/org/apache/cassandra/cql3/Relation.java
+++ b/src/java/org/apache/cassandra/cql3/Relation.java
@@ -35,7 +35,20 @@ public class Relation
 
     public static enum Type
     {
-        EQ, LT, LTE, GTE, GT, IN;
+        EQ, LT, LTE, GTE, GT, IN, CONTAINS, CONTAINS_KEY;
+
+        public boolean allowsIndexQuery()
+        {
+            switch (this)
+            {
+                case EQ:
+                case CONTAINS:
+                case CONTAINS_KEY:
+                    return true;
+                default:
+                    return false;
+            }
+        }
     }
 
     private Relation(ColumnIdentifier entity, Type type, Term.Raw value, List<Term.Raw> inValues, boolean onToken)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d12a0d7b/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
--
diff --git 

git commit: Warn when a read collection has > 64k elements

2013-12-03 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.2 ecd94221a -> f634ac7ea


Warn when a read collection has > 64k elements

patch by slebresne; reviewed by iamaleksey for CASSANDRA-5428


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f634ac7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f634ac7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f634ac7e

Branch: refs/heads/cassandra-1.2
Commit: f634ac7eae468b944d22951fc7c9d05aa6c7f447
Parents: ecd9422
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:53:33 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:53:33 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c80a00a..8e6cffa 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -8,6 +8,7 @@
  * Throw IRE if a prepared has more markers than supported (CASSANDRA-5598)
  * Expose Thread metrics for the native protocol server (CASSANDRA-6234)
  * Change snapshot response message verb (CASSANDRA-6415)
+ * Warn when collection read has > 65K elements (CASSANDRA-5428)
 
 
 1.2.12

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/CollectionType.java 
b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
index ad2ea67..a34a2b7 100644
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@ -20,6 +20,9 @@ package org.apache.cassandra.db.marshal;
 import java.nio.ByteBuffer;
 import java.util.List;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.cassandra.db.IColumn;
 import org.apache.cassandra.utils.ByteBufferUtil;
@@ -33,6 +36,10 @@ import org.apache.cassandra.utils.Pair;
  */
 public abstract class CollectionType<T> extends AbstractType<T>
 {
+    private static final Logger logger = LoggerFactory.getLogger(CollectionType.class);
+
+    public static final int MAX_ELEMENTS = 65535;
+
     public enum Kind
     {
         MAP, SET, LIST
@@ -105,6 +112,16 @@ public abstract class CollectionType<T> extends AbstractType<T>
         return (ByteBuffer)result.flip();
     }
 
+    protected List<Pair<ByteBuffer, IColumn>> enforceLimit(List<Pair<ByteBuffer, IColumn>> columns)
+    {
+        if (columns.size() <= MAX_ELEMENTS)
+            return columns;
+
+        logger.error("Detected collection with {} elements, more than the {} limit. Only the first {} elements will be returned to the client. "
+                   + "Please see http://cassandra.apache.org/doc/cql3/CQL.html#collections for more details.", columns.size(), MAX_ELEMENTS, MAX_ELEMENTS);
+        return columns.subList(0, MAX_ELEMENTS);
+    }
+
     public static ByteBuffer pack(List<ByteBuffer> buffers, int elements)
     {
         int size = 0;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/ListType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/ListType.java 
b/src/java/org/apache/cassandra/db/marshal/ListType.java
index b6613ae..b219af1 100644
--- a/src/java/org/apache/cassandra/db/marshal/ListType.java
+++ b/src/java/org/apache/cassandra/db/marshal/ListType.java
@@ -120,6 +120,8 @@ public class ListType<T> extends CollectionType<List<T>>
 
     public ByteBuffer serialize(List<Pair<ByteBuffer, IColumn>> columns)
     {
+        columns = enforceLimit(columns);
+
         List<ByteBuffer> bbs = new ArrayList<ByteBuffer>(columns.size());
         int size = 0;
         for (Pair<ByteBuffer, IColumn> p : columns)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/MapType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/MapType.java 
b/src/java/org/apache/cassandra/db/marshal/MapType.java
index 19310df..750851e 100644
--- a/src/java/org/apache/cassandra/db/marshal/MapType.java
+++ b/src/java/org/apache/cassandra/db/marshal/MapType.java
@@ -137,6 +137,8 @@ public class 

[2/3] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-12-03 Thread slebresne
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b2da839f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b2da839f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b2da839f

Branch: refs/heads/cassandra-2.0
Commit: b2da839f076f14f35c5591b39736c8d7241974ee
Parents: 6724964 f634ac7
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:54:41 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:54:41 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b2da839f/CHANGES.txt
--
diff --cc CHANGES.txt
index 11f4c09,8e6cffa..a7ab215
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -13,44 -8,10 +13,45 @@@ Merged from 1.2
   * Throw IRE if a prepared has more markers than supported (CASSANDRA-5598)
   * Expose Thread metrics for the native protocol server (CASSANDRA-6234)
   * Change snapshot response message verb (CASSANDRA-6415)
+  * Warn when collection read has > 65K elements (CASSANDRA-5428)
  
  
 -1.2.12
 +2.0.3
 + * Fix FD leak on slice read path (CASSANDRA-6275)
 + * Cancel read meter task when closing SSTR (CASSANDRA-6358)
 + * free off-heap IndexSummary during bulk (CASSANDRA-6359)
 + * Recover from IOException in accept() thread (CASSANDRA-6349)
 + * Improve Gossip tolerance of abnormally slow tasks (CASSANDRA-6338)
 + * Fix trying to hint timed out counter writes (CASSANDRA-6322)
 + * Allow restoring specific columnfamilies from archived CL (CASSANDRA-4809)
 + * Avoid flushing compaction_history after each operation (CASSANDRA-6287)
 + * Fix repair assertion error when tombstones expire (CASSANDRA-6277)
 + * Skip loading corrupt key cache (CASSANDRA-6260)
 + * Fixes for compacting larger-than-memory rows (CASSANDRA-6274)
 + * Compact hottest sstables first and optionally omit coldest from
 +   compaction entirely (CASSANDRA-6109)
 + * Fix modifying column_metadata from thrift (CASSANDRA-6182)
 + * cqlsh: fix LIST USERS output (CASSANDRA-6242)
 + * Add IRequestSink interface (CASSANDRA-6248)
 + * Update memtable size while flushing (CASSANDRA-6249)
 + * Provide hooks around CQL2/CQL3 statement execution (CASSANDRA-6252)
 + * Require Permission.SELECT for CAS updates (CASSANDRA-6247)
 + * New CQL-aware SSTableWriter (CASSANDRA-5894)
 + * Reject CAS operation when the protocol v1 is used (CASSANDRA-6270)
 + * Correctly throw error when frame too large (CASSANDRA-5981)
 + * Fix serialization bug in PagedRange with 2ndary indexes (CASSANDRA-6299)
 + * Fix CQL3 table validation in Thrift (CASSANDRA-6140)
 + * Fix bug missing results with IN clauses (CASSANDRA-6327)
 + * Fix paging with reversed slices (CASSANDRA-6343)
 + * Set minTimestamp correctly to be able to drop expired sstables 
(CASSANDRA-6337)
 + * Support NaN and Infinity as float literals (CASSANDRA-6003)
 + * Remove RF from nodetool ring output (CASSANDRA-6289)
 + * Fix attempting to flush empty rows (CASSANDRA-6374)
 + * Fix potential out of bounds exception when paging (CASSANDRA-6333)
 +Merged from 1.2:
 + * Optimize FD phi calculation (CASSANDRA-6386)
 + * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
 + * Don't list CQL3 table in CLI describe even if named explicitely 
(CASSANDRA-5750)
   * Invalidate row cache when dropping CF (CASSANDRA-6351)
   * add non-jamm path for cached statements (CASSANDRA-6293)
   * (Hadoop) Require CFRR batchSize to be at least 2 (CASSANDRA-6114)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b2da839f/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --cc src/java/org/apache/cassandra/db/marshal/CollectionType.java
index f922d56,a34a2b7..9408980
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@@ -20,9 -20,11 +20,12 @@@ package org.apache.cassandra.db.marshal
  import java.nio.ByteBuffer;
  import java.util.List;
  
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
+ 
  import org.apache.cassandra.cql3.CQL3Type;
 -import org.apache.cassandra.db.IColumn;
 +import org.apache.cassandra.db.Column;
 +import org.apache.cassandra.serializers.MarshalException;
  import org.apache.cassandra.utils.ByteBufferUtil;
  import org.apache.cassandra.utils.Pair;
  


[3/3] git commit: Fix merge

2013-12-03 Thread slebresne
Fix merge


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1334f94e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1334f94e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1334f94e

Branch: refs/heads/cassandra-2.0
Commit: 1334f94e40ce5dbed7270808abb2330ea6d37c51
Parents: b2da839
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:56:19 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:56:19 2013 +0100

--
 src/java/org/apache/cassandra/db/marshal/CollectionType.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1334f94e/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/CollectionType.java 
b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
index 9408980..07c86e0 100644
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@ -113,7 +113,7 @@ public abstract class CollectionType<T> extends AbstractType<T>
         return (ByteBuffer)result.flip();
     }
 
-    protected List<Pair<ByteBuffer, IColumn>> enforceLimit(List<Pair<ByteBuffer, IColumn>> columns)
+    protected List<Pair<ByteBuffer, Column>> enforceLimit(List<Pair<ByteBuffer, Column>> columns)
     {
         if (columns.size() <= MAX_ELEMENTS)
             return columns;



[1/3] git commit: Warn when a read collection has > 64k elements

2013-12-03 Thread slebresne
Updated Branches:
  refs/heads/cassandra-2.0 672496430 -> 1334f94e4


Warn when a read collection has > 64k elements

patch by slebresne; reviewed by iamaleksey for CASSANDRA-5428


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f634ac7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f634ac7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f634ac7e

Branch: refs/heads/cassandra-2.0
Commit: f634ac7eae468b944d22951fc7c9d05aa6c7f447
Parents: ecd9422
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:53:33 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:53:33 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c80a00a..8e6cffa 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -8,6 +8,7 @@
  * Throw IRE if a prepared has more markers than supported (CASSANDRA-5598)
  * Expose Thread metrics for the native protocol server (CASSANDRA-6234)
  * Change snapshot response message verb (CASSANDRA-6415)
+ * Warn when collection read has > 65K elements (CASSANDRA-5428)
 
 
 1.2.12

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/CollectionType.java 
b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
index ad2ea67..a34a2b7 100644
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@ -20,6 +20,9 @@ package org.apache.cassandra.db.marshal;
 import java.nio.ByteBuffer;
 import java.util.List;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.cassandra.db.IColumn;
 import org.apache.cassandra.utils.ByteBufferUtil;
@@ -33,6 +36,10 @@ import org.apache.cassandra.utils.Pair;
  */
 public abstract class CollectionType<T> extends AbstractType<T>
 {
+    private static final Logger logger = LoggerFactory.getLogger(CollectionType.class);
+
+    public static final int MAX_ELEMENTS = 65535;
+
     public enum Kind
     {
         MAP, SET, LIST
@@ -105,6 +112,16 @@ public abstract class CollectionType<T> extends AbstractType<T>
         return (ByteBuffer)result.flip();
     }
 
+    protected List<Pair<ByteBuffer, IColumn>> enforceLimit(List<Pair<ByteBuffer, IColumn>> columns)
+    {
+        if (columns.size() <= MAX_ELEMENTS)
+            return columns;
+
+        logger.error("Detected collection with {} elements, more than the {} limit. Only the first {} elements will be returned to the client. "
+                   + "Please see http://cassandra.apache.org/doc/cql3/CQL.html#collections for more details.", columns.size(), MAX_ELEMENTS, MAX_ELEMENTS);
+        return columns.subList(0, MAX_ELEMENTS);
+    }
+
     public static ByteBuffer pack(List<ByteBuffer> buffers, int elements)
     {
         int size = 0;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/ListType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/ListType.java 
b/src/java/org/apache/cassandra/db/marshal/ListType.java
index b6613ae..b219af1 100644
--- a/src/java/org/apache/cassandra/db/marshal/ListType.java
+++ b/src/java/org/apache/cassandra/db/marshal/ListType.java
@@ -120,6 +120,8 @@ public class ListType<T> extends CollectionType<List<T>>
 
     public ByteBuffer serialize(List<Pair<ByteBuffer, IColumn>> columns)
     {
+        columns = enforceLimit(columns);
+
         List<ByteBuffer> bbs = new ArrayList<ByteBuffer>(columns.size());
         int size = 0;
         for (Pair<ByteBuffer, IColumn> p : columns)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/MapType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/MapType.java 
b/src/java/org/apache/cassandra/db/marshal/MapType.java
index 19310df..750851e 100644
--- a/src/java/org/apache/cassandra/db/marshal/MapType.java
+++ b/src/java/org/apache/cassandra/db/marshal/MapType.java
@@ -137,6 +137,8 @@ public class 

[4/4] git commit: Merge branch 'cassandra-2.0' into trunk

2013-12-03 Thread slebresne
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b34d43f9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b34d43f9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b34d43f9

Branch: refs/heads/trunk
Commit: b34d43f9747d2ebc1feb516d9675801bcd293d8b
Parents: d12a0d7 1334f94
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:57:06 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:57:06 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b34d43f9/CHANGES.txt
--



[1/4] git commit: Warn when a read collection has > 64k elements

2013-12-03 Thread slebresne
Updated Branches:
  refs/heads/trunk d12a0d7b0 -> b34d43f97


Warn when a read collection has > 64k elements

patch by slebresne; reviewed by iamaleksey for CASSANDRA-5428


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f634ac7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f634ac7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f634ac7e

Branch: refs/heads/trunk
Commit: f634ac7eae468b944d22951fc7c9d05aa6c7f447
Parents: ecd9422
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:53:33 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:53:33 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index c80a00a..8e6cffa 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -8,6 +8,7 @@
  * Throw IRE if a prepared has more markers than supported (CASSANDRA-5598)
  * Expose Thread metrics for the native protocol server (CASSANDRA-6234)
  * Change snapshot response message verb (CASSANDRA-6415)
+ * Warn when collection read has > 65K elements (CASSANDRA-5428)
 
 
 1.2.12

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/CollectionType.java 
b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
index ad2ea67..a34a2b7 100644
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@ -20,6 +20,9 @@ package org.apache.cassandra.db.marshal;
 import java.nio.ByteBuffer;
 import java.util.List;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.cassandra.cql3.CQL3Type;
 import org.apache.cassandra.db.IColumn;
 import org.apache.cassandra.utils.ByteBufferUtil;
@@ -33,6 +36,10 @@ import org.apache.cassandra.utils.Pair;
  */
 public abstract class CollectionType<T> extends AbstractType<T>
 {
+    private static final Logger logger = LoggerFactory.getLogger(CollectionType.class);
+
+    public static final int MAX_ELEMENTS = 65535;
+
     public enum Kind
     {
         MAP, SET, LIST
@@ -105,6 +112,16 @@ public abstract class CollectionType<T> extends AbstractType<T>
         return (ByteBuffer)result.flip();
     }
 
+    protected List<Pair<ByteBuffer, IColumn>> enforceLimit(List<Pair<ByteBuffer, IColumn>> columns)
+    {
+        if (columns.size() <= MAX_ELEMENTS)
+            return columns;
+
+        logger.error("Detected collection with {} elements, more than the {} limit. Only the first {} elements will be returned to the client. "
+                   + "Please see http://cassandra.apache.org/doc/cql3/CQL.html#collections for more details.", columns.size(), MAX_ELEMENTS, MAX_ELEMENTS);
+        return columns.subList(0, MAX_ELEMENTS);
+    }
+
     public static ByteBuffer pack(List<ByteBuffer> buffers, int elements)
     {
         int size = 0;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/ListType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/ListType.java 
b/src/java/org/apache/cassandra/db/marshal/ListType.java
index b6613ae..b219af1 100644
--- a/src/java/org/apache/cassandra/db/marshal/ListType.java
+++ b/src/java/org/apache/cassandra/db/marshal/ListType.java
@@ -120,6 +120,8 @@ public class ListType<T> extends CollectionType<List<T>>
 
     public ByteBuffer serialize(List<Pair<ByteBuffer, IColumn>> columns)
     {
+        columns = enforceLimit(columns);
+
         List<ByteBuffer> bbs = new ArrayList<ByteBuffer>(columns.size());
         int size = 0;
         for (Pair<ByteBuffer, IColumn> p : columns)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f634ac7e/src/java/org/apache/cassandra/db/marshal/MapType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/MapType.java 
b/src/java/org/apache/cassandra/db/marshal/MapType.java
index 19310df..750851e 100644
--- a/src/java/org/apache/cassandra/db/marshal/MapType.java
+++ b/src/java/org/apache/cassandra/db/marshal/MapType.java
@@ -137,6 +137,8 @@ public class MapType<K, V> extends 

[2/4] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-12-03 Thread slebresne
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b2da839f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b2da839f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b2da839f

Branch: refs/heads/trunk
Commit: b2da839f076f14f35c5591b39736c8d7241974ee
Parents: 6724964 f634ac7
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:54:41 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:54:41 2013 +0100

--
 CHANGES.txt|  1 +
 .../cassandra/db/marshal/CollectionType.java   | 17 +
 .../org/apache/cassandra/db/marshal/ListType.java  |  2 ++
 .../org/apache/cassandra/db/marshal/MapType.java   |  2 ++
 .../org/apache/cassandra/db/marshal/SetType.java   |  2 ++
 5 files changed, 24 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b2da839f/CHANGES.txt
--
diff --cc CHANGES.txt
index 11f4c09,8e6cffa..a7ab215
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -13,44 -8,10 +13,45 @@@ Merged from 1.2
   * Throw IRE if a prepared has more markers than supported (CASSANDRA-5598)
   * Expose Thread metrics for the native protocol server (CASSANDRA-6234)
   * Change snapshot response message verb (CASSANDRA-6415)
+  * Warn when collection read has > 65K elements (CASSANDRA-5428)
  
  
 -1.2.12
 +2.0.3
 + * Fix FD leak on slice read path (CASSANDRA-6275)
 + * Cancel read meter task when closing SSTR (CASSANDRA-6358)
 + * free off-heap IndexSummary during bulk (CASSANDRA-6359)
 + * Recover from IOException in accept() thread (CASSANDRA-6349)
 + * Improve Gossip tolerance of abnormally slow tasks (CASSANDRA-6338)
 + * Fix trying to hint timed out counter writes (CASSANDRA-6322)
 + * Allow restoring specific columnfamilies from archived CL (CASSANDRA-4809)
 + * Avoid flushing compaction_history after each operation (CASSANDRA-6287)
 + * Fix repair assertion error when tombstones expire (CASSANDRA-6277)
 + * Skip loading corrupt key cache (CASSANDRA-6260)
 + * Fixes for compacting larger-than-memory rows (CASSANDRA-6274)
 + * Compact hottest sstables first and optionally omit coldest from
 +   compaction entirely (CASSANDRA-6109)
 + * Fix modifying column_metadata from thrift (CASSANDRA-6182)
 + * cqlsh: fix LIST USERS output (CASSANDRA-6242)
 + * Add IRequestSink interface (CASSANDRA-6248)
 + * Update memtable size while flushing (CASSANDRA-6249)
 + * Provide hooks around CQL2/CQL3 statement execution (CASSANDRA-6252)
 + * Require Permission.SELECT for CAS updates (CASSANDRA-6247)
 + * New CQL-aware SSTableWriter (CASSANDRA-5894)
 + * Reject CAS operation when the protocol v1 is used (CASSANDRA-6270)
 + * Correctly throw error when frame too large (CASSANDRA-5981)
 + * Fix serialization bug in PagedRange with 2ndary indexes (CASSANDRA-6299)
 + * Fix CQL3 table validation in Thrift (CASSANDRA-6140)
 + * Fix bug missing results with IN clauses (CASSANDRA-6327)
 + * Fix paging with reversed slices (CASSANDRA-6343)
 + * Set minTimestamp correctly to be able to drop expired sstables 
(CASSANDRA-6337)
 + * Support NaN and Infinity as float literals (CASSANDRA-6003)
 + * Remove RF from nodetool ring output (CASSANDRA-6289)
 + * Fix attempting to flush empty rows (CASSANDRA-6374)
 + * Fix potential out of bounds exception when paging (CASSANDRA-6333)
 +Merged from 1.2:
 + * Optimize FD phi calculation (CASSANDRA-6386)
 + * Improve initial FD phi estimate when starting up (CASSANDRA-6385)
 + * Don't list CQL3 table in CLI describe even if named explicitely 
(CASSANDRA-5750)
   * Invalidate row cache when dropping CF (CASSANDRA-6351)
   * add non-jamm path for cached statements (CASSANDRA-6293)
   * (Hadoop) Require CFRR batchSize to be at least 2 (CASSANDRA-6114)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b2da839f/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --cc src/java/org/apache/cassandra/db/marshal/CollectionType.java
index f922d56,a34a2b7..9408980
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@@ -20,9 -20,11 +20,12 @@@ package org.apache.cassandra.db.marshal
  import java.nio.ByteBuffer;
  import java.util.List;
  
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
+ 
  import org.apache.cassandra.cql3.CQL3Type;
 -import org.apache.cassandra.db.IColumn;
 +import org.apache.cassandra.db.Column;
 +import org.apache.cassandra.serializers.MarshalException;
  import org.apache.cassandra.utils.ByteBufferUtil;
  import org.apache.cassandra.utils.Pair;
  


[3/4] git commit: Fix merge

2013-12-03 Thread slebresne
Fix merge


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1334f94e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1334f94e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1334f94e

Branch: refs/heads/trunk
Commit: 1334f94e40ce5dbed7270808abb2330ea6d37c51
Parents: b2da839
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Dec 3 14:56:19 2013 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Dec 3 14:56:19 2013 +0100

--
 src/java/org/apache/cassandra/db/marshal/CollectionType.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1334f94e/src/java/org/apache/cassandra/db/marshal/CollectionType.java
--
diff --git a/src/java/org/apache/cassandra/db/marshal/CollectionType.java 
b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
index 9408980..07c86e0 100644
--- a/src/java/org/apache/cassandra/db/marshal/CollectionType.java
+++ b/src/java/org/apache/cassandra/db/marshal/CollectionType.java
@@ -113,7 +113,7 @@ public abstract class CollectionType<T> extends AbstractType<T>
         return (ByteBuffer)result.flip();
     }
 
-    protected List<Pair<ByteBuffer, IColumn>> enforceLimit(List<Pair<ByteBuffer, IColumn>> columns)
+    protected List<Pair<ByteBuffer, Column>> enforceLimit(List<Pair<ByteBuffer, Column>> columns)
    {
         if (columns.size() <= MAX_ELEMENTS)
             return columns;



[jira] [Created] (CASSANDRA-6436) AbstractColumnFamilyInputFormat does not use start and end tokens configured via ConfigHelper.setInputRange()

2013-12-03 Thread Paulo Ricardo Motta Gomes (JIRA)
Paulo Ricardo Motta Gomes created CASSANDRA-6436:


 Summary: AbstractColumnFamilyInputFormat does not use start and 
end tokens configured via ConfigHelper.setInputRange()
 Key: CASSANDRA-6436
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6436
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Paulo Ricardo Motta Gomes
 Fix For: 1.2.6


ConfigHelper allows setting a token input range via the setInputRange(conf, 
startToken, endToken) call (ConfigHelper:254).

We used this feature to limit a Hadoop job's range to a single Cassandra node's 
range, or even to a single row key, mostly for testing purposes. 

This worked before the fix for CASSANDRA-5536 
(https://github.com/apache/cassandra/commit/aaf18bd08af50bbaae0954d78d5e6cbb684aded9),
 but since that change ColumnFamilyInputFormat never uses the value of 
KeyRange.start_token when defining the input splits 
(AbstractColumnFamilyInputFormat:142-160), only KeyRange.start_key, which 
needs an order-preserving partitioner to work.

I propose the attached fix in order to allow defining Cassandra token ranges 
for a given Hadoop job even when using a non-order-preserving partitioner.

Example use of ConfigHelper.setInputRange(conf, startToken, endToken) to limit 
the range to a single Cassandra key with RandomPartitioner: 

IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration());
Token token = part.getToken(ByteBufferUtil.bytes("Cassandra Key"));
BigInteger endToken = (BigInteger) new BigIntegerConverter().convert(BigInteger.class, part.getTokenFactory().toString(token));
BigInteger startToken = endToken.subtract(new BigInteger("1"));
ConfigHelper.setInputRange(job.getConfiguration(), startToken.toString(), endToken.toString());



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6436) AbstractColumnFamilyInputFormat does not use start and end tokens configured via ConfigHelper.setInputRange()

2013-12-03 Thread Paulo Ricardo Motta Gomes (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Ricardo Motta Gomes updated CASSANDRA-6436:
-

Attachment: cassandra-1.2-6436.txt

Fix patch attached.

 AbstractColumnFamilyInputFormat does not use start and end tokens configured 
 via ConfigHelper.setInputRange()
 -

 Key: CASSANDRA-6436
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6436
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Paulo Ricardo Motta Gomes
  Labels: hadoop, patch
 Fix For: 1.2.6

 Attachments: cassandra-1.2-6436.txt, cassandra-1.2-6436.txt


 ConfigHelper allows setting a token input range via the setInputRange(conf, 
 startToken, endToken) call (ConfigHelper:254).
 We used this feature to limit a Hadoop job's range to a single Cassandra node's 
 range, or even to a single row key, mostly for testing purposes. 
 This worked before the fix for CASSANDRA-5536 
 (https://github.com/apache/cassandra/commit/aaf18bd08af50bbaae0954d78d5e6cbb684aded9),
  but since that change ColumnFamilyInputFormat never uses the value of 
 KeyRange.start_token when defining the input splits 
 (AbstractColumnFamilyInputFormat:142-160), only KeyRange.start_key, which 
 needs an order-preserving partitioner to work.
 I propose the attached fix in order to allow defining Cassandra token ranges 
 for a given Hadoop job even when using a non-order-preserving partitioner.
 Example use of ConfigHelper.setInputRange(conf, startToken, endToken) to 
 limit the range to a single Cassandra key with RandomPartitioner: 
 IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration());
 Token token = part.getToken(ByteBufferUtil.bytes("Cassandra Key"));
 BigInteger endToken = (BigInteger) new BigIntegerConverter().convert(BigInteger.class, part.getTokenFactory().toString(token));
 BigInteger startToken = endToken.subtract(new BigInteger("1"));
 ConfigHelper.setInputRange(job.getConfiguration(), startToken.toString(), 
 endToken.toString());



--
This message was sent by Atlassian JIRA
(v6.1#6144)



[jira] [Assigned] (CASSANDRA-5864) Scrub should discard the columns from CFMetaData.droppedColumns map (if they are old enough)

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-5864:
-

Assignee: Tyler Hobbs  (was: Jonathan Ellis)

WDYT [~thobbs]? ^

 Scrub should discard the columns from CFMetaData.droppedColumns map (if they 
 are old enough)
 

 Key: CASSANDRA-5864
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5864
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Tyler Hobbs
Priority: Minor
  Labels: scrub

 CASSANDRA-3919 restored ALTER TABLE DROP support in CQL3 and it would be nice 
 to make scrub dropped-columns-aware.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6127.
---

   Resolution: Fixed
Reproduced In: 1.2.9, 1.2.6  (was: 1.2.6, 1.2.9)

I think we've addressed the major problems in the related tickets above.  No 
single culprit.

 vnodes don't scale to hundreds of nodes
 ---

 Key: CASSANDRA-6127
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6127
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Any cluster that has vnodes and consists of hundreds of 
 physical nodes.
Reporter: Tupshin Harper
Assignee: Jonathan Ellis
 Attachments: 2013-11-05_18-04-03_no_compression_cpu_time.png, 
 2013-11-05_18-09-38_compression_on_cpu_time.png, 6000vnodes.patch, 
 AdjustableGossipPeriod.patch, cpu-vs-token-graph.png, 
 delayEstimatorUntilStatisticallyValid.patch, flaps-vs-tokens.png, vnodes  
 gossip flaps.png


 There are a lot of gossip-related issues related to very wide clusters that 
 also have vnodes enabled. Let's use this ticket as a master in case there are 
 sub-tickets.
 The most obvious symptom I've seen is with 1000 nodes in EC2 with m1.xlarge 
 instances. Each node configured with 32 vnodes.
 Without vnodes, cluster spins up fine and is ready to handle requests within 
 30 minutes or less. 
 With vnodes, nodes are reporting constant up/down flapping messages with no 
 external load on the cluster. After a couple of hours, they were still 
 flapping, had very high cpu load, and the cluster never looked like it was 
 going to stabilize or be useful for traffic.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6428) Inconsistency in CQL native protocol

2013-12-03 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837745#comment-13837745
 ] 

Sylvain Lebresne commented on CASSANDRA-6428:
-

As said in CASSANDRA-5428, the use of a short in the native protocol is, to a 
large extent, an oversight. We'll fix it in the next iteration of the binary 
protocol, but we can't break every driver for that in the current version (for 
hopefully obvious reasons).

Collections are not meant to be large, because that just doesn't work well from 
a language point of view. A CQL collection is, from an API point of view, just 
one CQL column in one CQL row. This is not the right place for large things, 
where by "large" I mean something that is not meant to be queried in its 
entirety. CQL provides the notion of clustering columns that allow rows to be 
kept sorted and ranges of them to be queried. That is the right place for 
large things.

Now there is some wiggle room in what "large" (again, in the sense of "always 
fetched entirely") means, and it depends on the use case. So maybe a collection 
would actually be a great fit for you and the 64k limit is just a little too 
low. Sorry if that's the case; we'll lift that limitation at some point, but 
again, we just can't break everyone by changing the collection format in the 
current version of the protocol.

As for alternatives, you can split into 2 tables. Or maybe, if the problem is 
just that the current limit is slightly too low, you have an easy way to 
distribute your set values into 5-10 separate set columns. Or otherwise, you 
can always do it like you would have in Thrift, and use a table like:
{noformat}
CREATE TABLE t (
  pk text,
  name text,
  value blob,
  PRIMARY KEY (pk, name)
)
{noformat}
where name would be both the name of your static properties (so 'val1', 
'val2', etc.) and your set elements (maybe with a prefix character to make 
sure they don't conflict with the other properties), and value would be the 
values for the static properties and just an empty blob for the set elements. 
It does mean you need to handle the encoding/decoding of your property values 
client side, but it's not *that* hard either.
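
To illustrate that encoding (hypothetical values; the 's:' prefix is just one 
way to keep set elements from clashing with property names):
{noformat}
-- static properties: encoded value stored in the blob
INSERT INTO t (pk, name, value) VALUES ('key1', 'val1', 0x736f6d652076616c7565);
-- set elements: membership is the column itself, value left empty
INSERT INTO t (pk, name, value) VALUES ('key1', 's:element1', 0x);
INSERT INTO t (pk, name, value) VALUES ('key1', 's:element2', 0x);
-- read back just the set via a slice on the clustering column:
SELECT name FROM t WHERE pk = 'key1' AND name >= 's:' AND name < 't';
{noformat}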

And I'm sure there can be other ideas, but I don't know the details of your use 
case (and tbh, this is not the right place to discuss modeling questions). 

 Inconsistency in CQL native protocol
 

 Key: CASSANDRA-6428
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6428
 Project: Cassandra
  Issue Type: Bug
Reporter: Jan Chochol

 We are trying to use Cassandra CQL3 collections (sets and maps) for 
 denormalizing data.
 The problem arises when the size of these collections goes above some limit. 
 We found that the current limitation is 64k - 1 (65535) items in a collection.
 We found that there is an inconsistency in the CQL binary protocol (all 
 currently available versions). 
 In the protocol (for a set) there are these fields:
 {noformat}
 [value size: int] [items count: short] [items] ...
 {noformat}
 One example in our case (a collection with 65536 elements):
 {noformat}
 00 21 ff ee 00 00 00 20 30 30 30 30 35 63 38 69 65 33 67 37 73 61 ...
 {noformat}
 So the decoded {{value size}} is 1245166 bytes and the {{items count}} is 0 
 (65536 wraps around to 0 in an unsigned short).
 This is wrong - you cannot have a collection with 0 items occupying more than 
 1MB.
 I understand that you cannot store more than 65535 in an unsigned short, but I 
 do not understand why there is such a limitation in the protocol, when all the 
 data is currently sent.
 In this case we have several possibilities:
 * ignore the {{items count}} field and read all bytes specified in {{value size}}
 ** the problem is that we cannot be sure this behaviour will be kept in 
 future versions of Cassandra, as it is quite strange
 * refactor our code to use only small collections (this seems quite odd, as 
 Cassandra has no problems with wide rows)
 * do not use collections, and fall back to wide rows
 * wait for a change in the protocol removing the unnecessary limitation
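
The 0 items count reported above is just 16-bit wraparound: the protocol writes 
the count as an unsigned short, so 65536 mod 65536 = 0. A minimal standalone 
sketch of the decode arithmetic (not driver code, just the wraparound):
{noformat}
import java.nio.ByteBuffer;

public class ShortWrapDemo
{
    public static void main(String[] args)
    {
        int elements = 65536;
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort((short) elements);        // encoder truncates to 16 bits
        buf.flip();
        int decoded = buf.getShort() & 0xFFFF; // decoder reads back an unsigned short
        System.out.println(decoded);           // prints 0
    }
}
{noformat}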



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-2338) C* consistency level needs to be pluggable

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2338.
---

Resolution: Won't Fix

I don't think this is actually feasible.

 C* consistency level needs to be pluggable
 --

 Key: CASSANDRA-2338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2338
 Project: Cassandra
  Issue Type: New Feature
Reporter: Matthew F. Dennis
Priority: Minor

 for cases where people want to run C* across multiple DCs for disaster 
 recovery et cetera, where normal operations only happen in the first DC (e.g. 
 no writes/reads happen in the remote DC under normal operation), neither 
 LOCAL_QUORUM nor EACH_QUORUM really suffices.  
 Consider the case with an RF of DC1:3 DC2:2.
 LOCAL_QUORUM doesn't provide any guarantee that data is in the remote DC.
 EACH_QUORUM requires that both nodes in the remote DC are up (a quorum of 
 RF 2 is 2 nodes, so no DC2 failure can be tolerated).
 It would be useful in some situations to be able to specify a strategy where 
 LOCAL_QUORUM is used for the local DC, plus at least one node in a remote DC 
 (and/or at least one in *each* remote DC).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6259) Cassandra 2.0.1 server has too many tcp connections in CLOSE_WAIT

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6259.
---

Resolution: Cannot Reproduce

 Cassandra 2.0.1 server has too many tcp connections in CLOSE_WAIT
 -

 Key: CASSANDRA-6259
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6259
 Project: Cassandra
  Issue Type: Bug
Reporter: Prateek
Assignee: Sylvain Lebresne

 We are using cassandra 2.0.1 server with cascading client. The cassandra tap 
 used is https://github.com/ifesdjeen/cascading-cassandra (1.0.0-rc6). The 
 problem arises after the server is running for a few days. The server has 
 100,000+ connections in tcp CLOSE_WAIT state and cannot accept any more 
 connections. All map reduce jobs start failing. This seems to be a bug with 
 cassandra 2.0.1 server not closing connections properly.
 [(bloomreach-ami) ubuntu@ip-10-91-15-6 :/mnt/cassandra/data]# lsof -n | grep 
 java | grep CLOSE_WAIT | wc -l
 116321
 java  25427  ubuntu *537u IPv4 9337512 0t0  
   TCP 10.91.15.6:9042->10.171.11.168:34217 (CLOSE_WAIT)
 java  25427  ubuntu *540u IPv4 9107933 0t0  
   TCP 10.91.15.6:9042->10.92.99.19:45820 (CLOSE_WAIT)
 java  25427  ubuntu *543u IPv4 9110100 0t0  
   TCP 10.91.15.6:9042->10.86.106.249:47585 (CLOSE_WAIT)
 java  25427  ubuntu *544u IPv4 9110072 0t0  
   TCP 10.91.15.6:9042->10.86.106.249:47364 (CLOSE_WAIT)
 java  25427  ubuntu *546u IPv4 9110110 0t0  
   TCP 10.91.15.6:9042->10.92.99.19:46162 (CLOSE_WAIT)
 java  25427  ubuntu *547u IPv4 9110093 0t0  
   TCP 10.91.15.6:9042->10.86.106.249:47518 (CLOSE_WAIT)
 java  25427  ubuntu *548u IPv4 9337583 0t0  
   TCP 10.91.15.6:9042->10.171.11.168:34361 (CLOSE_WAIT)
 java  25427  ubuntu *549u IPv4 9110114 0t0  
   TCP 10.91.15.6:9042->10.92.99.19:46212 (CLOSE_WAIT)
 java  25427  ubuntu *551u IPv4 9110117 0t0  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6437) Datastax C# driver not able to execute CAS after upgrade 2.0.2 -> 2.0.3

2013-12-03 Thread JIRA
Michał Ziemski created CASSANDRA-6437:
-

 Summary: Datastax C# driver not able to execute CAS after upgrade 
2.0.2 -> 2.0.3
 Key: CASSANDRA-6437
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6437
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Drivers (now out of tree)
 Environment: 4 node CentOS 6.4 x64 cassandra 2.0.3 (datastax community)
Reporter: Michał Ziemski


The following code:
  var cl = 
Cluster.Builder().WithConnectionString(ConfigurationManager.ConnectionStrings["Cassandra"].ConnectionString).Build();
  var ses = cl.Connect();
  ses.Execute("INSERT INTO appsrv(id) values ('abc') IF NOT EXISTS", 
ConsistencyLevel.Quorum);

Worked fine with cassandra 2.0.2.
After upgrading to 2.0.3 I get an error stating that conditional updates are 
not supported by the protocol version and that I should upgrade to v2.

I'm not really sure if it's a problem with C* or the Datastax C# Driver.
The error appeared after upgrading C*, so I decided to post it here.




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6438) Decide if we want to make user types keyspace scoped

2013-12-03 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-6438:
---

 Summary: Decide if we want to make user types keyspace scoped
 Key: CASSANDRA-6438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6438
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne


Currently, user types are declared at the top level. I wonder however if we 
might not want to make them scoped to a given keyspace. It was not done in the 
initial patch for simplicity and because I was not sure of the advantages of 
doing so. However, if we ever want to use user types in system tables, having 
them scoped by keyspace means we won't have to care about the new type 
conflicting with another existing type.

Besides, having user types be part of a keyspace would allow for slightly more 
fine grained permissions on them. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6438) Decide if we want to make user types keyspace scoped

2013-12-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837785#comment-13837785
 ] 

Jonathan Ellis commented on CASSANDRA-6438:
---

Makes sense to me.  (And I note that postgresql user types are schema-scoped.)

 Decide if we want to make user types keyspace scoped
 

 Key: CASSANDRA-6438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6438
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne

 Currently, user types are declared at the top level. I wonder however if we 
 might not want to make them scoped to a given keyspace. It was not done in 
 the initial patch for simplicity and because I was not sure of the advantages 
 of doing so. However, if we ever want to use user types in system tables, 
 having them scoped by keyspace means we won't have to care about the new type 
 conflicting with another existing type.
 Besides, having user types be part of a keyspace would allow for slightly 
 more fine grained permissions on them. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6433) snapshot race with compaction causes missing link error

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6433.
---

Resolution: Duplicate

If it's from drop/recreate, it will be fixed by CASSANDRA-5202.

 snapshot race with compaction causes missing link error
 ---

 Key: CASSANDRA-6433
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6433
 Project: Cassandra
  Issue Type: Bug
 Environment: EL6
 Oracle Java 1.7.40
Reporter: Karl Mueller
Priority: Minor

 Cassandra 1.2.11
 When trying to snapshot, I encountered this error. It appears that snapshot 
 doesn't lock the sstable list in a keyspace, which can cause a race condition 
 with compaction. (I think it's compaction, at least.)
 [cassandra@dev-cass00 ~]$ cas cluster snap pre-1.2.12
 *** dev-cass01 (1) ***
  
 Nodetool command snapshot -t pre-1.2.12 failed!
  
 Output:
  
 Requested creating snapshot for: all keyspaces
 Exception in thread "main" java.lang.RuntimeException: Tried to hard link to 
 file that does not exist 
 /data2/data-cassandra/csprocessor/csprocessor/csprocessor-csprocessor-ic-4-Summary.db
 at 
 org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:72)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1095)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1567)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1612)
 at org.apache.cassandra.db.Table.snapshot(Table.java:194)
 at 
 org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:2233)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
 at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
 at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
 at 
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
 at 
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
 at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
 at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
 at 
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
 at 
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
 at 
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
 at 
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
 at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
 at sun.rmi.transport.Transport$1.run(Transport.java:177)
 at sun.rmi.transport.Transport$1.run(Transport.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
 at 
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
 at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
 at 
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
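A minimal reproduction of the suspected race shape (hypothetical files and 
paths; this is not Cassandra's snapshot code): the snapshot iterates a file 
list captured before another actor removes a component, and the hard link then 
fails just like the trace above.

{code:title=Stale-listing hard-link race (sketch)}
import os
import tempfile

src_dir, snap_dir = tempfile.mkdtemp(), tempfile.mkdtemp()
path = os.path.join(src_dir, "example-ic-4-Summary.db")
open(path, "w").close()

listing = [path]   # "snapshot" captures the sstable list...
os.remove(path)    # ..."compaction" deletes a component concurrently

for f in listing:  # hard-linking from the stale list now fails
    os.link(f, os.path.join(snap_dir, os.path.basename(f)))
{code}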



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6218) Reduce WAN traffic while doing repairs

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6218.
---

Resolution: Fixed

Go ahead and create a new ticket then because it will probably target a 
different release than this one.

 Reduce WAN traffic while doing repairs
 --

 Key: CASSANDRA-6218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: sankalp kohli
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: trunk-6218-v2.txt, trunk-6218-v3.patch, trunk-6218.txt


 The way we send out data that does not match over WAN can be improved. 
 Example: Say there are four nodes(A,B,C,D) which are replica of a range we 
 are repairing. A, B is in DC1 and C,D is in DC2. If A does not have the data 
 which other replicas have, then we will have following streams
 1) A to B and back
 2) A to C and back(Goes over WAN)
 3) A to D and back(Goes over WAN)
 One of the ways of doing it to reduce WAN traffic is this.
 1) Repair A and B only with each other and C and D with each other starting 
 at same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C, the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C(C and D 
 behaves like a proxy to the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50% and 
 even more for higher replication factors. 
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6435) nodetool outputs xss and jamm errors in 1.2.12

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6435.
---

Resolution: Duplicate

Already fixed the echo so that just leaves the duplicate part.

 nodetool outputs xss and jamm errors in 1.2.12
 --

 Key: CASSANDRA-6435
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6435
 Project: Cassandra
  Issue Type: Bug
Reporter: Karl Mueller
Assignee: Brandon Williams
Priority: Minor

 Since 1.2.12, just running nodetool produces this output. This is probably 
 related to CASSANDRA-6273.
 It's unclear to me whether jamm is actually not being loaded, but nodetool 
 clearly should not be producing this output, which likely comes from 
 cassandra-env.sh.
 [cassandra@dev-cass00 cassandra]$ /data2/cassandra/bin/nodetool ring
 xss =  -ea -javaagent:/data2/cassandra/bin/../lib/jamm-0.2.5.jar 
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms14G -Xmx14G -Xmn1G 
 -XX:+HeapDumpOnOutOfMemoryError -Xss256k
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: datacenter1
 ==========
 Address      Rack   Status  State   Load       Owns    Token
                                                        170141183460469231731687303715884105727
 10.93.15.10  rack1  Up      Normal  123.82 GB  20.00%  34028236692093846346337460743176821145
 10.93.15.11  rack1  Up      Normal  124 GB     20.00%  68056473384187692692674921486353642290
 10.93.15.12  rack1  Up      Normal  123.97 GB  20.00%  102084710076281539039012382229530463436
 10.93.15.13  rack1  Up      Normal  124.03 GB  20.00%  136112946768375385385349842972707284581
 10.93.15.14  rack1  Up      Normal  123.93 GB  20.00%  170141183460469231731687303715884105727
 ERROR 16:20:01,408 Unable to initialize MemoryMeter (jamm not specified as 
 javaagent).  This means Cassandra will be unable to measure object sizes 
 accurately and may consequently OOM.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6431) Prevent same CF from being enqueued to flush more than once

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6431.
---

Resolution: Won't Fix

Let's just worry about 5549 then.  Both fixes are 2.1-scoped so let's only 
bother with this if we can't finish 5549 in time.

 Prevent same CF from being enqueued to flush more than once
 ---

 Key: CASSANDRA-6431
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6431
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Priority: Minor

 As things stand we can, in certain circumstances, fill up the flush queue 
 with multiple requests to flush the same CF, which will lead to all writes 
 blocking until the CF is flushed. Ideally we would only enqueue each 
 CF/Memtable once and, if requested to be flushed whilst already enqueued, 
 mark it to be requeued once the outstanding flush completes.
 On a related note, a single table can already block writes if it has flush 
 queue size or more secondary indexes. At the same time it might be worth 
 deciding if this is also a problem and address it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6437) Datastax C# driver not able to execute CAS after upgrade 2.0.2 -> 2.0.3

2013-12-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837804#comment-13837804
 ] 

Michał Ziemski commented on CASSANDRA-6437:
---

Isn't DataStax C# Driver release 1.0.2 a v2 driver?

 Datastax C# driver not able to execute CAS after upgrade 2.0.2 -> 2.0.3
 ---

 Key: CASSANDRA-6437
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6437
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Drivers (now out of tree)
 Environment: 4 node CentOS 6.4 x64 cassandra 2.0.3 (datastax community)
Reporter: Michał Ziemski

 The following code:
   var cl = 
 Cluster.Builder().WithConnectionString(ConfigurationManager.ConnectionStrings["Cassandra"].ConnectionString).Build();
   var ses = cl.Connect();
   ses.Execute("INSERT INTO appsrv(id) values ('abc') IF NOT EXISTS", 
 ConsistencyLevel.Quorum);
 Worked fine with cassandra 2.0.2.
 After upgrading to 2.0.3 I get an error stating that conditional updates are 
 not supported by the protocol version and that I should upgrade to v2.
 I'm not really sure if it's a problem with C* or the Datastax C# Driver.
 The error appeared after upgrading C*, so I decided to post it here.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6437) Datastax C# driver not able to execute CAS after upgrade 2.0.2 -> 2.0.3

2013-12-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837810#comment-13837810
 ] 

Jonathan Ellis commented on CASSANDRA-6437:
---

No.  Driver 2.x is a v2 driver.

 Datastax C# driver not able to execute CAS after upgrade 2.0.2 -> 2.0.3
 ---

 Key: CASSANDRA-6437
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6437
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Drivers (now out of tree)
 Environment: 4 node CentOS 6.4 x64 cassandra 2.0.3 (datastax community)
Reporter: Michał Ziemski

 The following code:
   var cl = 
 Cluster.Builder().WithConnectionString(ConfigurationManager.ConnectionStrings["Cassandra"].ConnectionString).Build();
   var ses = cl.Connect();
   ses.Execute("INSERT INTO appsrv(id) values ('abc') IF NOT EXISTS", 
 ConsistencyLevel.Quorum);
 Worked fine with cassandra 2.0.2.
 After upgrading to 2.0.3 I get an error stating that conditional updates are 
 not supported by the protocol version and that I should upgrade to v2.
 I'm not really sure if it's a problem with C* or the Datastax C# Driver.
 The error appeared after upgrading C*, so I decided to post it here.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6428) Use 4 bytes to encode collection size in next native protocol version

2013-12-03 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6428:


Summary: Use 4 bytes to encode collection size in next native protocol 
version  (was: Inconsistency in CQL native protocol)

 Use 4 bytes to encode collection size in next native protocol version
 -

 Key: CASSANDRA-6428
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6428
 Project: Cassandra
  Issue Type: Bug
Reporter: Jan Chochol

 We are trying to use Cassandra CQL3 collections (sets and maps) for 
 denormalizing data.
 The problem arises when the size of these collections goes above some limit. 
 We found that the current limitation is 64k - 1 (65535) items in a collection.
 We found that there is an inconsistency in the CQL binary protocol (all 
 currently available versions). 
 In protocol (for set) there are these fields:
 {noformat}
 [value size: int] [items count: short] [items] ...
 {noformat}
 One example in our case (collection with 65536 elements):
 {noformat}
 00 21 ff ee 00 00 00 20 30 30 30 30 35 63 38 69 65 33 67 37 73 61 ...
 {noformat}
 So the decoded {{value size}} is 1245166 bytes and the {{items count}} is 0.
 This is wrong - you cannot have a collection with 0 items occupying more than 
 1 MB.
 I understand that an unsigned short cannot hold more than 65535, but I do not 
 understand why the protocol has such a limitation when all the data is 
 currently sent.
 In this case we have several possibilities:
 * ignore {{items count}} field and read all bytes specified in {{value size}}
 ** the problem is that we cannot be sure this behaviour will be kept in 
 future versions of Cassandra, as it is quite strange
 * refactor our code to use only small collections (this seems quite odd, as 
 Cassandra has no problems with wide rows)
 * do not use collections, and fall back to wide rows
 * wait for change in protocol for removing unnecessary limitation



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6313) Refactor dtests to use python driver instead of cassandra-dbapi2

2013-12-03 Thread Ryan McGuire (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McGuire updated CASSANDRA-6313:


Priority: Major  (was: Minor)

 Refactor dtests to use python driver instead of cassandra-dbapi2
 

 Key: CASSANDRA-6313
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6313
 Project: Cassandra
  Issue Type: Test
Reporter: Ryan McGuire
Assignee: Ryan McGuire

 cassandra-dbapi2 is effectively deprecated. The python driver is the future, 
 we should refactor our dtests to use it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6313) Refactor dtests to use python driver instead of cassandra-dbapi2

2013-12-03 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837843#comment-13837843
 ] 

Ryan McGuire commented on CASSANDRA-6313:
-

You may be right; the correct approach for this is probably to assume that it 
will work in most cases and then see where it breaks (if it even does). I 
can't really remember what I thought wasn't supported.

 Refactor dtests to use python driver instead of cassandra-dbapi2
 

 Key: CASSANDRA-6313
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6313
 Project: Cassandra
  Issue Type: Test
Reporter: Ryan McGuire
Assignee: Ryan McGuire
Priority: Minor

 cassandra-dbapi2 is effectively deprecated. The python driver is the future, 
 we should refactor our dtests to use it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6428) Use 4 bytes to encode collection size in next native protocol version

2013-12-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837864#comment-13837864
 ] 

Ondřej Černoš commented on CASSANDRA-6428:
--

Thanks a lot Sylvain for your time and answers. It is really appreciated.

I think the whole thing boils down to two issues:

* the size of collections in the native protocol, which can be worked around 
now by just ignoring the field in the protocol (the data is fetched from the 
storage now; only the value of the field is incorrect if the size is bigger 
than 64k)
* the usage of collections for mixed cql3 rows (mixing static and dynamic 
content, i.e. mixing narrow-row and wide-row in underlying storage terminology).

We shall probably need to split the above described table (having 20 or so 
static columns and a set hundreds of thousands of elements long) into two 
tables, one for the static columns and the other for the wide row. So instead 
of using:

{noformat}
CREATE TABLE test (
  id text PRIMARY KEY,
  val1 text,
  val2 int,
  val3 timestamp,
  valN text,
  some_set set<text>
)
{noformat}

we will have to have two tables:

{noformat}
CREATE TABLE test_narrow (
  id text PRIMARY KEY,
  val1 text,
  val2 int,
  val3 timestamp,
  valN text
)

CREATE TABLE test_wide (
  id text,
  val text,
  PRIMARY KEY (id, val)
)
{noformat}

The reason is not a modelling one (the first approach is much more comfortable 
and more compliant with the _denormalize everything_ approach), but a 
performance one. The problem is cassandra always performs a range query over 
all the columns of the underlying row if the table is not created with compact 
storage. So a query like {{select val1, val2 from test where id='some_key'}} 
performs poorly if the {{set}} in the table is big (~400 ms primary key lookup 
on a table having roughly 150k records, on a row with a set of roughly 150k 
elements, on a 2 CPU machine with enough memory and the DB all mapped into RAM 
- no disk ops involved), even though we don't fetch the set in the select.

The question is: is this behaviour by design and is this the reason behind the 
recommendation not to use big collections?

I know and agree this is not the best place for modelling questions, but again 
- maybe this is useful for you as the designer of the feature to see how it is 
perceived by users and what issues we run into (by the way, we are new 
cassandra users and we started on cql3 from scratch - we are not thrift 
old-timers). I may take this whole topic to the user list if you wish.

 Use 4 bytes to encode collection size in next native protocol version
 -

 Key: CASSANDRA-6428
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6428
 Project: Cassandra
  Issue Type: Bug
Reporter: Jan Chochol

 We are trying to use Cassandra CQL3 collections (sets and maps) for 
 denormalizing data.
 The problem arises when the size of these collections goes above some limit. 
 We found that the current limitation is 64k - 1 (65535) items in a collection.
 We found that there is an inconsistency in the CQL binary protocol (all 
 currently available versions). 
 In protocol (for set) there are these fields:
 {noformat}
 [value size: int] [items count: short] [items] ...
 {noformat}
 One example in our case (collection with 65536 elements):
 {noformat}
 00 21 ff ee 00 00 00 20 30 30 30 30 35 63 38 69 65 33 67 37 73 61 ...
 {noformat}
 So the decoded {{value size}} is 1245166 bytes and the {{items count}} is 0.
 This is wrong - you cannot have a collection with 0 items occupying more than 
 1 MB.
 I understand that an unsigned short cannot hold more than 65535, but I do not 
 understand why the protocol has such a limitation when all the data is 
 currently sent.
 In this case we have several possibilities:
 * ignore {{items count}} field and read all bytes specified in {{value size}}
 ** the problem is that we cannot be sure this behaviour will be kept in 
 future versions of Cassandra, as it is quite strange
 * refactor our code to use only small collections (this seems quite odd, as 
 Cassandra has no problems with wide rows)
 * do not use collections, and fall back to wide rows
 * wait for change in protocol for removing unnecessary limitation



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)
Francesco Piccinno created CASSANDRA-6439:
-

 Summary: Token ranges are erroneously swapped
 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
resides in the fact that Murmur3 RandomPartitioner is used.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k Crawler
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}
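For what it's worth, the single start > end range above is what a ring's 
wrap-around range looks like (the last range wraps from the maximum token back 
past the minimum), rather than a pair of swapped values. A minimal sketch of 
scanning such a range by splitting it, assuming Murmur3Partitioner token 
bounds:

{code:title=Splitting a wrapping ring range (sketch)}
# Illustrative only; MIN_TOKEN/MAX_TOKEN assume Murmur3Partitioner.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1

def split_wrapping(start, end):
    """Return non-wrapping sub-ranges covering the ring range (start, end]."""
    if start < end:
        return [(start, end)]
    # start > end: the range wraps around the ring; scan it in two pieces.
    return [(start, MAX_TOKEN), (MIN_TOKEN, end)]

print(split_wrapping(9207458196362321348, -9182599474778206823))
{code}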



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Piccinno updated CASSANDRA-6439:
--

Description: 
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k Crawler
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

  was:
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
resides in the fact that Murmur3 RandomPartitioner is used.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k Crawler
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}


 Token ranges are erroneously swapped
 

 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


 I am trying to achieve a linear scan on the data contained in a cassandra 
 node by exploiting tokens. The idea behind my approach is to request through 
 the pycassa SystemManager a list of tokens that the cluster is responsible 
 for, and then for each token, issue a {{key_range}} command specifying a 
 {{start}} and {{end}} interval. The problem is that apparently some tokens 
 returned by the server do not respect the property {{start}} < {{end}}. I 
 think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is 
 used, but I am just guessing here.
 Anyway here are the steps to reproduce the bug:
 {code:title=Triggering the bug}
 $ pycassaShell -k Crawler
 In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
 SYSTEM_MANAGER.describe_ring('Crawler'))
 In [2]: len(filter(lambda x: x[0] < x[1], tokens))
 Out[2]: 255
 In [3]: len(filter(lambda x: x[0] > x[1], tokens))
 Out[3]: 1
 In [4]: filter(lambda x: x[0] > x[1], tokens)
 Out[4]: [(9207458196362321348, -9182599474778206823)]
 In [5]: for i in CF.get_range(start_token=9207458196362321348, 
 finish_token=-9182599474778206823): print i
 # ...
 # after some objects are printed
 # ...
 InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
 after end token")
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Piccinno updated CASSANDRA-6439:
--

Description: 
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

  was:
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k Crawler
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}


 Token ranges are erroneously swapped
 

 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


 I am trying to achieve a linear scan on the data contained in a cassandra 
 node by exploiting tokens. The idea behind my approach is to request through 
 the pycassa SystemManager a list of tokens that the cluster is responsible 
 for, and then for each token, issue a {{key_range}} command specifying a 
 {{start}} and {{end}} interval. The problem is that apparently some tokens 
 returned by the server do not respect the property {{start}} < {{end}}. I 
 think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is 
 used, but I am just guessing here.
 Anyway here are the steps to reproduce the bug:
 {code:title=Triggering the bug}
 $ pycassaShell -k KEYSPACE
 In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
 SYSTEM_MANAGER.describe_ring('Crawler'))
 In [2]: len(filter(lambda x: x[0] < x[1], tokens))
 Out[2]: 255
 In [3]: len(filter(lambda x: x[0] > x[1], tokens))
 Out[3]: 1
 In [4]: filter(lambda x: x[0] > x[1], tokens)
 Out[4]: [(9207458196362321348, -9182599474778206823)]
 In [5]: for i in CF.get_range(start_token=9207458196362321348, 
 finish_token=-9182599474778206823): print i
 # ...
 # after some objects are printed
 # ...
 InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
 after end token")
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Piccinno updated CASSANDRA-6439:
--

Description: 
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('KEYSPACE'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

  was:
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('Crawler'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}


 Token ranges are erroneously swapped
 

 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


 I am trying to achieve a linear scan on the data contained in a cassandra 
 node by exploiting tokens. The idea behind my approach is to request through 
 the pycassa SystemManager a list of tokens that the cluster is responsible 
 for, and then for each token, issue a {{key_range}} command specifying a 
 {{start}} and {{end}} interval. The problem is that apparently some tokens 
 returned by the server do not respect the property {{start}} < {{end}}. I 
 think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is 
 used, but I am just guessing here.
 Anyway here are the steps to reproduce the bug:
 {code:title=Triggering the bug}
 $ pycassaShell -k KEYSPACE
 In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
 SYSTEM_MANAGER.describe_ring('KEYSPACE'))
 In [2]: len(filter(lambda x: x[0] < x[1], tokens))
 Out[2]: 255
 In [3]: len(filter(lambda x: x[0] > x[1], tokens))
 Out[3]: 1
 In [4]: filter(lambda x: x[0] > x[1], tokens)
 Out[4]: [(9207458196362321348, -9182599474778206823)]
 In [5]: for i in CF.get_range(start_token=9207458196362321348, 
 finish_token=-9182599474778206823): print i
 # ...
 # after some objects are printed
 # ...
 InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
 after end token")
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Piccinno updated CASSANDRA-6439:
--

Description: 
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('KEYSPACE'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

I gue

  was:
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('KEYSPACE'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}


 Token ranges are erroneously swapped
 

 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


 I am trying to achieve a linear scan on the data contained in a cassandra 
 node by exploiting tokens. The idea behind my approach is to request through 
 the pycassa SystemManager a list of tokens that the cluster is responsible 
 for, and then for each token, issue a {{key_range}} command specifying 
 a {{start}} and {{end}} interval. The problem is that apparently some tokens 
 returned by the server do not respect the property {{start}} < {{end}}. I 
 think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is 
 used, but I am just guessing here.
 Anyway here are the steps to reproduce the bug:
 {code:title=Triggering the bug}
 $ pycassaShell -k KEYSPACE
 In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
 SYSTEM_MANAGER.describe_ring('KEYSPACE'))
 In [2]: len(filter(lambda x: x[0] < x[1], tokens))
 Out[2]: 255
 In [3]: len(filter(lambda x: x[0] > x[1], tokens))
 Out[3]: 1
 In [4]: filter(lambda x: x[0] > x[1], tokens)
 Out[4]: [(9207458196362321348, -9182599474778206823)]
 In [5]: for i in CF.get_range(start_token=9207458196362321348, 
 finish_token=-9182599474778206823): print i
 # ...
 # after some objects are printed
 # ...
 InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
 after end token")
 {code}
 I gue



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6439) Token ranges are erroneously swapped

2013-12-03 Thread Francesco Piccinno (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Piccinno updated CASSANDRA-6439:
--

Description: 
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('KEYSPACE'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

  was:
I am trying to achieve a linear scan on the data contained in a cassandra node 
by exploiting tokens. The idea behind my approach is to request through the 
pycassa SystemManager a list of tokens that the cluster is responsible for, and 
then for each token, issue a {{key_range}} command specifying a {{start}} and 
{{end}} interval. The problem is that apparently some tokens returned by the 
server do not respect the property {{start}} < {{end}}. I think that the bug 
is due to the fact that {{Murmur3RandomPartitioner}} is used, but I am just 
guessing here.

Anyway here are the steps to reproduce the bug:

{code:title=Triggering the bug}
$ pycassaShell -k KEYSPACE
In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
SYSTEM_MANAGER.describe_ring('KEYSPACE'))

In [2]: len(filter(lambda x: x[0] < x[1], tokens))
Out[2]: 255

In [3]: len(filter(lambda x: x[0] > x[1], tokens))
Out[3]: 1

In [4]: filter(lambda x: x[0] > x[1], tokens)
Out[4]: [(9207458196362321348, -9182599474778206823)]

In [5]: for i in CF.get_range(start_token=9207458196362321348, 
finish_token=-9182599474778206823): print i
# ...
# after some objects are printed
# ...
InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
after end token")
{code}

I gue


 Token ranges are erroneously swapped
 

 Key: CASSANDRA-6439
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6439
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Francesco Piccinno
Priority: Critical
 Fix For: 2.0.0


 I am trying to achieve a linear scan on the data contained in a cassandra 
 node by exploiting tokens. The idea behind my approach is to request through 
 the pycassa SystemManager a list of tokens that the cluster is responsible 
 for, and then for each token, issue a {{key_range}} command specifying 
 a {{start}} and {{end}} interval. The problem is that apparently some tokens 
 returned by the server do not respect the property {{start}} < {{end}}. I 
 think that the bug is due to the fact that {{Murmur3RandomPartitioner}} is 
 used, but I am just guessing here.
 Anyway here are the steps to reproduce the bug:
 {code:title=Triggering the bug}
 $ pycassaShell -k KEYSPACE
 In [1]: tokens = map(lambda x: (int(x.start_token), int(x.end_token)), 
 SYSTEM_MANAGER.describe_ring('KEYSPACE'))
 In [2]: len(filter(lambda x: x[0] < x[1], tokens))
 Out[2]: 255
 In [3]: len(filter(lambda x: x[0] > x[1], tokens))
 Out[3]: 1
 In [4]: filter(lambda x: x[0] > x[1], tokens)
 Out[4]: [(9207458196362321348, -9182599474778206823)]
 In [5]: for i in CF.get_range(start_token=9207458196362321348, 
 finish_token=-9182599474778206823): print i
 # ...
 # after some objects are printed
 # ...
 InvalidRequestException: InvalidRequestException(why="Start key's token sorts 
 after end token")
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5864) Scrub should discard the columns from CFMetaData.droppedColumns map (if they are old enough)

2013-12-03 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837956#comment-13837956
 ] 

Tyler Hobbs commented on CASSANDRA-5864:


I feel like scrub is only for attempting to repair damaged/problematic 
sstables.  If dropped columns are causing problems somehow, I would prefer to 
fix whatever is breaking.  If that's not a good option for some reason, then it 
might make sense to add this to scrub.

 Scrub should discard the columns from CFMetaData.droppedColumns map (if they 
 are old enough)
 

 Key: CASSANDRA-5864
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5864
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Tyler Hobbs
Priority: Minor
  Labels: scrub

 CASSANDRA-3919 restored ALTER TABLE DROP support in CQL3 and it would be nice 
 to make scrub dropped-columns-aware.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-12-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838087#comment-13838087
 ] 

Benedict commented on CASSANDRA-3578:
-

As discussed on IRC, with Batch CL in particular you could see a lot of warning 
messages in the log about the sync lagging. Whilst this was working as 
intended, it was a bit much for BatchCL with a small window.

I've uploaded a patch to 
[https://github.com/belliottsmith/cassandra/tree/iss-3578-4] that should 
address this by aggregating lags into a 5m period, for which we only report if 
the lag exceeds 5% of the wall time of the period. This means it should only 
crop up when it's really appreciably affecting performance.
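For illustration, a minimal sketch of that reporting heuristic (the names and 
structure here are illustrative, not the patch's actual code):

{code:title=Windowed lag reporting (sketch)}
WINDOW_SECONDS = 5 * 60          # the 5m aggregation period
THRESHOLD = 0.05                 # report only above 5% of wall time

class LagReporter(object):
    def __init__(self):
        self.window_start = 0.0
        self.lag_total = 0.0

    def record(self, now, lag):
        self.lag_total += lag
        elapsed = now - self.window_start
        if elapsed >= WINDOW_SECONDS:
            if self.lag_total > THRESHOLD * elapsed:
                print("commitlog sync lagging: %.1fs over %.0fs" %
                      (self.lag_total, elapsed))
            self.window_start, self.lag_total = now, 0.0

r = LagReporter()
r.record(10.0, 1.0)    # within the window: accumulate silently
r.record(400.0, 25.0)  # 26s of lag over 400s (6.5%) -> reported
{code}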

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Benedict
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, TestEA.java, latency.svg, oprate.svg, 
 parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6218) Repair should allow repairing particular data centers to reduce WAN usage

2013-12-03 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-6218:
-

Summary: Repair should allow repairing particular data centers to reduce 
WAN usage  (was: Reduce WAN traffic while doing repairs)

 Repair should allow repairing particular data centers to reduce WAN usage
 -

 Key: CASSANDRA-6218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: sankalp kohli
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4

 Attachments: trunk-6218-v2.txt, trunk-6218-v3.patch, trunk-6218.txt


 The way we send out data that does not match over WAN can be improved. 
 Example: Say there are four nodes(A,B,C,D) which are replica of a range we 
 are repairing. A, B is in DC1 and C,D is in DC2. If A does not have the data 
 which other replicas have, then we will have following streams
 1) A to B and back
 2) A to C and back(Goes over WAN)
 3) A to D and back(Goes over WAN)
 One of the ways of doing it to reduce WAN traffic is this.
 1) Repair A and B only with each other and C and D with each other starting 
 at same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C, the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C(C and D 
 behaves like a proxy to the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50% and 
 even more for higher replication factors. 
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6440) Repair should allow repairing particular endpoints to reduce WAN usage.

2013-12-03 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-6440:


 Summary: Repair should allow repairing particular endpoints to 
reduce WAN usage. 
 Key: CASSANDRA-6440
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
 Project: Cassandra
  Issue Type: New Feature
Reporter: sankalp kohli
Priority: Minor


The way we send out data that does not match over WAN can be improved. 
Example: Say there are four nodes(A,B,C,D) which are replica of a range we are 
repairing. A, B is in DC1 and C,D is in DC2. If A does not have the data which 
other replicas have, then we will have following streams
1) A to B and back
2) A to C and back(Goes over WAN)
3) A to D and back(Goes over WAN)
One of the ways of doing it to reduce WAN traffic is this.
1) Repair A and B only with each other and C and D with each other starting at 
same time t. 
2) Once these repairs have finished, A,B and C,D are in sync with respect to 
time t. 
3) Now run a repair between A and C, the streams which are exchanged as a 
result of the diff will also be streamed to B and D via A and C(C and D behaves 
like a proxy to the streams).
For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50% (see 
the sketch below) and even more for higher replication factors.

Another easy way to do this is to have the repair command take the nodes with 
which you want to repair. Then we can do something like this:
1) Run repair between (A and B) and (C and D)
2) Run repair between (A and C)
3) Run repair between (A and B) and (C and D)
But this will increase the traffic inside the DC, as we won't be doing the 
proxying.
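A back-of-envelope check of the 50% figure (an assumed stream-counting model, 
not repair internals: the coordinator otherwise streams with each remote 
replica directly, versus a single proxied cross-DC stream):

{code:title=WAN stream count (sketch)}
def wan_streams(remote_replicas, proxied):
    # proxied: only the two designated nodes talk across the WAN
    return 1 if proxied else remote_replicas

naive = wan_streams(2, proxied=False)   # A<->C and A<->D over WAN
proxy = wan_streams(2, proxied=True)    # only A<->C over WAN
print(1 - float(proxy) / naive)         # 0.5, i.e. 50% for DC1:2,DC2:2
{code}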



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6440) Repair should allow repairing particular endpoints to reduce WAN usage.

2013-12-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838110#comment-13838110
 ] 

Jonathan Ellis commented on CASSANDRA-6440:
---

(See discussion on CASSANDRA-6218.)

 Repair should allow repairing particular endpoints to reduce WAN usage. 
 

 Key: CASSANDRA-6440
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
 Project: Cassandra
  Issue Type: New Feature
Reporter: sankalp kohli
Priority: Minor

 The way we send out data that does not match over the WAN can be improved. 
 Example: say there are four nodes (A, B, C, D) which are replicas of a range we 
 are repairing. A and B are in DC1, and C and D are in DC2. If A does not have 
 the data which the other replicas have, then we will have the following streams:
 1) A to B and back
 2) A to C and back (goes over WAN)
 3) A to D and back (goes over WAN)
 One way to reduce WAN traffic is this:
 1) Repair A and B only with each other, and C and D with each other, starting 
 at the same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C; the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C (C and D 
 behave like proxies for the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50%, and 
 even more for higher replication factors.
 Another easy way to do this is to have the repair command take the nodes you 
 want to repair with. Then we can do something like this:
 1) Run repair between (A and B) and (C and D)
 2) Run repair between (A and C)
 3) Run repair between (A and B) and (C and D)
 But this will increase the traffic inside the DC, as we won't be doing the 
 proxying.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5864) Scrub should discard the columns from CFMetaData.droppedColumns map (if they are old enough)

2013-12-03 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838140#comment-13838140
 ] 

Sylvain Lebresne commented on CASSANDRA-5864:
-

Are we sure it doesn't do it already? My reading of the code is that dropped 
columns are removed by CFS.removeDeleted(), and this regardless of gcBefore 
(as it should be). Since LazilyCompactedRow calls CFS.removeDeleted() and scrub 
uses LazilyCompactedRow... am I missing something?

As for whether scrub should or should not drop them, I'd go with "it doesn't 
really matter", so let's do whatever is easier.

 Scrub should discard the columns from CFMetaData.droppedColumns map (if they 
 are old enough)
 

 Key: CASSANDRA-5864
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5864
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Tyler Hobbs
Priority: Minor
  Labels: scrub

 CASSANDRA-3919 restored ALTER TABLE DROP support in CQL3 and it would be nice 
 to make scrub dropped-columns-aware.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6441) Explore merging memtables directly with L1

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6441:
--

Fix Version/s: 3.0

 Explore merging memtables directly with L1
 --

 Key: CASSANDRA-6441
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6441
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
  Labels: compaction
 Fix For: 3.0


 Currently, memtables flush to L0 and are then compacted with L1, so you 
 automatically have 100% write amplification for unique cells right off the 
 bat.
 http://dl.acm.org/citation.cfm?id=2213862 suggests splitting the memtable 
 into pieces corresponding to the ranges of the sstables in L1 and turning the 
 flush + compact into a single write -- that is, we'd compact the data in 
 the L1 sstable with the corresponding data in the memtable.
 This would add some complexity around blocking memtable sections until the 
 corresponding L1 piece is no longer involved in its own compaction with L2, 
 and probably a panic dump to the old L0 behavior if we run low on memory.  
 But in theory it sounds like a promising optimization.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6441) Explore merging memtables directly with L1

2013-12-03 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-6441:
-

 Summary: Explore merging memtables directly with L1
 Key: CASSANDRA-6441
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6441
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor


Currently, memtables flush to L0 and are then compacted with L1, so you 
automatically have 100% write amplification for unique cells right off the bat.

http://dl.acm.org/citation.cfm?id=2213862 suggests splitting the memtable into 
pieces corresponding to the ranges of the sstables in L1 and turning the flush 
+ compact into a single write -- that is, we'd compact the data in the L1 
sstable with the corresponding data in the memtable.

This would add some complexity around blocking memtable sections until the 
corresponding L1 piece is no longer involved in its own compaction with L2, and 
probably a panic dump to the old L0 behavior if we run low on memory.  But in 
theory it sounds like a promising optimization.
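
To illustrate the splitting step, here is a minimal sketch under simplifying 
assumptions (plain longs stand in for decorated partition keys, and each L1 
sstable is represented only by its last key; all names are hypothetical):

{code}
import java.util.*;

class MemtableSplitSketch
{
    // memtableKeys: sorted partition keys currently in the memtable.
    // l1LastKeys:   sorted last key of each (non-overlapping) L1 sstable.
    // Returns one piece per L1 sstable, so that each piece can be merged with
    // exactly one L1 sstable in a single flush-plus-compact write.
    static List<List<Long>> splitByL1Ranges(SortedSet<Long> memtableKeys, long[] l1LastKeys)
    {
        List<List<Long>> pieces = new ArrayList<>();
        for (int i = 0; i < l1LastKeys.length; i++)
            pieces.add(new ArrayList<>());

        int sstable = 0;
        for (long key : memtableKeys)
        {
            // advance to the L1 sstable whose range covers this key
            while (sstable < l1LastKeys.length - 1 && key > l1LastKeys[sstable])
                sstable++;
            pieces.get(sstable).add(key);
        }
        return pieces;
    }
}
{code}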



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6442) CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking autocomplete

2013-12-03 Thread Russell Alexander Spitzer (JIRA)
Russell Alexander Spitzer created CASSANDRA-6442:


 Summary: CQLSH in CQL2 Mode (-2) lowercases keyspace after use 
statements breaking autocomplete
 Key: CASSANDRA-6442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6442
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Russell Alexander Spitzer
Priority: Minor


When running Cqlsh in cql2 mode, using a keyspace lowercases it. 

{code}
cqlsh> CREATE KEYSPACE MixedCase WITH strategy_class = 'SimpleStrategy' AND 
strategy_options:replication_factor = 1;
cqlsh> use MixedCase ;
cqlsh:mixedcase> 
{code}

This is slightly annoying since cqlsh cannot autocomplete correctly with the 
wrong keyspace name. 
{code}
cqlsh:mixedcase> CREATE TABLE test (a int PRIMARY KEY , b int ) ;
cqlsh:mixedcase> SELECT * FROM [TAB PRESSED]
HiveMetaStore.  MixedCase.      cfs.            cfs_archive.    cql3ks.
dse_security.   dse_system.     fun.            system.         system_traces.
testKS.
{code}
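
For what it's worth, in CQL3 mode the case can be preserved by double-quoting 
the identifier; a sketch of the workaround, using the keyspace from the report 
(whether CQL2 mode honors the same quoting is exactly what's at issue here):

{code}
cqlsh> use "MixedCase";
cqlsh:MixedCase> 
{code}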



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Russell Alexander Spitzer (JIRA)
Russell Alexander Spitzer created CASSANDRA-6443:


 Summary: CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
Reporter: Russell Alexander Spitzer


Attempting to create a table in CQLSH -2 will always result in 
SizeTieredCompaction regardless of specified option.

{code}
cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;

CREATE TABLE asciiCFLVL (
  'key' ascii PRIMARY KEY,
  asciiB ascii,
  asciiA ascii
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='LZ4Compressor';
{code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6442) CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking autocomplete

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6442:
-

Reproduced In: 2.0.2, 1.2.12
Since Version: 1.2.12

 CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking 
 autocomplete
 --

 Key: CASSANDRA-6442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6442
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Russell Alexander Spitzer
Priority: Minor

 When running Cqlsh in cql2 mode, using a keyspace lowercases it. 
 {code}
 cqlsh> CREATE KEYSPACE MixedCase WITH strategy_class = 'SimpleStrategy' AND 
 strategy_options:replication_factor = 1;
 cqlsh> use MixedCase ;
 cqlsh:mixedcase> 
 {code}
 This is slightly annoying since cqlsh cannot autocomplete correctly with the 
 wrong keyspace name. 
 {code}
 cqlsh:mixedcase> CREATE TABLE test (a int PRIMARY KEY , b int ) ;
 cqlsh:mixedcase> SELECT * FROM [TAB PRESSED]
 HiveMetaStore.  MixedCase.      cfs.            cfs_archive.    cql3ks.
 dse_security.   dse_system.     fun.            system.         system_traces.
 testKS.
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6442) CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking autocomplete

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6442:
-

Since Version: 1.2.11  (was: 1.2.12)

 CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking 
 autocomplete
 --

 Key: CASSANDRA-6442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6442
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Russell Alexander Spitzer
Priority: Minor

 When running Cqlsh in cql2 mode, using a keyspace lowercases it. 
 {code}
 cqlsh> CREATE KEYSPACE MixedCase WITH strategy_class = 'SimpleStrategy' AND 
 strategy_options:replication_factor = 1;
 cqlsh> use MixedCase ;
 cqlsh:mixedcase> 
 {code}
 This is slightly annoying since cqlsh cannot autocomplete correctly with the 
 wrong keyspace name. 
 {code}
 cqlsh:mixedcase> CREATE TABLE test (a int PRIMARY KEY , b int ) ;
 cqlsh:mixedcase> SELECT * FROM [TAB PRESSED]
 HiveMetaStore.  MixedCase.      cfs.            cfs_archive.    cql3ks.
 dse_security.   dse_system.     fun.            system.         system_traces.
 testKS.
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6442) CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking autocomplete

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6442:
-

Reproduced In: 2.0.2, 1.2.11  (was: 1.2.12, 2.0.2)

 CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking 
 autocomplete
 --

 Key: CASSANDRA-6442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6442
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Russell Alexander Spitzer
Priority: Minor

 When running Cqlsh in cql2 mode, using a keyspace lowercases it. 
 {code}
 cqlsh> CREATE KEYSPACE MixedCase WITH strategy_class = 'SimpleStrategy' AND 
 strategy_options:replication_factor = 1;
 cqlsh> use MixedCase ;
 cqlsh:mixedcase> 
 {code}
 This is slightly annoying since cqlsh cannot autocomplete correctly with the 
 wrong keyspace name. 
 {code}
 cqlsh:mixedcase> CREATE TABLE test (a int PRIMARY KEY , b int ) ;
 cqlsh:mixedcase> SELECT * FROM [TAB PRESSED]
 HiveMetaStore.  MixedCase.      cfs.            cfs_archive.    cql3ks.
 dse_security.   dse_system.     fun.            system.         system_traces.
 testKS.
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6443:
-

Since Version: 1.2.11  (was: 1.2.12)

 CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 --

 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
Reporter: Russell Alexander Spitzer

 Attempting to create a table in CQLSH -2 will always result in 
 SizeTieredCompaction regardless of specified option.
 {code}
 cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
 asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
 cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;
 CREATE TABLE asciiCFLVL (
   'key' ascii PRIMARY KEY,
   asciiB ascii,
   asciiA ascii
 ) WITH
   comment='' AND
   comparator=text AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write='true' AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='LZ4Compressor';
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6443:
-

Reproduced In: 2.0.2, 1.2.11  (was: 1.2.12, 2.0.2)

 CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 --

 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
Reporter: Russell Alexander Spitzer

 Attempting to create a table in CQLSH -2 will always result in 
 SizeTieredCompaction regardless of specified option.
 {code}
 cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
 asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
 cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;
 CREATE TABLE asciiCFLVL (
   'key' ascii PRIMARY KEY,
   asciiB ascii,
   asciiA ascii
 ) WITH
   comment='' AND
   comparator=text AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write='true' AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='LZ4Compressor';
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6442) CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking autocomplete

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-6442.
---

   Resolution: Won't Fix
Reproduced In: 2.0.2, 1.2.11  (was: 1.2.11, 2.0.2)

cql 2 mode is purely legacy code at this point; we're not expending effort on 
enhancements.

 CQLSH in CQL2 Mode (-2) lowercases keyspace after use statements breaking 
 autocomplete
 --

 Key: CASSANDRA-6442
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6442
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Russell Alexander Spitzer
Priority: Minor

 When running Cqlsh in cql2 mode, using a keyspace lowercases it. 
 {code}
 cqlsh> CREATE KEYSPACE MixedCase WITH strategy_class = 'SimpleStrategy' AND 
 strategy_options:replication_factor = 1;
 cqlsh> use MixedCase ;
 cqlsh:mixedcase> 
 {code}
 This is slightly annoying since cqlsh cannot autocomplete correctly with the 
 wrong keyspace name. 
 {code}
 cqlsh:mixedcase> CREATE TABLE test (a int PRIMARY KEY , b int ) ;
 cqlsh:mixedcase> SELECT * FROM [TAB PRESSED]
 HiveMetaStore.  MixedCase.      cfs.            cfs_archive.    cql3ks.
 dse_security.   dse_system.     fun.            system.         system_traces.
 testKS.
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6435) nodetool outputs xss and jamm errors in 1.2.12

2013-12-03 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838234#comment-13838234
 ] 

Mikhail Stepura commented on CASSANDRA-6435:


For reference: 
https://github.com/apache/cassandra/commit/3f66fbfc63c728778325e3be958019a0da1b47d5

 nodetool outputs xss and jamm errors in 1.2.12
 --

 Key: CASSANDRA-6435
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6435
 Project: Cassandra
  Issue Type: Bug
Reporter: Karl Mueller
Assignee: Brandon Williams
Priority: Minor

 Since 1.2.12, just running nodetool is producing this output. Probably this 
 is related to CASSANDRA-6273.
 it's unclear to me whether jamm is actually not being loaded, but clearly 
 nodetool should not be having this output, which is likely from 
 cassandra-env.sh
 [cassandra@dev-cass00 cassandra]$ /data2/cassandra/bin/nodetool ring
 xss =  -ea -javaagent:/data2/cassandra/bin/../lib/jamm-0.2.5.jar 
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms14G -Xmx14G -Xmn1G 
 -XX:+HeapDumpOnOutOfMemoryError -Xss256k
 Note: Ownership information does not include topology; for complete 
 information, specify a keyspace
 Datacenter: datacenter1
 ==
 Address      Rack    Status  State   Load        Owns    
 Token
 
 170141183460469231731687303715884105727
 10.93.15.10  rack1   Up Normal  123.82 GB   20.00%  
 34028236692093846346337460743176821145
 10.93.15.11  rack1   Up Normal  124 GB  20.00%  
 68056473384187692692674921486353642290
 10.93.15.12  rack1   Up Normal  123.97 GB   20.00%  
 102084710076281539039012382229530463436
 10.93.15.13  rack1   Up Normal  124.03 GB   20.00%  
 136112946768375385385349842972707284581
 10.93.15.14  rack1   Up Normal  123.93 GB   20.00%  
 170141183460469231731687303715884105727
 ERROR 16:20:01,408 Unable to initialize MemoryMeter (jamm not specified as 
 javaagent).  This means Cassandra will be unable to measure object sizes 
 accurately and may consequently OOM.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (CASSANDRA-5879) cqlsh shouldn't lower case keyspace or column family names

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer reassigned CASSANDRA-5879:


Assignee: Russell Alexander Spitzer

 cqlsh shouldn't lower case keyspace or column family names
 --

 Key: CASSANDRA-5879
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5879
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Richard Low
Assignee: Russell Alexander Spitzer
Priority: Minor

 Keyspace and column family names appear to be case sensitive.  But cqlsh 
 converts them to lower case.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-5879) cqlsh shouldn't lower case keyspace or column family names

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-5879:
-

Assignee: (was: Russell Alexander Spitzer)

 cqlsh shouldn't lower case keyspace or column family names
 --

 Key: CASSANDRA-5879
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5879
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Richard Low
Priority: Minor

 Keyspace and column family names appear to be case sensitive.  But cqlsh 
 converts them to lower case.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russell Alexander Spitzer updated CASSANDRA-6443:
-

Component/s: Tools
 Drivers (now out of tree)

 CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 --

 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
  Components: Drivers (now out of tree), Tools
Reporter: Russell Alexander Spitzer

 Attempting to create a table in CQLSH -2 will always result in 
 SizeTieredCompaction regardless of specified option.
 {code}
 cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
 asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
 cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;
 CREATE TABLE asciiCFLVL (
   'key' ascii PRIMARY KEY,
   asciiB ascii,
   asciiA ascii
 ) WITH
   comment='' AND
   comparator=text AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write='true' AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='LZ4Compressor';
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Russell Alexander Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838276#comment-13838276
 ] 

Russell Alexander Spitzer edited comment on CASSANDRA-6443 at 12/3/13 10:52 PM:


The actual bug seems to be in the python cql module. 

{code}
>>> import cql
>>> con=cql.connect('127.0.0.1')
>>> cur=con.cursor()
>>> cur.execute('use testKS')
True
>>> cur.execute("CREATE TABLE tb (a int PRIMARY KEY, b int) with compaction_strategy_class = 'LeveledCompactionStrategy'")
True
>>> quit()
{code}
In cqlsh
{code}
cqlsh> DESCRIBE table testKS.tb;

CREATE TABLE tb (
  a int PRIMARY KEY,
  b int
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
{code}


was (Author: rspitzer):
The actual bug seems to be in the python cql module. 

{code}
>>> import cql
>>> con=cql.connect('127.0.0.1')
>>> cur=con.cursor()
>>> cur.execute('use testKS')
True
>>> cur.execute("CREATE TABLE tb (a int PRIMARY KEY, b int) with compaction_strategy_class = 'MemoryOnlyStrategy'")
True
>>> quit()
{code}
In cqlsh
{code}
cqlsh> DESCRIBE table testKS.tb;

CREATE TABLE tb (
  a int PRIMARY KEY,
  b int
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
{code}

 CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 --

 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
  Components: Drivers (now out of tree), Tools
Reporter: Russell Alexander Spitzer

 Attempting to create a table in CQLSH -2 will always result in 
 SizeTieredCompaction regardless of specified option.
 {code}
 cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
 asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
 cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;
 CREATE TABLE asciiCFLVL (
   'key' ascii PRIMARY KEY,
   asciiB ascii,
   asciiA ascii
 ) WITH
   comment='' AND
   comparator=text AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write='true' AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='LZ4Compressor';
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-5201) Cassandra/Hadoop does not support current Hadoop releases

2013-12-03 Thread Benjamin Coverston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Coverston updated CASSANDRA-5201:
--

Attachment: hadoopCompat.patch

Poking around at other projects, this generally gets solved in one of two ways: 
ship two versions of the Hadoop integration (one compiled for the old API, and 
one compiled for the new), or use a little reflection to make things work 
across the board.

I'm attaching a patch that uses the hadoop-compat subproject of elephant-bird. 
This will allow us to compile a single binary and run with both the new and old 
context objects.

I've tested this patch with HDP 2.0 and Apache Hadoop 1.0.4, and it works fine 
with both (including Hive in DSE). With Pig I needed to compile our (optional) 
Pig dependency with:

bq. ant clean jar-withouthadoop -Dhadoopversion=23

That's only really needed if you're using one of the current versions of thrift 
with the new JobContext.
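
For the record, a minimal sketch of the reflection trick (illustrative only; the 
attached patch uses elephant-bird's hadoop-compat rather than this exact code). 
The IncompatibleClassChangeError comes from the linkage baked in at the call 
site, so resolving JobContext.getConfiguration() at runtime works whether 
JobContext is the old class or the new interface:

{code}
import java.lang.reflect.Method;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.JobContext;

class HadoopCompatSketch
{
    static Configuration getConfiguration(JobContext context)
    {
        try
        {
            // Reflection defers method resolution to runtime, so the same
            // binary runs against both the class and the interface version.
            Method m = context.getClass().getMethod("getConfiguration");
            return (Configuration) m.invoke(context);
        }
        catch (Exception e)
        {
            throw new RuntimeException("Unable to invoke getConfiguration", e);
        }
    }
}
{code}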


 Cassandra/Hadoop does not support current Hadoop releases
 -

 Key: CASSANDRA-5201
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5201
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.0
Reporter: Brian Jeltema
Assignee: Dave Brosius
 Attachments: 5201_a.txt, hadoopCompat.patch


 Using Hadoop 0.22.0 with Cassandra results in the stack trace below.
 It appears that version 0.21+ changed org.apache.hadoop.mapreduce.JobContext
 from a class to an interface.
 Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
 interface org.apache.hadoop.mapreduce.JobContext, but class was expected
   at 
 org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:103)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:357)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1045)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1042)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1042)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1062)
   at MyHadoopApp.run(MyHadoopApp.java:163)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
   at MyHadoopApp.main(MyHadoopApp.java:82)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:192)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6432) Calculate estimated Cql row count per token range

2013-12-03 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838301#comment-13838301
 ] 

Alex Liu commented on CASSANDRA-6432:
-

SSTableMetadata.estimatedColumnCount collects column counts per SSTable, but 
there are no column counts per key, so we can't use the current statistics to 
calculate the columns per token range.

The same column can be distributed across multiple sstables, so we would need 
to merge the columns to count the unique ones, which is not practical at scale. 

SELECT count(*) FROM cf scans all the rows, so it's not useful for big data.
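
A small sketch of why summing the per-sstable statistics over-counts (purely 
illustrative; plain string sets stand in for per-sstable column statistics):

{code}
import java.util.*;

class ColumnCountSketch
{
    // Each set holds the column names one sstable stores for the token range.
    static long naiveSum(List<Set<String>> sstables)
    {
        long sum = 0;
        for (Set<String> s : sstables)
            sum += s.size();       // a column present in N sstables counts N times
        return sum;
    }

    static long uniqueCount(List<Set<String>> sstables)
    {
        Set<String> merged = new HashSet<>();
        for (Set<String> s : sstables)
            merged.addAll(s);      // this merge is exactly what we can't afford
        return merged.size();
    }
}
{code}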

 Calculate estimated Cql row count per token range
 -

 Key: CASSANDRA-6432
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6432
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Alex Liu

 CASSANDRA-6311 uses the client side to calculate the actual CF row count for the 
 hadoop job. We need to fix it by using the CQL row count, which needs an 
 estimated CQL row count per token range.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-6364) There should be different disk_failure_policies for data and commit volumes or commit volume failure should always cause node exit

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6364:
--

  Component/s: Core
Fix Version/s: 2.0.4
 Assignee: Benedict  (was: Aleksey Yeschenko)

Would like to fix this in 2.0.x as well as 2.1 but I will settle for 2.1 if 
that's onerous.

 There should be different disk_failure_policies for data and commit volumes 
 or commit volume failure should always cause node exit
 --

 Key: CASSANDRA-6364
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6364
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: JBOD, single dedicated commit disk
Reporter: J. Ryan Earl
Assignee: Benedict
 Fix For: 2.0.4


 We're doing fault testing on a pre-production Cassandra cluster.  One of the 
 tests was to simulation failure of the commit volume/disk, which in our case 
 is on a dedicated disk.  We expected failure of the commit volume to be 
 handled somehow, but what we found was that no action was taken by Cassandra 
 when the commit volume failed.  We simulated this simply by pulling the 
 physical disk that backed the commit volume, which resulted in filesystem I/O 
 errors on the mount point.
 What then happened was that the Cassandra Heap filled up to the point that it 
 was spending 90% of its time doing garbage collection.  No errors were logged 
 in regards to the failed commit volume.  Gossip on other nodes in the cluster 
 eventually flagged the node as down.  Gossip on the local node showed itself 
 as up, and all other nodes as down.
 The most serious problem was that connections to the coordinator on this node 
 became very slow due to the on-going GC, as I assume uncommitted writes piled 
 up on the JVM heap.  What we believe should have happened is that Cassandra 
 should have caught the I/O error and exited with a useful log message, or 
 otherwise done some sort of useful cleanup.  Otherwise the node goes into a 
 sort of Zombie state, spending most of its time in GC, and thus slowing down 
 any transactions that happen to use the coordinator on said node.
 A limit on in-memory, unflushed writes before refusing requests may also 
 work.  Point being, something should be done to handle the commit volume 
 dying as doing nothing results in affecting the entire cluster.  I should 
 note, we are using: disk_failure_policy: best_effort



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5201) Cassandra/Hadoop does not support current Hadoop releases

2013-12-03 Thread Benjamin Coverston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838325#comment-13838325
 ] 

Benjamin Coverston commented on CASSANDRA-5201:
---

These changes also depend on CASSANDRA-6309 for anything to work.

 Cassandra/Hadoop does not support current Hadoop releases
 -

 Key: CASSANDRA-5201
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5201
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.0
Reporter: Brian Jeltema
Assignee: Dave Brosius
 Attachments: 5201_a.txt, hadoopCompat.patch


 Using Hadoop 0.22.0 with Cassandra results in the stack trace below.
 It appears that version 0.21+ changed org.apache.hadoop.mapreduce.JobContext
 from a class to an interface.
 Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
 interface org.apache.hadoop.mapreduce.JobContext, but class was expected
   at 
 org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:103)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:357)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1045)
   at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1042)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1153)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1042)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1062)
   at MyHadoopApp.run(MyHadoopApp.java:163)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
   at MyHadoopApp.main(MyHadoopApp.java:82)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:192)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (CASSANDRA-6443) CQLSH in CQL2 mode (-2) Cannot set compaction_strategy

2013-12-03 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko resolved CASSANDRA-6443.
--

   Resolution: Won't Fix
Reproduced In: 2.0.2, 1.2.11  (was: 1.2.11, 2.0.2)

CQL2 is dead. Use CQL3.

 CQLSH in CQL2 mode (-2) Cannot set compaction_strategy
 --

 Key: CASSANDRA-6443
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6443
 Project: Cassandra
  Issue Type: Bug
  Components: Drivers (now out of tree), Tools
Reporter: Russell Alexander Spitzer

 Attempting to create a table in CQLSH -2 will always result in 
 SizeTieredCompaction regardless of specified option.
 {code}
 cqlsh:fun> CREATE TABLE asciiCFLVL (  key ascii PRIMARY KEY,  asciiA ascii,  
 asciiB ascii  ) with compaction_strategy_class = 'LeveledCompactionStrategy' ;
 cqlsh:fun> DESCRIBE TABLE asciiCFLVL ;
 CREATE TABLE asciiCFLVL (
   'key' ascii PRIMARY KEY,
   asciiB ascii,
   asciiA ascii
 ) WITH
   comment='' AND
   comparator=text AND
   read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   default_validation=text AND
   min_compaction_threshold=4 AND
   max_compaction_threshold=32 AND
   replicate_on_write='true' AND
   compaction_strategy_class='SizeTieredCompactionStrategy' AND
   compression_parameters:sstable_compression='LZ4Compressor';
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-4476) Support 2ndary index queries with only non-EQ clauses

2013-12-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4476:
--

Assignee: (was: Marcus Eriksson)

[~slebresne] how much more complicated does CASSANDRA-4511 make this?

 Support 2ndary index queries with only non-EQ clauses
 -

 Key: CASSANDRA-4476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4476
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 2.1


 Currently, a query that uses 2ndary indexes must have at least one EQ clause 
 (on an indexed column). Given that indexed CFs are local (and use 
 LocalPartitioner that order the row by the type of the indexed column), we 
 should extend 2ndary indexes to allow querying indexed columns even when no 
 EQ clause is provided.
 As far as I can tell, the main problem to solve for this is to update 
 KeysSearcher.highestSelectivityPredicate(). I.e. how do we estimate the 
 selectivity of non-EQ clauses? I note however that if we can do that estimate 
 reasonably accurately, this might provide better performance even for index 
 queries that have both EQ and non-EQ clauses, because some non-EQ clauses may 
 have a much better selectivity than EQ ones (say you index both the user country 
 and birth date; for SELECT * FROM users WHERE country = 'US' AND birthdate > 
 'Jan 2009' AND birthdate < 'July 2009', you'd better use the birthdate index 
 first).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6410) gossip memory usage improvement

2013-12-03 Thread Quentin Conner (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838372#comment-13838372
 ] 

Quentin Conner commented on CASSANDRA-6410:
---

Chris, the Heap memory usage numbers are from a single node.  No aggregation 
across the cluster.

 gossip memory usage improvement
 ---

 Key: CASSANDRA-6410
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6410
 Project: Cassandra
  Issue Type: Improvement
Reporter: Quentin Conner
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 2.0.4

 Attachments: 6410-EnumMap.txt, gossip-intern.txt


 It looks to me that any given node will need ~2 MB of Java VM heap for each 
 other node in the ring.  This was observed with num_tokens=512 but still 
 seems excessive.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (CASSANDRA-6444) Have a nodetool command which emits true data size

2013-12-03 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-6444:


 Summary: Have a nodetool command which emits true data size
 Key: CASSANDRA-6444
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6444
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Priority: Minor


Sometimes we have an unbalanced cluster, and it is difficult to know whether 
some nodes are taking more space because updates have not yet been compacted 
away or because of the distribution of data.
So we need to know the true, fully compacted data size. 
We can compute this with a validation compaction, summing up the size of all rows. 
We should also emit such a sum during repair, when the Merkle tree is being 
generated. 
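
A minimal sketch of the idea, assuming a hypothetical iterator over merged 
(i.e. fully compacted) rows such as a validation compaction already produces:

{code}
interface MergedRowIterator
{
    boolean hasNext();
    long nextRowSerializedSize();  // size of the row after merging all versions
}

class TrueDataSizeSketch
{
    static long trueDataSize(MergedRowIterator rows)
    {
        long total = 0;
        while (rows.hasNext())
            total += rows.nextRowSerializedSize();  // overwrites already merged away
        return total;
    }
}
{code}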




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5839) Save repair data to system table

2013-12-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838626#comment-13838626
 ] 

Jason Brown commented on CASSANDRA-5839:


Also, feel free to make me the reviewer for any patches when you are ready.

 Save repair data to system table
 

 Key: CASSANDRA-5839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4


 As noted in CASSANDRA-2405, it would be useful to store repair results, 
 particularly with sub-range repair available (CASSANDRA-5280).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5839) Save repair data to system table

2013-12-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838625#comment-13838625
 ] 

Jason Brown commented on CASSANDRA-5839:


[~yarin] Not a problem if you want to knock it out. If you are interested, I 
got probably about half way through this ticket a few months ago, and I've 
pushed my work here: https://github.com/jasobrown/cassandra/tree/5839. Might be 
useful, maybe not, but at least it's there if you want to compare notes.

 Save repair data to system table
 

 Key: CASSANDRA-5839
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Jimmy Mårdell
Priority: Minor
 Fix For: 2.0.4


 As noted in CASSANDRA-2405, it would be useful to store repair results, 
 particularly with sub-range repair available (CASSANDRA-5280).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838627#comment-13838627
 ] 

Vijay commented on CASSANDRA-5549:
--

{quote}
Without switch lock, we won't have anything preventing writes coming through 
when we're over-burdened with memory use by memtables.
{quote}
I must be missing something: how does switching the RW lock to a kind of CAS 
operation change these semantics?
Are we talking about an additional requirement/enhancement for this ticket?

{quote}
 When we flush a memtable we release permits equal to the estimated size of 
each RM
{quote}
IMHO, that might not be good enough, since Java's memory overhead is not 
considered. And calculating the object size is not cheap either.
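
For clarity, the permit scheme under discussion could look roughly like this (a 
hypothetical sketch, not the actual patch):

{code}
import java.util.concurrent.Semaphore;

class MemtableMemoryThrottle
{
    private final Semaphore permits;

    MemtableMemoryThrottle(int memtableBudgetBytes)
    {
        permits = new Semaphore(memtableBudgetBytes);
    }

    void beforeWrite(int estimatedMutationBytes) throws InterruptedException
    {
        permits.acquire(estimatedMutationBytes);  // blocks writers once the budget is spent
    }

    void afterFlush(int flushedBytes)
    {
        permits.release(flushedBytes);  // note: the estimate ignores JVM object overhead
    }
}
{code}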

 Remove Table.switchLock
 ---

 Key: CASSANDRA-5549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay
  Labels: performance
 Fix For: 2.1

 Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png


 As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
 path.  ReentrantReadWriteLock is not lightweight, even if there is no 
 contention per se between readers and writers of the lock (in Cassandra, 
 memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6413) Saved KeyCache prints success to log; but no file present

2013-12-03 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838642#comment-13838642
 ] 

Mikhail Stepura commented on CASSANDRA-6413:


Silly me :) The bug is so obvious, but only manifests itself if both Key cache 
AND row cache are enabled

{code:title=org.apache.cassandra.cache.AutoSavingCache.java|borderStyle=solid}
private void deleteOldCacheFiles()
{
    File savedCachesDir = new File(DatabaseDescriptor.getSavedCachesLocation());

    if (savedCachesDir.exists() && savedCachesDir.isDirectory())
    {
        for (File file : savedCachesDir.listFiles())
        {
            if (file.isFile() && file.getName().endsWith(cacheType.toString()))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }

            if (file.isFile() && file.getName().endsWith(CURRENT_VERSION + ".db"))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }
        }
    }
}
{code}

So, each cache deletes FILES FROM ALL CACHES from the {{saved_caches}} directory 
and then happily writes its own single file.
The last save wins.
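
The fix direction would be to match only this cache's own files; a hypothetical 
sketch of such a filter (not the exact code of any patch):

{code}
import java.io.File;

class OwnCacheFilesOnly
{
    // Keep only files that belong to this cache type AND carry the version
    // suffix, instead of deleting any file that matches either condition.
    static boolean belongsToThisCache(File file, String cacheType, String currentVersion)
    {
        String name = file.getName();
        return file.isFile()
            && name.contains(cacheType)
            && name.endsWith(currentVersion + ".db");
    }
}
{code}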

 Saved KeyCache prints success to log; but no file present
 -

 Key: CASSANDRA-6413
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6413
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.11
Reporter: Chris Burroughs
Assignee: Mikhail Stepura

 Cluster has a single keyspace with 3 CFs.  All used to have ROWS_ONLY, two 
 were switched to KEYS_ONLY about 2 days ago.  Row cache continues to save 
 fine, but there is no saved key cache file present on any node in the cluster.
 {noformat}
 6925: INFO [CompactionExecutor:12] 2013-11-27 10:12:02,284 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 118 ms
 6941:DEBUG [CompactionExecutor:14] 2013-11-27 10:17:02,163 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 6942: INFO [CompactionExecutor:14] 2013-11-27 10:17:02,310 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 146 ms
 8745:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,140 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8746: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 143 ms
 8747:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 8748: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 289) Saved KeyCache (21181 items) in 342 ms
 8749:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8750: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 134 ms
 8751:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8752: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 8753:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8754: INFO [CompactionExecutor:6] 2013-11-27 10:37:26,026 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 9915:DEBUG [CompactionExecutor:18] 2013-11-27 10:42:01,851 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 9916: INFO [CompactionExecutor:18] 2013-11-27 10:42:02,185 
 AutoSavingCache.java (line 289) Saved KeyCache (22067 items) in 334 ms
 9917:DEBUG [CompactionExecutor:17] 2013-11-27 10:42:02,279 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 9918: INFO [CompactionExecutor:17] 2013-11-27 10:42:02,411 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 131 ms
 {noformat}
 {noformat}
 $ ll ~/shared/saved_caches/
 total 3472
 -rw-rw-r-- 1 cassandra cassandra 3551608 Nov 27 10:42 Foo-Bar-RowCache-b.db
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6413) Saved KeyCache prints success to log; but no file present

2013-12-03 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838642#comment-13838642
 ] 

Mikhail Stepura edited comment on CASSANDRA-6413 at 12/4/13 6:29 AM:
-

Silly me :) The bug is so obvious, but only manifests itself if both Key cache 
AND row cache are enabled

{code:title=org.apache.cassandra.cache.AutoSavingCache.java|borderStyle=solid}
private void deleteOldCacheFiles()
{
    File savedCachesDir = new File(DatabaseDescriptor.getSavedCachesLocation());

    if (savedCachesDir.exists() && savedCachesDir.isDirectory())
    {
        for (File file : savedCachesDir.listFiles())
        {
            if (file.isFile() && file.getName().endsWith(cacheType.toString()))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }

            if (file.isFile() && file.getName().endsWith(CURRENT_VERSION + ".db"))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }
        }
    }
}
{code}

So, each cache deletes FILES FROM ALL CACHES from the {{saved_caches}} directory 
and then happily writes its files.
The last save wins.


was (Author: mishail):
Silly me :) The bug is so obvious, but only manifests itself if both Key cache 
AND row cache are enabled

{code:title=org.apache.cassandra.cache.AutoSavingCache.java|borderStyle=solid}
private void deleteOldCacheFiles()
{
    File savedCachesDir = new File(DatabaseDescriptor.getSavedCachesLocation());

    if (savedCachesDir.exists() && savedCachesDir.isDirectory())
    {
        for (File file : savedCachesDir.listFiles())
        {
            if (file.isFile() && file.getName().endsWith(cacheType.toString()))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }

            if (file.isFile() && file.getName().endsWith(CURRENT_VERSION + ".db"))
            {
                if (!file.delete())
                    logger.warn("Failed to delete {}", file.getAbsolutePath());
            }
        }
    }
}
{code}

So, each cache deletes FILES FROM ALL CACHES from the {{saved_caches}} directory 
and then happily writes its own single file.
The last save wins.

 Saved KeyCache prints success to log; but no file present
 -

 Key: CASSANDRA-6413
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6413
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.11
Reporter: Chris Burroughs
Assignee: Mikhail Stepura

 Cluster has a single keyspace with 3 CFs.  All used to have ROWS_ONLY, two 
 were switched to KEYS_ONLY about 2 days ago.  Row cache continues to save 
 fine, but there is no saved key cache file present on any node in the cluster.
 {noformat}
 6925: INFO [CompactionExecutor:12] 2013-11-27 10:12:02,284 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 118 ms
 6941:DEBUG [CompactionExecutor:14] 2013-11-27 10:17:02,163 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 6942: INFO [CompactionExecutor:14] 2013-11-27 10:17:02,310 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 146 ms
 8745:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,140 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8746: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 143 ms
 8747:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 8748: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 289) Saved KeyCache (21181 items) in 342 ms
 8749:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8750: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 134 ms
 8751:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8752: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 8753:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8754: INFO [CompactionExecutor:6] 2013-11-27 

[jira] [Updated] (CASSANDRA-6413) Saved KeyCache prints success to log; but no file present

2013-12-03 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura updated CASSANDRA-6413:
---

Attachment: CASSANDRA-1.2-6413.patch

Patch: each cache should delete only its own files

 Saved KeyCache prints success to log; but no file present
 -

 Key: CASSANDRA-6413
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6413
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.11
Reporter: Chris Burroughs
Assignee: Mikhail Stepura
 Attachments: CASSANDRA-1.2-6413.patch


 Cluster has a single keyspace with 3 CFs.  All used to have ROWS_ONLY, two 
 were switched to KEYS_ONLY about 2 days ago.  Row cache continues to save 
 fine, but there is no saved key cache file present on any node in the cluster.
 {noformat}
 6925: INFO [CompactionExecutor:12] 2013-11-27 10:12:02,284 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 118 ms
 6941:DEBUG [CompactionExecutor:14] 2013-11-27 10:17:02,163 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 6942: INFO [CompactionExecutor:14] 2013-11-27 10:17:02,310 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 146 ms
 8745:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,140 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8746: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 143 ms
 8747:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 8748: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 289) Saved KeyCache (21181 items) in 342 ms
 8749:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8750: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 134 ms
 8751:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8752: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 8753:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8754: INFO [CompactionExecutor:6] 2013-11-27 10:37:26,026 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 9915:DEBUG [CompactionExecutor:18] 2013-11-27 10:42:01,851 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 9916: INFO [CompactionExecutor:18] 2013-11-27 10:42:02,185 
 AutoSavingCache.java (line 289) Saved KeyCache (22067 items) in 334 ms
 9917:DEBUG [CompactionExecutor:17] 2013-11-27 10:42:02,279 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 9918: INFO [CompactionExecutor:17] 2013-11-27 10:42:02,411 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 131 ms
 {noformat}
 {noformat}
 $ ll ~/shared/saved_caches/
 total 3472
 -rw-rw-r-- 1 cassandra cassandra 3551608 Nov 27 10:42 Foo-Bar-RowCache-b.db
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6413) Saved KeyCache prints success to log; but no file present

2013-12-03 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838670#comment-13838670
 ] 

Mikhail Stepura commented on CASSANDRA-6413:


Looks like it was introduced in 
https://github.com/apache/cassandra/commit/cfe585c2c420c6e8445eb4c3309b09db8cf134ac
 for CASSANDRA-3762

 Saved KeyCache prints success to log; but no file present
 -

 Key: CASSANDRA-6413
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6413
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.11
Reporter: Chris Burroughs
Assignee: Mikhail Stepura
 Attachments: CASSANDRA-1.2-6413.patch


 Cluster has a single keyspace with 3 CFs.  All used to have ROWS_ONLY, two 
 were switched to KEYS_ONLY about 2 days ago.  Row cache continues to save 
 fine, but there is no saved key cache file present on any node in the cluster.
 {noformat}
 6925: INFO [CompactionExecutor:12] 2013-11-27 10:12:02,284 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 118 ms
 6941:DEBUG [CompactionExecutor:14] 2013-11-27 10:17:02,163 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 6942: INFO [CompactionExecutor:14] 2013-11-27 10:17:02,310 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 146 ms
 8745:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,140 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8746: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 143 ms
 8747:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,283 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 8748: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 289) Saved KeyCache (21181 items) in 342 ms
 8749:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,625 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8750: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 134 ms
 8751:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,759 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8752: INFO [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 8753:DEBUG [CompactionExecutor:6] 2013-11-27 10:37:25,893 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 8754: INFO [CompactionExecutor:6] 2013-11-27 10:37:26,026 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 133 ms
 9915:DEBUG [CompactionExecutor:18] 2013-11-27 10:42:01,851 
 AutoSavingCache.java (line 233) Deleting old KeyCache files.
 9916: INFO [CompactionExecutor:18] 2013-11-27 10:42:02,185 
 AutoSavingCache.java (line 289) Saved KeyCache (22067 items) in 334 ms
 9917:DEBUG [CompactionExecutor:17] 2013-11-27 10:42:02,279 
 AutoSavingCache.java (line 233) Deleting old RowCache files.
 9918: INFO [CompactionExecutor:17] 2013-11-27 10:42:02,411 
 AutoSavingCache.java (line 289) Saved RowCache (5 items) in 131 ms
 {noformat}
 {noformat}
 $ ll ~/shared/saved_caches/
 total 3472
 -rw-rw-r-- 1 cassandra cassandra 3551608 Nov 27 10:42 Foo-Bar-RowCache-b.db
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)