[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986466#comment-13986466
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

bq. we can't use prepared statements with the dtests right now (as far as I can 
tell)

We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statements is deeply lacking, but it's possible to have some).

On the more general question of where tests should go, we could have a debate I 
suppose (but definitely not here), but frankly, I don't see many downsides to 
having the tests in the dtests, and since we have tons of them there already, 
I'd much rather not waste precious time changing for the sake of change. But 
quickly, the reasons why I think dtests are really not that bad here:
# it doesn't get a whole lot more end-to-end than the CQL tests imo.
# dtests feel to me a lot more readable and easier to work with. Mainly 
because for that kind of test, Python is just more comfortable/quicker to get 
things done.
# there *are* a few of the CQL dtests where we do want a distributed setup, 
like the CAS tests. Of course we could leave those in the dtests and move the 
rest into the unit test suite, but keeping all CQL tests in the same place 
just feels simpler (you don't duplicate all those small utility functions 
that you invariably need to make tests easier).
# I work with unit tests and dtests daily, and it honestly is not at all my 
experience that working with unit tests is a lot faster. Quite the contrary, 
in fact. I'm willing to admit that one may be more comfortable with one suite 
or the other, but I'll fight to the death the notion that unit tests are a 
*lot* faster to work with as an absolute truth.

I'll note the CQL dtests are not perfect. They could use some reorganization, 
and we could speed them up dramatically by not spinning up a cluster on every 
test. That said, fixing those issues is likely simpler than migrating all the 
existing tests back to the unit tests.

All this said, to focus back on this issue, I'd rather keep CQL tests in the 
dtests for now (even if we do start a debate on changing that on the dev 
list), but I won't block the issue if that's not the case. That remark was in 
the "minor comments" category.

 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.8


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.
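 The intended selection semantics can be sketched in plain Python (the table
 data comes from the example above; the filter itself is an illustration of
 the requested behavior, not Cassandra's implementation):

```python
# Rows of the example table, as (k, c1, c2) tuples, in clustering order.
rows = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]

def select_in(rows, k, clustering_keys):
    """Emulate: SELECT * FROM test WHERE k = <k> AND (c1, c2) IN <keys>.

    Matching rows are returned in clustering order within the partition,
    which is what a single partition-level read would produce.
    """
    wanted = set(clustering_keys)
    return [r for r in rows if r[0] == k and (r[1], r[2]) in wanted]

print(select_in(rows, 0, [(0, 0), (1, 1)]))
# [(0, 0, 0), (0, 1, 1)]
```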



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Fix regression from CASSANDRA-6855

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 68961a6a9 -> 233761e53


Fix regression from CASSANDRA-6855


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/233761e5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/233761e5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/233761e5

Branch: refs/heads/cassandra-2.1
Commit: 233761e53988301222dfaa590dc7fd8ee396c9b4
Parents: 68961a6
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 11:37:36 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 11:37:36 2014 +0200

--
 src/java/org/apache/cassandra/cql3/QueryOptions.java  |  4 +---
 .../cassandra/serializers/CollectionSerializer.java   | 10 ++
 2 files changed, 7 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/233761e5/src/java/org/apache/cassandra/cql3/QueryOptions.java
--
diff --git a/src/java/org/apache/cassandra/cql3/QueryOptions.java 
b/src/java/org/apache/cassandra/cql3/QueryOptions.java
index 12accaf..5801d55 100644
--- a/src/java/org/apache/cassandra/cql3/QueryOptions.java
+++ b/src/java/org/apache/cassandra/cql3/QueryOptions.java
@@ -58,7 +58,7 @@ public abstract class QueryOptions
 
     public static QueryOptions forInternalCalls(ConsistencyLevel consistency, List<ByteBuffer> values)
     {
-        return new DefaultQueryOptions(consistency, values, false, SpecificOptions.DEFAULT, 0);
+        return new DefaultQueryOptions(consistency, values, false, SpecificOptions.DEFAULT, 3);
     }
 
 public static QueryOptions fromPreV3Batch(ConsistencyLevel consistency)
@@ -123,8 +123,6 @@ public abstract class QueryOptions
 
     private final SpecificOptions options;

-    // The protocol version of incoming queries. This is set during deserializaion and will be 0
-    // if the QueryOptions does not come from a user message (or come from thrift).
     private final transient int protocolVersion;

     DefaultQueryOptions(ConsistencyLevel consistency, List<ByteBuffer> values, boolean skipMetadata, SpecificOptions options, int protocolVersion)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/233761e5/src/java/org/apache/cassandra/serializers/CollectionSerializer.java
--
diff --git 
a/src/java/org/apache/cassandra/serializers/CollectionSerializer.java 
b/src/java/org/apache/cassandra/serializers/CollectionSerializer.java
index 0e16fda..43b04f3 100644
--- a/src/java/org/apache/cassandra/serializers/CollectionSerializer.java
+++ b/src/java/org/apache/cassandra/serializers/CollectionSerializer.java
@@ -38,15 +38,17 @@ public abstract class CollectionSerializer<T> implements TypeSerializer<T>
     public ByteBuffer serialize(T value)
     {
         List<ByteBuffer> values = serializeValues(value);
-        // The only case we serialize/deserialize collections internally (i.e. not for the protocol sake),
-        // is when collections are in UDT values. There, we use the protocol 3 version since it's more flexible.
+        // See deserialize() for why using the protocol v3 variant is the right thing to do.
         return pack(values, getElementCount(value), 3);
     }
 
     public T deserialize(ByteBuffer bytes)
     {
-        // The only case we serialize/deserialize collections internally (i.e. not for the protocol sake),
-        // is when collections are in UDT values. There, we use the protocol 3 version since it's more flexible.
+        // The only cases we serialize/deserialize collections internally (i.e. not for the protocol sake),
+        // is:
+        //  1) when collections are in UDT values
+        //  2) for internal calls.
+        // In both case, using the protocol 3 version variant is the right thing to do.
         return deserializeForNativeProtocol(bytes, 3);
     }
 



[2/2] git commit: Merge branch 'cassandra-2.1' into trunk

2014-05-01 Thread slebresne
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c0be3418
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c0be3418
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c0be3418

Branch: refs/heads/trunk
Commit: c0be34182b0051fd9881d268244408c33cdcb846
Parents: e78b002 233761e
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 11:46:29 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 11:46:29 2014 +0200

--
 src/java/org/apache/cassandra/cql3/QueryOptions.java  |  4 +---
 .../cassandra/serializers/CollectionSerializer.java   | 10 ++
 2 files changed, 7 insertions(+), 7 deletions(-)
--




[1/2] git commit: Fix regression from CASSANDRA-6855

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/trunk e78b002b1 -> c0be34182


Fix regression from CASSANDRA-6855


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/233761e5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/233761e5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/233761e5

Branch: refs/heads/trunk
Commit: 233761e53988301222dfaa590dc7fd8ee396c9b4
Parents: 68961a6
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 11:37:36 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 11:37:36 2014 +0200

--
 src/java/org/apache/cassandra/cql3/QueryOptions.java  |  4 +---
 .../cassandra/serializers/CollectionSerializer.java   | 10 ++
 2 files changed, 7 insertions(+), 7 deletions(-)
--


 



[jira] [Commented] (CASSANDRA-7128) Switch NBHS for CHS on read path

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986468#comment-13986468
 ] 

Benedict commented on CASSANDRA-7128:
-

Actually, this points to the fact we should probably update NBHM, separately: 
[https://github.com/boundary/high-scale-lib/commit/a21a81a9293013233784566bc0e5e83f94d6408a]

 Switch NBHS for CHS on read path 
 -

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses an NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates an 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 An NBHS is most certainly overkill for such a short-lived object, and it 
 turns out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Commented] (CASSANDRA-6855) Native protocol V3

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986467#comment-13986467
 ] 

Sylvain Lebresne commented on CASSANDRA-6855:
-

bq.  introduced a large number of errors in the auth_test dtest (though fyi I 
still have one auth test failure on my box, for grant_revoke_cleanup_test, but 
that looks unrelated to this issue).

Yes, my bad, that was a small oversight (which was not auth-related per se, so 
it might have broken a few other tests too, btw). Anyway, I ninja-fixed it in 
commit 233761e.

And as you said, I did commit that but forgot to close the issue, so doing it 
now. Thanks.

 Native protocol V3
 --

 Key: CASSANDRA-6855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6855
 Project: Cassandra
  Issue Type: New Feature
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 2.1 beta2

 Attachments: auth_test_dtest-2.1~git9872b74.node1.log, 
 auth_test_dtest-2.1~git9872b74.txt


 I think we need a V3 of the protocol for 2.1. The things that this 
 could/should include are:
 # Adding an optional Serial CL for protocol batches (like we have for QUERY 
 and EXECUTE). It was an oversight of V2 not to add it, and now that we can 
 batch conditional updates, it's definitely missing.
 # Proper type codes for UDT. This is not *strictly* needed to be able to 
 support UDT, since currently a UDT will be sent as a custom type with its 
 full class name + arguments. But parsing that is no fun nor convenient for 
 clients. It's also not particularly space efficient (though that's probably 
 not a huge concern since with prepared statements you can avoid sending the 
 ResultSet metadata every time).
 # Serialization format for collections. Currently the serialization format 
 only allows for 65K elements, each of 65K bytes in size at most. While 
 collections are not meant to store large amounts of data, having the 
 limitation in the protocol serialization format is the wrong way to deal with 
 that. Concretely, the current workaround for CASSANDRA-5428 is ugly. I'll 
 note that the current serialization format is also an obstacle to supporting 
 null inside collections (whether or not we want to support null there is a 
 good question, but here again I'm not sure being limited by the serialization 
 format is a good idea).
 # CASSANDRA-6178: I continue to believe that in many cases it makes somewhat 
 more sense to have the default timestamp provided by the client (this is a 
 necessary condition for true idempotent retries in particular). I'm 
 absolutely fine making that optional and leaving server-side generated 
 timestamps by default, but since clients can already provide timestamps in 
 the query string anyway, I don't see a big deal in making it easier for 
 client drivers to control that without messing with the query string.
 # Optional names for values in QUERY messages: it has been brought to my 
 attention that while V2 allows sending a query string with values for a 
 one-roundtrip bind-and-execute, a driver can't really support named bind 
 markers with that feature properly without parsing the query. The proposition 
 is thus to make it (optionally) possible to ship the name of the marker each 
 value is supposed to be bound to.
 I think that 1) and 2) are enough reason to make a V3 (even if there is 
 disagreement on the rest, that is).
 3) is a little bit more involved tbh, but I do think having the current 
 limitations bolted into the protocol serialization format is wrong in the 
 long run, and it turns out that due to UDT we will start storing serialized 
 collections internally, so if we want to lift said limitation in the 
 serialization format, we should do it now and everywhere, as doing it 
 afterwards will be a lot more painful.
 4) and 5) are probably somewhat more minor, but at the same time, both are 
 completely optional (a driver won't have to support them if it doesn't want 
 to). They are really just about making things more flexible for client 
 drivers, and they are not particularly hard to support, so I don't see many 
 reasons not to include them.
 Last but not least, I know that some may find it wrong to do a new protocol 
 version with each major of C*, so let me state my view here: I fully agree 
 that we shouldn't make a habit of that in the long run, and that's 
 definitely *not* my objective. However, it would be silly to expect that we 
 could get everything right and forget nothing in the very first version. It 
 shouldn't be surprising that we'll have to burn a few versions (and there 
 might be a few more yet) before getting something more stable and complete, 
 and I think that delaying the addition of stuff that is useful to create 
 some fake notion of stability would be even more silly. On the bright 

[jira] [Commented] (CASSANDRA-7125) Fail to start by default if Commit Log fails to validate any messages

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986474#comment-13986474
 ] 

Sylvain Lebresne commented on CASSANDRA-7125:
-

bq. and introduce a cassandra.yaml option that permits overriding the default 
behaviour

<bikeshedding>I feel a command-line override flag (i.e. something like 
-Dcassandra.commitlog.skipbrokenentries=true) would be more appropriate 
here</bikeshedding>. But agreed on changing the default behavior otherwise. 

 Fail to start by default if Commit Log fails to validate any messages
 -

 Key: CASSANDRA-7125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7125
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
  Labels: correctness
 Fix For: 2.1.0


 Current behaviour can be pretty dangerous, and also has a tendency to mask 
 bugs during development. We should change the behaviour to default to failure 
 if anything unexpected happens, and introduce a cassandra.yaml option that 
 permits overriding the default behaviour.





[jira] [Commented] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986483#comment-13986483
 ] 

Sylvain Lebresne commented on CASSANDRA-7123:
-

Actually, saying "There is no guaranteed order for the application of 
operations" kind of suggests it's random, but it's not. By default all 
operations are applied with the same timestamp, which is both well defined 
and guaranteed. So I'd prefer something longer but more precise, along the 
lines of: "if no timestamp is manually specified on the operations, then it 
is guaranteed that all operations will apply with the same timestamp. Please 
note that this might not correspond to applying operations in the order they 
are declared in the BATCH. For instance, if the BATCH contains both an 
insertion and a deletion of the same row, then the deletion will have 
priority (even if it appears before the update/insert in the BATCH order), 
since deletions have priority over writes on a timestamp tie in Cassandra. 
You can force a particular operation ordering by using per-operation 
timestamps."
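The tie-breaking rule described above can be sketched in Python (a toy
reconciliation function; the dict shape and names are illustrative, not
Cassandra's actual cell model):

```python
def reconcile(a, b):
    """Toy version of the cell reconciliation rule described above:
    the higher timestamp wins; on a timestamp tie, a deletion (tombstone,
    represented here as value None) beats a live write."""
    if a['ts'] != b['ts']:
        return a if a['ts'] > b['ts'] else b
    if a['value'] is None:          # tombstone wins the tie
        return a
    if b['value'] is None:
        return b
    # Two live writes at the same timestamp: break the tie deterministically.
    return max(a, b, key=lambda cell: cell['value'])

insert = {'ts': 100, 'value': 'x'}
delete = {'ts': 100, 'value': None}
print(reconcile(insert, delete))
# {'ts': 100, 'value': None}  (the deletion wins, regardless of BATCH order)
```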

 BATCH documentation should be explicit about ordering guarantees
 

 Key: CASSANDRA-7123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7123
 Project: Cassandra
  Issue Type: Task
  Components: Documentation  website
Reporter: Tyler Hobbs
Assignee: Tyler Hobbs
Priority: Minor
 Attachments: 7123.txt


 In the CQL3 [batch statement 
 documentation](http://cassandra.apache.org/doc/cql3/CQL.html#batchStmt) we 
 don't mention that there are no ordering guarantees, which can lead to 
 somewhat surprising behavior (CASSANDRA-6291).
 We should also mention that you could specify timestamps in order to achieve 
 a particular ordering.





[jira] [Commented] (CASSANDRA-7125) Fail to start by default if Commit Log fails to validate any messages

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986491#comment-13986491
 ] 

Benedict commented on CASSANDRA-7125:
-

Agreed. For some reason I thought it would be annoying to pass VM parameters 
through, but it looks like we make it easy with the startup script.

 Fail to start by default if Commit Log fails to validate any messages
 -

 Key: CASSANDRA-7125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7125
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
  Labels: correctness
 Fix For: 2.1.0


 Current behaviour can be pretty dangerous, and also has a tendency to mask 
 bugs during development. We should change the behaviour to default to failure 
 if anything unexpected happens, and introduce a cassandra.yaml option that 
 permits overriding the default behaviour.





[jira] [Reopened] (CASSANDRA-7128) Switch NBHS for CHS on read path

2014-05-01 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict reopened CASSANDRA-7128:
-


 Switch NBHS for CHS on read path 
 -

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses a NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates a 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 A NBHS is most certainly overkill for such a short lived object.  And turns 
 out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Commented] (CASSANDRA-7128) Switch NBHS for CHS on read path

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986493#comment-13986493
 ] 

Benedict commented on CASSANDRA-7128:
-

Perhaps we should move to 
[https://github.com/boundary/high-scale-lib/commits/master]? Seems to be kept 
more up-to-date.

 Switch NBHS for CHS on read path 
 -

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses a NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates a 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 A NBHS is most certainly overkill for such a short lived object.  And turns 
 out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Commented] (CASSANDRA-6572) Workload recording / playback

2014-05-01 Thread Lyuben Todorov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986542#comment-13986542
 ] 

Lyuben Todorov commented on CASSANDRA-6572:
---

I added a commit 
([5494...ea44|https://github.com/lyubent/cassandra/commit/54945177af29cfa8f4019c0f84429afdc026ea44])
 where:
# base64 encoding / decoding is removed
# the logfile is now a byte[] that gets flushed to disk once full

Next up is working in prepared statements and logging client threading info.
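The buffering scheme in point 2 can be sketched as follows (the class name,
method names, and the capacity value are hypothetical, not from the linked
commit):

```python
class BufferedWorkloadLog:
    """Sketch of the approach described above: recorded queries are appended
    to an in-memory byte buffer and written to disk only once the buffer
    fills, so recording costs one sequential write per batch rather than one
    per query. Illustration only; names and capacity are hypothetical."""

    def __init__(self, path, capacity=64 * 1024):
        self.path = path
        self.capacity = capacity
        self.buf = bytearray()

    def record(self, query: bytes):
        self.buf += query + b'\n'
        if len(self.buf) >= self.capacity:
            self.flush()

    def flush(self):
        # One sequential append of the whole buffer, then reuse it.
        with open(self.path, 'ab') as f:
            f.write(self.buf)
        self.buf.clear()
```

A final flush() on shutdown would drain whatever remains below the capacity
threshold.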

 Workload recording / playback
 -

 Key: CASSANDRA-6572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
 Fix For: 2.1.1

 Attachments: 6572-trunk.diff


 Write sample mode gets us part way to testing new versions against a real 
 world workload, but we need an easy way to test the query side as well.





[jira] [Commented] (CASSANDRA-7128) Switch NBHS for CHS on read path

2014-05-01 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986547#comment-13986547
 ] 

T Jake Luciani commented on CASSANDRA-7128:
---

Yeah interestingly enough they may have just fixed this issue: 
https://github.com/boundary/high-scale-lib/commit/7baaeaef26e090601fa601862589d402000a3de2

 Switch NBHS for CHS on read path 
 -

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses a NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates a 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 A NBHS is most certainly overkill for such a short lived object.  And turns 
 out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Issue Comment Deleted] (CASSANDRA-7128) Upgrade NBHM

2014-05-01 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7128:
--

Comment: was deleted

(was: Yeah interestingly enough they may have just fixed this issue: 
https://github.com/boundary/high-scale-lib/commit/7baaeaef26e090601fa601862589d402000a3de2)

 Upgrade NBHM
 

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses a NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates a 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 A NBHS is most certainly overkill for such a short lived object.  And turns 
 out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Updated] (CASSANDRA-7128) Upgrade NBHM

2014-05-01 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-7128:
--

Summary: Upgrade NBHM  (was: Switch NBHS for CHS on read path )

 Upgrade NBHM
 

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses a NBHM for each read request.  
 Profiler flagged this as a bottleneck since the init() call creates a 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 A NBHS is most certainly overkill for such a short lived object.  And turns 
 out switching it to a CHS in my tests yields a ~5-10% read boost.





[jira] [Updated] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2014-05-01 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-4718:
---

Attachment: backpressure-stress.out.txt

I've run both of [~benedict]'s patches on my test cluster, and the results are 
quite hopeful (see backpressure-stress.out). Short story is anything we do for 
the native protocol will at least double the throughput and halve the latency, 
but that's the low end of the improvements.

Benedict's second patch is about 5-10% faster than his first, and both are 
slightly slower than my original patch. As I think it's important to have back 
pressure for any pool, at a minimum I think we should go with [~benedict]'s 
first patch. I like the ideas in the second patch, but would like another day 
to digest the implementation.

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Fix For: 2.1.0

 Attachments: 4718-v1.patch, PerThreadQueue.java, 
 backpressure-stress.out.txt, baq vs trunk.png, op costs of various 
 queues.ods, stress op rate with various queues.ods, v1-stress.out


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)
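The batch-dequeue idea described above can be sketched roughly as follows. This is a minimal illustration of the drainTo() pattern, not Cassandra's actual executor; the class and method names are invented:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

public class BatchingExecutorSketch {
    // Block for one task, then opportunistically drain up to maxBatch-1 more
    // in a single queue operation, run them all, and return how many ran.
    // This amortizes producer/consumer contention across a whole batch.
    static int drainAndRun(BlockingQueue<Runnable> queue, int maxBatch) {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        try {
            batch.add(queue.take());             // block until at least one task
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve interrupt status
            return 0;
        }
        queue.drainTo(batch, maxBatch - 1);      // grab the rest contention-free
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }
}
```

A consumer thread would call drainAndRun() in a loop; producers keep using plain offer()/put(), since, as the description notes, producers are single-task oriented anyway.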



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7128) Upgrade NBHM

2014-05-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986579#comment-13986579
 ] 

Jonathan Ellis commented on CASSANDRA-7128:
---

New ticket or ninja?

 Upgrade NBHM
 

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses an NBHM for each read request.  
 The profiler flagged this as a bottleneck since the init() call creates an 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 An NBHS is almost certainly overkill for such a short-lived object, and it 
 turns out switching it to a CHS in my tests yields a ~5-10% read boost.



--


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986584#comment-13986584
 ] 

Ryan McGuire commented on CASSANDRA-6875:
-

bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statement is deeply lacking, but it's possible to have some).

[~slebresne] Is that test possibly uncommitted? I actually don't have that test 
in my repo.

 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.8


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.



--


git commit: Support consistent range movements.

2014-05-01 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 233761e53 -> 9f60c55ba


Support consistent range movements.

patch by tjake; reviewed by thobbs for CASSANDRA-2434


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f60c55b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f60c55b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f60c55b

Branch: refs/heads/cassandra-2.1
Commit: 9f60c55ba42ff56aa58c3790b9c55924c4deedf4
Parents: 233761e
Author: T Jake Luciani j...@apache.org
Authored: Thu May 1 09:47:22 2014 -0400
Committer: T Jake Luciani j...@apache.org
Committed: Thu May 1 09:47:22 2014 -0400

--
 CHANGES.txt |  1 +
 NEWS.txt|  5 ++
 .../org/apache/cassandra/dht/BootStrapper.java  |  2 +-
 .../org/apache/cassandra/dht/RangeStreamer.java | 81 +++-
 .../cassandra/service/StorageService.java   | 44 ++-
 5 files changed, 127 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 34533cc..be72ad1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -56,6 +56,7 @@
  * Optimize cellname comparison (CASSANDRA-6934)
  * Native protocol v3 (CASSANDRA-6855)
  * Optimize Cell liveness checks and clean up Cell (CASSANDRA-7119)
+ * Support consistent range movements (CASSANDRA-2434)
 Merged from 2.0:
  * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
  * Set JMX RMI port to 7199 (CASSANDRA-7087)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 86c6f64..5d59460 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -30,6 +30,11 @@ New features
  repair session. Use nodetool repair -par -inc to use this feature.
  A tool to manually mark/unmark sstables as repaired is available in
  tools/bin/sstablerepairedset.
+   - Bootstrapping now ensures that range movements are consistent,
+ meaning the data for the new node is taken from the node that is no 
+ longer responsible for that range of keys.  
+ If you want the old behavior (perhaps due to a lost node) you can set 
+ the following property (-Dconsistent.rangemovement=false)
 
 Upgrading
 -

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/src/java/org/apache/cassandra/dht/BootStrapper.java
--
diff --git a/src/java/org/apache/cassandra/dht/BootStrapper.java 
b/src/java/org/apache/cassandra/dht/BootStrapper.java
index 343748b..cbbd100 100644
--- a/src/java/org/apache/cassandra/dht/BootStrapper.java
+++ b/src/java/org/apache/cassandra/dht/BootStrapper.java
@@ -63,7 +63,7 @@ public class BootStrapper
 if (logger.isDebugEnabled())
 logger.debug("Beginning bootstrap process");
 
-RangeStreamer streamer = new RangeStreamer(tokenMetadata, address, 
"Bootstrap");
+RangeStreamer streamer = new RangeStreamer(tokenMetadata, tokens, 
address, "Bootstrap");
 streamer.addSourceFilter(new 
RangeStreamer.FailureDetectorSourceFilter(FailureDetector.instance));
 
 for (String keyspaceName : Schema.instance.getNonSystemKeyspaces())

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/src/java/org/apache/cassandra/dht/RangeStreamer.java
--
diff --git a/src/java/org/apache/cassandra/dht/RangeStreamer.java 
b/src/java/org/apache/cassandra/dht/RangeStreamer.java
index 7ab39a4..2308d30 100644
--- a/src/java/org/apache/cassandra/dht/RangeStreamer.java
+++ b/src/java/org/apache/cassandra/dht/RangeStreamer.java
@@ -23,6 +23,8 @@ import java.util.*;
 import com.google.common.collect.ArrayListMultimap;
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.Multimap;
+import com.google.common.collect.Sets;
+import org.apache.cassandra.gms.EndpointState;
 import org.apache.commons.lang3.StringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -30,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.gms.FailureDetector;
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.gms.IFailureDetector;
 import org.apache.cassandra.locator.AbstractReplicationStrategy;
 import org.apache.cassandra.locator.IEndpointSnitch;
@@ -44,7 +47,8 @@ import org.apache.cassandra.utils.FBUtilities;
 public class RangeStreamer
 {
 private static final 

[jira] [Resolved] (CASSANDRA-2434) range movements can violate consistency

2014-05-01 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani resolved CASSANDRA-2434.
---

Resolution: Fixed

Committed C* code, I'll push to dtests now

 range movements can violate consistency
 ---

 Key: CASSANDRA-2434
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: T Jake Luciani
 Fix For: 2.1 beta2

 Attachments: 2434-3.patch.txt, 2434-testery.patch.txt


 My reading (a while ago) of the code indicates that there is no logic 
 involved during bootstrapping that avoids consistency level violations. If I 
 recall correctly it just grabs neighbors that are currently up.
 There are at least two issues I have with this behavior:
 * If I have a cluster where I have applications relying on QUORUM with RF=3, 
 and bootstrapping complete based on only one node, I have just violated the 
 supposedly guaranteed consistency semantics of the cluster.
 * Nodes can flap up and down at any time, so even if a human takes care to 
 look at which nodes are up and thinks about it carefully before 
 bootstrapping, there's no guarantee.
 A complication is that not only does it depend on use-case where this is an 
 issue (if all you ever do you do at CL.ONE, it's fine); even in a cluster 
 which is otherwise used for QUORUM operations you may wish to accept 
 less-than-quorum nodes during bootstrap in various emergency situations.
 A potential easy fix is to have bootstrap take an argument which is the 
 number of hosts to bootstrap from, or to assume QUORUM if none is given.
 (A related concern is bootstrapping across data centers. You may *want* to 
 bootstrap to a local node and then do a repair to avoid sending loads of data 
 across DC:s while still achieving consistency. Or even if you don't care 
 about the consistency issues, I don't think there is currently a way to 
 bootstrap from local nodes only.)
 Thoughts?



--


[jira] [Commented] (CASSANDRA-6572) Workload recording / playback

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986591#comment-13986591
 ] 

Benedict commented on CASSANDRA-6572:
-

Some comments on the in progress patch:

* Don't create a string with the header and convert it to bytes - convert the 
string to bytes and write a normal byte-encoded header with timestamp + length 
as a long. This will make encoding the prepared statement parameters much 
easier also
* Encapsulate queryQue and logPosition into a single object, and use an 
atomicinteger for the position - don't synchronise, just bump the position 
however much you need, then write to the owned range. On flush swap the object 
(use an AtomicReference to track the current buffer)
* On flush, append directly from the byte buffer, don't copy it. Create a 
FileOutputStream and call its appropriate write method with the range that is 
in use
* On the read path, you're now eagerly reading _all_ files which is likely to 
blow up the heap; at least create an Iterator that only reads a whole file at 
once (preferably read a chunk of a file at a time, with a BufferedInputStream)
* On replay timing we want to target hitting the same delta from epoch for 
running the query, not the delta from the prior query - this should help 
prevent massive timing drifts
* Query frequency can be an int rather than an Integer to avoid unboxing
* I think it would be nice if we checked the actual CFMetaData for the 
keyspaces we're modifying in the CQLStatement, rather than doing a find within 
the whole string, but it's not too big a deal
* atomicCounterLock needs to be removed
* As a general rule, never copy array contents with a loop - always use 
System.arraycopy
* Still need to log the thread + query id as Jonathan mentioned
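The second bullet's claim-then-write idea might look roughly like this. It is an illustrative sketch under stated assumptions, not the code of the patch under review; the class and field names are invented:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

public class LogBuffer {
    final ByteBuffer buffer;
    final AtomicInteger position = new AtomicInteger();

    LogBuffer(int capacity) { buffer = ByteBuffer.allocate(capacity); }

    // Bump the position atomically to claim a range, then write into the
    // owned slice with no synchronization. Returns the start offset of the
    // claimed range, or -1 if the buffer is full (the caller would then swap
    // in a fresh buffer, e.g. via an AtomicReference, and flush this one).
    int append(byte[] record) {
        int start = position.getAndAdd(record.length);
        if (start + record.length > buffer.capacity())
            return -1;
        ByteBuffer slice = buffer.duplicate();  // independent position/limit
        slice.position(start);
        slice.put(record);
        return start;
    }
}
```

On flush, the flusher writes the buffer's used range straight to a FileOutputStream without copying, as the third bullet suggests.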



 Workload recording / playback
 -

 Key: CASSANDRA-6572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
 Fix For: 2.1.1

 Attachments: 6572-trunk.diff


 Write sample mode gets us part way to testing new versions against a real 
 world workload, but we need an easy way to test the query side as well.



--


[jira] [Comment Edited] (CASSANDRA-6572) Workload recording / playback

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986591#comment-13986591
 ] 

Benedict edited comment on CASSANDRA-6572 at 5/1/14 1:57 PM:
-

Some comments on the in progress patch:

* Don't create a string with the header and convert it to bytes - convert the 
string to bytes and write a normal byte-encoded header with timestamp + length 
as longs. This will make encoding the prepared statement parameters much easier 
also
* Encapsulate queryQue and logPosition into a single object, and use an 
atomicinteger for the position - don't synchronise, just bump the position 
however much you need, then write to the owned range. On flush swap the object 
(use an AtomicReference to track the current buffer)
* On flush, append directly from the byte buffer, don't copy it. Create a 
FileOutputStream and call its appropriate write method with the range that is 
in use
* On the read path, you're now eagerly reading _all_ files which is likely to 
blow up the heap; at least create an Iterator that only reads a whole file at 
once (preferably read a chunk of a file at a time, with a BufferedInputStream)
* On replay timing we want to target hitting the same delta from epoch for 
running the query, not the delta from the prior query - this should help 
prevent massive timing drifts
* Query frequency can be an int rather than an Integer to avoid unboxing
* I think it would be nice if we checked the actual CFMetaData for the 
keyspaces we're modifying in the CQLStatement, rather than doing a find within 
the whole string, but it's not too big a deal
* atomicCounterLock needs to be removed
* As a general rule, never copy array contents with a loop - always use 
System.arraycopy
* Still need to log the thread + session id as Jonathan mentioned




was (Author: benedict):
Some comments on the in progress patch:

* Don't create a string with the header and convert it to bytes - convert the 
string to bytes and write a normal byte-encoded header with timestamp + length 
as a long. This will make encoding the prepared statement parameters much 
easier also
* Encapsulate queryQue and logPosition into a single object, and use an 
atomicinteger for the position - don't synchronise, just bump the position 
however much you need, then write to the owned range. On flush swap the object 
(use an AtomicReference to track the current buffer)
* On flush, append directly from the byte buffer, don't copy it. Create a 
FileOutputStream and call its appropriate write method with the range that is 
in use
* On the read path, you're now eagerly reading _all_ files which is likely to 
blow up the heap; at least create an Iterator that only reads a whole file at 
once (preferably read a chunk of a file at a time, with a BufferedInputStream)
* On replay timing we want to target hitting the same delta from epoch for 
running the query, not the delta from the prior query - this should help 
prevent massive timing drifts
* Query frequency can be an int rather than an Integer to avoid unboxing
* I think it would be nice if we checked the actual CFMetaData for the 
keyspaces we're modifying in the CQLStatement, rather than doing a find within 
the whole string, but it's not too big a deal
* atomicCounterLock needs to be removed
* As a general rule, never copy array contents with a loop - always use 
System.arraycopy
* Still need to log the thread + query id as Jonathan mentioned



 Workload recording / playback
 -

 Key: CASSANDRA-6572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Lyuben Todorov
 Fix For: 2.1.1

 Attachments: 6572-trunk.diff


 Write sample mode gets us part way to testing new versions against a real 
 world workload, but we need an easy way to test the query side as well.



--


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986597#comment-13986597
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

Ahah, it is uncommitted. Well I just did commit it for info. It's just one 
uninteresting test though, not sure why I never committed it.

 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.8


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.



--


[jira] [Created] (CASSANDRA-7129) Consider allowing clients to make the Paging State available to users

2014-05-01 Thread JIRA
Michaël Figuière created CASSANDRA-7129:
---

 Summary: Consider allowing clients to make the Paging State 
available to users
 Key: CASSANDRA-7129
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7129
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Michaël Figuière


This is a follow up to a ticket that has been opened on the DataStax Java 
Driver JIRA (https://datastax-oss.atlassian.net/browse/JAVA-323).

Currently the Paging State is described as an internal data structure that 
might change in any upcoming version. As a consequence it isn't safe to make it 
available to users of the Cassandra Drivers.

It would be an interesting feature to work on making Cassandra safe against 
all the situations that might arise after unleashing paging states in the wild 
on the client side: they could end up being included in some web cookies, 
allowing malicious users to forge them; we might also have some compatibility 
issues as some paging states might come back to Cassandra after an upgrade of 
the cluster; and so on.

If the discussion in this ticket turns out to conclude that the paging state 
SHOULD NOT be made available to users, at least it will be a clarification of 
something that was mostly implicit (AFAIK) so far.



--


[jira] [Commented] (CASSANDRA-6855) Native protocol V3

2014-05-01 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986621#comment-13986621
 ] 

Michael Shuler commented on CASSANDRA-6855:
---

There were a couple of other tests that SpecificOptions.DEFAULT,3 indeed 
fixed. Thanks!

 Native protocol V3
 --

 Key: CASSANDRA-6855
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6855
 Project: Cassandra
  Issue Type: New Feature
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 2.1 beta2

 Attachments: auth_test_dtest-2.1~git9872b74.node1.log, 
 auth_test_dtest-2.1~git9872b74.txt


 I think we need a V3 of the protocol for 2.1. The things this 
 could/should include are:
 # Adding an optional Serial CL for protocol batches (like we have for QUERY 
 and EXECUTE). It was an oversight not to add it in V2, and now that we can 
 batch conditional updates, it's definitely missing.
 # Proper type codes for UDT. This is not *strictly* needed to be able to 
 support UDT since currently a UDT will be sent as a custom type with its 
 full class name + arguments. But parsing that is neither fun nor convenient 
 for clients. It's also not particularly space efficient (though that's 
 probably not a huge concern since with prepared statements you can avoid 
 sending the ResultSet metadata every time).
 # Serialization format for collections. Currently the serialization format 
 only allows 65K elements, each of at most 65K bytes. While collections are 
 not meant to store large amounts of data, having the limitation in the 
 protocol serialization format is the wrong way to deal with that. 
 Concretely, the current workaround for CASSANDRA-5428 is ugly. I'll note 
 that the current serialization format is also an obstacle to supporting 
 null inside collections (whether or not we want to support null there is a 
 good question, but here again I'm not sure being limited by the 
 serialization format is a good idea).
 # CASSANDRA-6178: I continue to believe that in many cases it makes 
 somewhat more sense to have the default timestamp provided by the client 
 (this is a necessary condition for true idempotent retries in particular). 
 I'm absolutely fine making that optional and leaving server-side generated 
 timestamps by default, but since clients can already provide timestamps in 
 the query string anyway, I don't see a big deal in making it easier for 
 client drivers to control that without messing with the query string.
 # Optional names for values in QUERY messages: it has been brought to my 
 attention that while V2 allows sending a query string with values for a 
 one-roundtrip bind-and-execute, a driver can't really support named bind 
 markers with that feature properly without parsing the query. The 
 proposition is thus to make it (optionally) possible to ship the name of 
 the marker each value is supposed to be bound to.
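The collection limits mentioned in point 3 follow from using unsigned-short lengths on the wire for both the element count and each element's size. A rough sketch of that pre-V3 framing, assuming this simplified shape (it is not the exact server code):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

public class CollectionV2Format {
    // Encode a collection as: [2-byte count][2-byte len, bytes]*.
    // Because both fields are unsigned shorts, the format caps collections
    // at 65535 elements of at most 65535 bytes each, which is the limit
    // the V3 discussion above wants to lift.
    static ByteBuffer serialize(List<ByteBuffer> elements) {
        if (elements.size() > 65535)
            throw new IllegalArgumentException("more than 64K elements");
        int size = 2;
        for (ByteBuffer e : elements)
            size += 2 + e.remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putShort((short) elements.size());       // 2-byte element count
        for (ByteBuffer e : elements) {
            if (e.remaining() > 65535)
                throw new IllegalArgumentException("element larger than 64K");
            out.putShort((short) e.remaining());     // 2-byte element length
            out.put(e.duplicate());
        }
        out.flip();
        return out;
    }
}
```

Widening those fields to 4-byte ints in V3 removes the limit but changes the on-wire (and, with UDTs, the stored) format, which is why the text argues for doing it now and everywhere.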
 I think that 1) and 2) are enough reason to make a V3 (even if there is 
 disagreement on the rest, that is).
 3) is a little bit more involved tbh, but I do think having the current 
 limitations bolted into the protocol serialization format is wrong in the 
 long run, and it turns out that due to UDT we will start storing serialized 
 collections internally, so if we want to lift said limitation in the 
 serialization format, we should do it now and everywhere, as doing it 
 afterwards will be a lot more painful.
 4) and 5) are probably somewhat more minor, but at the same time, both are 
 completely optional (a driver won't have to support those if it doesn't 
 want to). They are really just about making things more flexible for client 
 drivers, and they are not particularly hard to support, so I don't see too 
 many reasons not to include them.
 Last but not least, I know that some may find it wrong to do a new protocol 
 version with each major release of C*, so let me state my view here: I 
 fully agree that we shouldn't make a habit of that in the long run, and 
 that's definitely *not* my objective. However, it would be silly to expect 
 that we could get everything right and forget nothing in the very first 
 version. It shouldn't be surprising that we'll have to burn a few versions 
 (and there might be a few more yet) before getting something more stable 
 and complete, and I think that delaying the addition of stuff that is 
 useful to create some fake notion of stability would be even more silly. On 
 the bright side, the additions of this V3 are comparatively much simpler to 
 implement for a client than those of V2 (in fact, for clients that want to 
 support UDT, it will probably require less effort to add the changes for 
 this new version than to try to support UDT without it), so I do think we 
 are making good progress on getting the protocol stabilized 



--

[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986626#comment-13986626
 ] 

Benedict commented on CASSANDRA-4718:
-

For comparison, a graph of Jason's results: 
https://docs.google.com/spreadsheets/d/1mLxyY9syaAlDb1ALGQ-oF7Qo0tQffbcNgFMVPktde88/edit?usp=sharing

I'd like to do a couple of things here: 
# Tweak the Low Signal patch to potentially signal more intelligently rather 
than just always aggregating the last 5us of requests
# Try increasing the queue length
# Try these tests for a standardized load - the stress functionality we're 
using is great for giving a good ballpark idea of performance, but it varies 
the number of ops with each run, so running with a fixed 10M ops per run might 
be useful (stress could maybe do with an ops per thread option, as for the 
low thread counts this is a lot of work, but for high counts not very much)

The lowsignal patch looks to outperform at certain thresholds, but underperform 
at others, and I'm hoping 1 and 2 might help us make it better overall. At high 
thread counts the difference is almost 20% for writes, which is non-trivial.

 More-efficient ExecutorService for improved throughput
 --

 Key: CASSANDRA-4718
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
Priority: Minor
  Labels: performance
 Fix For: 2.1.0

 Attachments: 4718-v1.patch, PerThreadQueue.java, 
 backpressure-stress.out.txt, baq vs trunk.png, op costs of various 
 queues.ods, stress op rate with various queues.ods, v1-stress.out


 Currently all our execution stages dequeue tasks one at a time.  This can 
 result in contention between producers and consumers (although we do our best 
 to minimize this by using LinkedBlockingQueue).
 One approach to mitigating this would be to make consumer threads do more 
 work in bulk instead of just one task per dequeue.  (Producer threads tend 
 to be single-task oriented by nature, so I don't see an equivalent 
 opportunity there.)
 BlockingQueue has a drainTo(collection, int) method that would be perfect for 
 this.  However, no ExecutorService in the jdk supports using drainTo, nor 
 could I google one.
 What I would like to do here is create just such a beast and wire it into (at 
 least) the write and read stages.  (Other possible candidates for such an 
 optimization, such as the CommitLog and OutboundTCPConnection, are not 
 ExecutorService-based and will need to be one-offs.)
 AbstractExecutorService may be useful.  The implementations of 
 ICommitLogExecutorService may also be useful. (Despite the name these are not 
 actual ExecutorServices, although they share the most important properties of 
 one.)



--


git commit: Update versions and add licenses for 2.1-beta2 release

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 9f60c55ba -> 48727b4cc


Update versions and add licenses for 2.1-beta2 release


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48727b4c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48727b4c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48727b4c

Branch: refs/heads/cassandra-2.1
Commit: 48727b4ccb3930d4cd56c1fde3784f7e862a2f94
Parents: 9f60c55
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 16:40:15 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 16:40:15 2014 +0200

--
 .rat-excludes|  1 +
 build.xml|  2 +-
 conf/logback-tools.xml   | 19 +++
 conf/logback.xml | 19 +++
 debian/changelog |  6 ++
 .../io/util/ChecksummedSequentialWriter.java | 17 +
 test/conf/logback-test.xml   | 19 +++
 .../stress/settings/OptionCompaction.java| 17 +
 8 files changed, 99 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index 871f5df..0da5ab9 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -5,6 +5,7 @@ debian/**
 **/.project
 **/.pydevproject
 CHANGES.txt
+README.asc
 .git/**
 **/*.json
 **/*.patch

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/build.xml
--
diff --git a/build.xml b/build.xml
index ba29b37..36b9998 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
     <property name="debuglevel" value="source,lines,vars"/>
 
     <!-- default version and SCM information -->
-    <property name="base.version" value="2.1.0-beta1"/>
+    <property name="base.version" value="2.1.0-beta2"/>
     <property name="scm.connection" 
              value="scm:git://git.apache.org/cassandra.git"/>
     <property name="scm.developerConnection" 
              value="scm:git://git.apache.org/cassandra.git"/>
     <property name="scm.url" 
              value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/conf/logback-tools.xml
--
diff --git a/conf/logback-tools.xml b/conf/logback-tools.xml
index c472ae4..ade6c12 100644
--- a/conf/logback-tools.xml
+++ b/conf/logback-tools.xml
@@ -1,3 +1,22 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
 <configuration>
   <appender name="STDERR" target="System.err" 
             class="ch.qos.logback.core.ConsoleAppender">
     <encoder>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/conf/logback.xml
--
diff --git a/conf/logback.xml b/conf/logback.xml
index 2657174..61e5a13 100644
--- a/conf/logback.xml
+++ b/conf/logback.xml
@@ -1,3 +1,22 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
 <configuration scan="true">
   <jmxConfigurator />
   <appender name="FILE" 
            class="ch.qos.logback.core.rolling.RollingFileAppender">


[jira] [Commented] (CASSANDRA-7128) Upgrade NBHM

2014-05-01 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986640#comment-13986640
 ] 

T Jake Luciani commented on CASSANDRA-7128:
---

Can't I use this ticket? I updated the title...

 Upgrade NBHM
 

 Key: CASSANDRA-7128
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7128
 Project: Cassandra
  Issue Type: Improvement
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Trivial
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 7128.txt, Screen Shot 2014-04-30 at 11.07.22 PM.png


 AbstractRowResolver uses an NBHM for each read request.  
 The profiler flagged this as a bottleneck, since the init() call creates an 
 AtomicReferenceFieldUpdater which is stored in a synchronized collection.
 An NBHM is almost certainly overkill for such a short-lived object, and it 
 turns out that switching it to a CHM in my tests yields a ~5-10% read boost.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
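
The switch described above (a per-request NonBlockingHashMap replaced by a plain ConcurrentHashMap) can be sketched in isolation. This is a minimal, hypothetical stand-in for the reply map in AbstractRowResolver, not Cassandra's actual code:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the per-read-request reply map in
// AbstractRowResolver. A ConcurrentHashMap keeps the thread safety the
// resolver needs, without NonBlockingHashMap's initialization overhead
// (the AtomicReferenceFieldUpdater stored in a synchronized collection),
// which matters because one of these maps is created per read request.
public class ReplyCollector
{
    private final ConcurrentMap<InetAddress, String> replies = new ConcurrentHashMap<>();

    public void preprocess(InetAddress from, String payload)
    {
        replies.put(from, payload);
    }

    public int replyCount()
    {
        return replies.size();
    }

    public static void main(String[] args) throws UnknownHostException
    {
        ReplyCollector collector = new ReplyCollector();
        collector.preprocess(InetAddress.getByName("127.0.0.1"), "row-data");
        collector.preprocess(InetAddress.getByName("127.0.0.2"), "row-data");
        assert collector.replyCount() == 2;
        System.out.println("replies=" + collector.replyCount());
    }
}
```

Since the map lives only for the duration of one request, none of NBHM's advantages under long-lived, high-contention workloads apply here.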


Git Push Summary

2014-05-01 Thread slebresne
Repository: cassandra
Updated Tags:  refs/tags/2.1.0-beta2-tentative [created] 48727b4cc


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-05-01 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986642#comment-13986642
 ] 

Michael Shuler commented on CASSANDRA-6694:
---

Comparing the default 2.1 dtest runs with offheap dtest runs shows the same 
results:
http://cassci.datastax.com/job/cassandra-2.1_dtest/132/
http://cassci.datastax.com/job/cassandra-2.1_offheap_dtest/4/

The 2.1 unit tests got a default config of memtable_allocation_type: 
offheap_objects with this commit, and they appear pretty stable (some new unit 
test errors have come up since CASSANDRA-6855, but no big regression appears 
to have come from this commit).

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 6694.fix1.txt


 The off-heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16 bytes per Cell (plus 4-6 
 bytes per cell on average for the btree overhead, for a total overhead of 
 around 20-22 bytes). This translates to an 8-byte object overhead, a 4-byte 
 address (we will do alignment tricks like the VM to allow us to address a 
 reasonably large memory space, although this trick is unlikely to last us 
 forever, at which point we will have to bite the bullet and accept a 24-byte 
 per-cell overhead), and a 4-byte object reference for maintaining our internal 
 list of allocations. That list is unfortunately necessary, since we cannot 
 otherwise safely (and cheaply) walk the object graph we allocate, which is 
 needed for (allocation-)compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.





[jira] [Commented] (CASSANDRA-7128) Upgrade NBHM

2014-05-01 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986655#comment-13986655
 ] 

Jonathan Ellis commented on CASSANDRA-7128:
---

I thought cleaning out the links and attachments would be more work than a new 
issue, but go for it.



[jira] [Updated] (CASSANDRA-6831) Updates to COMPACT STORAGE tables via cli drop CQL information

2014-05-01 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6831:


Attachment: 0002-CompoundDenseCellNameType-fix.txt
0001-Properly-recompute-denseness-of-the-table-on-thrift-up.txt

[~mishail] You might not be testing the right commit (or using the right test), 
because I do get the test failure Tyler mentions.

The problem is a tad subtle, and not directly related to the patch (which 
mainly uncovered something that was wrong before). What happens is that when 
we build the comparator in fromThrift, we don't take the copied CQL metadata 
into account to decide whether the table is dense or not. This results in the 
addition of the index, which adds column_metadata where there was none, 
switching the comparator from an initially dense one to a sparse one. That is 
what confuses the rebuild(), since we end up with a non-composite sparse 
comparator and a CLUSTERING_KEY definition, which are incompatible.

Now, the reason I said this is not directly related to the patch is that before 
this patch, fromThrift was already returning a sparse comparator when it 
shouldn't have (since the table was initially dense), but because we were not 
copying the CQL metadata, no exception was triggered, and the mistake was 
repaired as soon as the update was written to disk, since the previous CQL 
metadata were then picked up and the table was left dense, as it should be. In 
other words, the code was fishy and not doing what we meant it to do, but that 
had no visible consequence out of sheer luck.

Anyway, I'm attaching a patch that slightly refactors fromThrift so that we do 
take all metadata into account when computing isDense(). The patch is 2.1 
only; 2.0 is not affected because we don't compute isDense in the same places.

I'll note that while that makes the pycassa tests run, there are 3 failures.  
One of them is actually due to the fact that the CASSANDRA-6738 patch was 
incomplete, so I'm attaching a simple 2nd patch to fix that (I can create a 
separate issue for it if people prefer, but it's a simple fix). The other two 
failures are super-column related, but I haven't yet looked into them (it's 
likely they are not related to this ticket).


 Updates to COMPACT STORAGE tables via cli drop CQL information
 --

 Key: CASSANDRA-6831
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6831
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Bradberry
Assignee: Mikhail Stepura
Priority: Minor
 Fix For: 1.2.17, 2.0.8, 2.1 beta2

 Attachments: 
 0001-Properly-recompute-denseness-of-the-table-on-thrift-up.txt, 
 0002-CompoundDenseCellNameType-fix.txt, 6831-1.2.patch, 6831-2.0-v2.txt, 
 6831-2.1.patch


 If a COMPACT STORAGE table is altered using the CLI, all information about the 
 column names reverts to the initial key, column1, column2 namings.  
 Additionally, the changes in the column names will not take effect until the 
 Cassandra service is restarted.  This means that clients using CQL will 
 continue to work properly until the service is restarted, at which time they 
 will start getting errors about non-existent columns in the table.
 When attempting to rename the columns back using ALTER TABLE, an error stating 
 that the column already exists will be raised.  The only way to get them back 
 is to ALTER TABLE and change the comment or something, which will bring back 
 all the original column names.
 This seems to be related to CASSANDRA-6676 and CASSANDRA-6370
 In cqlsh
 {code}
 Connected to cluster1 at 127.0.0.3:9160.
 [cqlsh 3.1.8 | Cassandra 1.2.15-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 
 19.36.2]
 Use HELP for help.
 cqlsh CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 
 'replication_factor' : 3 };
 cqlsh USE test;
 cqlsh:test CREATE TABLE foo (bar text, baz text, qux text, PRIMARY KEY(bar, 
 baz) ) WITH COMPACT STORAGE;
 cqlsh:test describe table foo;
 CREATE TABLE foo (
   bar text,
   baz text,
   qux text,
   PRIMARY KEY (bar, baz)
 ) WITH COMPACT STORAGE AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};
 {code}
 Now in cli:
 {code}
   Connected to: cluster1 on 127.0.0.3/9160
 Welcome to Cassandra CLI version 1.2.15-SNAPSHOT
 Type 'help;' or '?' for help.
 Type 'quit;' or 'exit;' to quit.
 [default@unknown] use test;
 Authenticated to keyspace: test
 [default@test] UPDATE COLUMN FAMILY foo 

[jira] [Commented] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Alex P (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986672#comment-13986672
 ] 

Alex P commented on CASSANDRA-7123:
---

I agree that the first part is better, but the 2nd part refers to a Cassandra 
operation priority order which, afaict, is not defined anywhere in the spec, 
thus making this part raise questions (or, even worse, it might lead users to 
depend on this sort of order instead of using timestamps). 

{quote}
if no timestamp is provided for each operation, then all operations will be 
applied with the same timestamp and that might result to an order that doesn't 
correspond to the order they are declared in the BATCH. You can force a 
particular operation ordering by using per-operation timestamps
{quote} 
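
The behavior the quoted wording is trying to document can be shown with a toy last-write-wins reconciler. Class and method names here are illustrative, not Cassandra's internals; the equal-timestamp tie-break on value bytes reflects how Cassandra resolves such conflicts, which is exactly why the "later" statement in a BATCH does not necessarily win:

```java
// Minimal sketch of timestamp-based cell reconciliation: the write with the
// higher timestamp wins, so two BATCH operations sharing one timestamp have
// no defined winner by position in the BATCH.
public class CellReconciler
{
    record Cell(String value, long timestamp) {}

    static Cell reconcile(Cell a, Cell b)
    {
        if (a.timestamp() != b.timestamp())
            return a.timestamp() > b.timestamp() ? a : b;
        // Same timestamp: fall back to comparing the values themselves,
        // so statement order in the BATCH is irrelevant.
        return a.value().compareTo(b.value()) >= 0 ? a : b;
    }

    public static void main(String[] args)
    {
        // Explicit per-operation timestamps force an ordering...
        Cell first = new Cell("insert", 1L);
        Cell second = new Cell("update", 2L);
        assert reconcile(first, second).value().equals("update");

        // ...while a shared timestamp falls back to the value tie-break.
        Cell x = new Cell("aaa", 5L);
        Cell y = new Cell("zzz", 5L);
        assert reconcile(x, y).value().equals("zzz");
        System.out.println("ok");
    }
}
```

Per-operation `USING TIMESTAMP` in the BATCH is the only reliable way to get a particular outcome.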



 BATCH documentation should be explicit about ordering guarantees
 

 Key: CASSANDRA-7123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7123
 Project: Cassandra
  Issue Type: Task
  Components: Documentation & website
Reporter: Tyler Hobbs
Assignee: Tyler Hobbs
Priority: Minor
 Attachments: 7123.txt


 In the CQL3 [batch statement 
 documentation](http://cassandra.apache.org/doc/cql3/CQL.html#batchStmt) we 
 don't mention that there are no ordering guarantees, which can lead to 
 somewhat surprising behavior (CASSANDRA-6291).
 We should also mention that you could specify timestamps in order to achieve 
 a particular ordering.





[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-01 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986688#comment-13986688
 ] 

Marcus Eriksson commented on CASSANDRA-6696:


Pushed a semi-working sstable-per-vnode version here: 
https://github.com/krummas/cassandra/commits/marcuse/6696-3 (by no means 
review-ready)

* flushes to vnode-separate sstables, spread out over the available disks
* keeps the sstables separate during compaction: for STCS by grouping the 
compaction buckets by overlapping sstables, and for LCS by keeping a separate 
manifest for every vnode.

Still quite broken, but I think good enough to evaluate whether we want to go 
this way. The drawback is mainly that it takes a looong time to flush to 768 
sstables instead of one (768 = num_tokens=256 and rf=3). Doing 768 parallel 
compactions is also quite heavy. 

Unless anyone has a brilliant idea how to make flushing and compaction less 
heavy, I think we need some sort of balance here, maybe grouping the vnodes (8 
or 16 vnodes per sstable perhaps?) so that we flush a more reasonable amount of 
sstables, or even just going with the per-disk approach? 
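
The flush-count arithmetic above is just num_tokens × rf, divided down by however many vnodes share an sstable. A quick sketch (plain arithmetic, not Cassandra code):

```java
// Flush-count arithmetic for vnode-separate sstables: one flush produces
// num_tokens * rf files (one per vnode range the node holds data for);
// grouping vnodes per sstable divides that back down.
public class VnodeFlushMath
{
    static int sstablesPerFlush(int numTokens, int rf, int vnodesPerSSTable)
    {
        int ranges = numTokens * rf; // vnode ranges this node holds data for
        return (ranges + vnodesPerSSTable - 1) / vnodesPerSSTable; // ceiling division
    }

    public static void main(String[] args)
    {
        assert sstablesPerFlush(256, 3, 1) == 768;  // one sstable per vnode, as in the comment
        assert sstablesPerFlush(256, 3, 8) == 96;   // grouping 8 vnodes per sstable
        assert sstablesPerFlush(256, 3, 16) == 48;  // grouping 16 vnodes per sstable
        System.out.println("ok");
    }
}
```

So grouping by 8 or 16 brings a flush from 768 files down to 96 or 48, which is roughly the balance the comment is asking about.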

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. The same is true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A, B and C with RF=3 and GC grace=10 days. 
 row=sankalp col=sankalp was written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than gc grace, it got compacted away in nodes A 
 and B together with the actual data. So there is no trace of this row column 
 in nodes A and B.
 Now in node C, say the original data is in drive1 and the tombstone is in 
 drive2. Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and is replaced with a new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now, after replacing the drive, we run repair. This data will be propagated to 
 all nodes. 
 Note: this is still a problem even if we run repair every gc grace period. 
  





git commit: Update debian packaging following README extension change

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 44a5fc1df -> d5a9c9254


Update debian packaging following README extension change


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d5a9c925
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d5a9c925
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d5a9c925

Branch: refs/heads/cassandra-2.0
Commit: d5a9c9254ff418541f22dcb335ca48b139d22ff5
Parents: 44a5fc1
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 17:59:35 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 17:59:35 2014 +0200

--
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d5a9c925/debian/rules
--
diff --git a/debian/rules b/debian/rules
index 11b78e7..7261aa0 100755
--- a/debian/rules
+++ b/debian/rules
@@ -60,7 +60,7 @@ binary-indep: build install
dh_testroot
dh_installchangelogs
dh_installinit -u'start 50 2 3 4 5 . stop 50 0 1 6 .'
-   dh_installdocs README.txt CHANGES.txt NEWS.txt
+   dh_installdocs README.asc CHANGES.txt NEWS.txt
dh_compress
dh_fixperms
dh_installdeb



[1/2] git commit: Update debian packaging following README extension change

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 48727b4cc -> e786101eb


Update debian packaging following README extension change


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d5a9c925
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d5a9c925
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d5a9c925

Branch: refs/heads/cassandra-2.1
Commit: d5a9c9254ff418541f22dcb335ca48b139d22ff5
Parents: 44a5fc1
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 17:59:35 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 17:59:35 2014 +0200

--
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d5a9c925/debian/rules
--
diff --git a/debian/rules b/debian/rules
index 11b78e7..7261aa0 100755
--- a/debian/rules
+++ b/debian/rules
@@ -60,7 +60,7 @@ binary-indep: build install
dh_testroot
dh_installchangelogs
dh_installinit -u'start 50 2 3 4 5 . stop 50 0 1 6 .'
-   dh_installdocs README.txt CHANGES.txt NEWS.txt
+   dh_installdocs README.asc CHANGES.txt NEWS.txt
dh_compress
dh_fixperms
dh_installdeb



[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread slebresne
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e786101e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e786101e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e786101e

Branch: refs/heads/cassandra-2.1
Commit: e786101ebabb17135d6b03a76769c08b863c
Parents: 48727b4 d5a9c92
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 18:00:03 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 18:00:03 2014 +0200

--
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e786101e/debian/rules
--



[4/5] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread slebresne
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e786101e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e786101e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e786101e

Branch: refs/heads/trunk
Commit: e786101ebabb17135d6b03a76769c08b863c
Parents: 48727b4 d5a9c92
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 18:00:03 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 18:00:03 2014 +0200

--
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e786101e/debian/rules
--



[1/5] git commit: Support consistent range movements.

2014-05-01 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/trunk c0be34182 -> d8490ccea


Support consistent range movements.

patch by tjake; reviewed by thobbs for CASSANDRA-2434


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f60c55b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f60c55b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f60c55b

Branch: refs/heads/trunk
Commit: 9f60c55ba42ff56aa58c3790b9c55924c4deedf4
Parents: 233761e
Author: T Jake Luciani j...@apache.org
Authored: Thu May 1 09:47:22 2014 -0400
Committer: T Jake Luciani j...@apache.org
Committed: Thu May 1 09:47:22 2014 -0400

--
 CHANGES.txt |  1 +
 NEWS.txt|  5 ++
 .../org/apache/cassandra/dht/BootStrapper.java  |  2 +-
 .../org/apache/cassandra/dht/RangeStreamer.java | 81 +++-
 .../cassandra/service/StorageService.java   | 44 ++-
 5 files changed, 127 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 34533cc..be72ad1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -56,6 +56,7 @@
  * Optimize cellname comparison (CASSANDRA-6934)
  * Native protocol v3 (CASSANDRA-6855)
  * Optimize Cell liveness checks and clean up Cell (CASSANDRA-7119)
+ * Support consistent range movements (CASSANDRA-2434)
 Merged from 2.0:
  * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
  * Set JMX RMI port to 7199 (CASSANDRA-7087)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 86c6f64..5d59460 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -30,6 +30,11 @@ New features
  repair session. Use nodetool repair -par -inc to use this feature.
  A tool to manually mark/unmark sstables as repaired is available in
  tools/bin/sstablerepairedset.
+   - Bootstrapping now ensures that range movements are consistent,
+ meaning the data for the new node is taken from the node that is no 
+ longer a responsible for that range of keys.  
+ If you want the old behavior (due to a lost node perhaps)
+ you can set the following property (-Dconsistent.rangemovement=false)
 
 Upgrading
 -

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/src/java/org/apache/cassandra/dht/BootStrapper.java
--
diff --git a/src/java/org/apache/cassandra/dht/BootStrapper.java 
b/src/java/org/apache/cassandra/dht/BootStrapper.java
index 343748b..cbbd100 100644
--- a/src/java/org/apache/cassandra/dht/BootStrapper.java
+++ b/src/java/org/apache/cassandra/dht/BootStrapper.java
@@ -63,7 +63,7 @@ public class BootStrapper
 if (logger.isDebugEnabled())
 logger.debug(Beginning bootstrap process);
 
-RangeStreamer streamer = new RangeStreamer(tokenMetadata, address, 
Bootstrap);
+RangeStreamer streamer = new RangeStreamer(tokenMetadata, tokens, 
address, Bootstrap);
 streamer.addSourceFilter(new 
RangeStreamer.FailureDetectorSourceFilter(FailureDetector.instance));
 
 for (String keyspaceName : Schema.instance.getNonSystemKeyspaces())

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f60c55b/src/java/org/apache/cassandra/dht/RangeStreamer.java
--
diff --git a/src/java/org/apache/cassandra/dht/RangeStreamer.java 
b/src/java/org/apache/cassandra/dht/RangeStreamer.java
index 7ab39a4..2308d30 100644
--- a/src/java/org/apache/cassandra/dht/RangeStreamer.java
+++ b/src/java/org/apache/cassandra/dht/RangeStreamer.java
@@ -23,6 +23,8 @@ import java.util.*;
 import com.google.common.collect.ArrayListMultimap;
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.Multimap;
+import com.google.common.collect.Sets;
+import org.apache.cassandra.gms.EndpointState;
 import org.apache.commons.lang3.StringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -30,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.gms.FailureDetector;
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.gms.IFailureDetector;
 import org.apache.cassandra.locator.AbstractReplicationStrategy;
 import org.apache.cassandra.locator.IEndpointSnitch;
@@ -44,7 +47,8 @@ import org.apache.cassandra.utils.FBUtilities;
 public class RangeStreamer
 {
 private static final Logger logger = 

[3/5] git commit: Update debian packaging following README extension change

2014-05-01 Thread slebresne
Update debian packaging following README extension change


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d5a9c925
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d5a9c925
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d5a9c925

Branch: refs/heads/trunk
Commit: d5a9c9254ff418541f22dcb335ca48b139d22ff5
Parents: 44a5fc1
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 17:59:35 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 17:59:35 2014 +0200

--
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d5a9c925/debian/rules
--
diff --git a/debian/rules b/debian/rules
index 11b78e7..7261aa0 100755
--- a/debian/rules
+++ b/debian/rules
@@ -60,7 +60,7 @@ binary-indep: build install
dh_testroot
dh_installchangelogs
dh_installinit -u'start 50 2 3 4 5 . stop 50 0 1 6 .'
-   dh_installdocs README.txt CHANGES.txt NEWS.txt
+   dh_installdocs README.asc CHANGES.txt NEWS.txt
dh_compress
dh_fixperms
dh_installdeb



[5/5] git commit: Merge branch 'cassandra-2.1' into trunk

2014-05-01 Thread slebresne
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d8490cce
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d8490cce
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d8490cce

Branch: refs/heads/trunk
Commit: d8490ccea1e38c1a4246a025fc01b76191387a0f
Parents: c0be341 e786101
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 18:00:31 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 18:00:31 2014 +0200

--
 .rat-excludes   |  1 +
 CHANGES.txt |  1 +
 NEWS.txt|  5 ++
 build.xml   |  2 +-
 conf/logback-tools.xml  | 19 +
 conf/logback.xml| 19 +
 debian/changelog|  6 ++
 debian/rules|  2 +-
 .../org/apache/cassandra/dht/BootStrapper.java  |  2 +-
 .../org/apache/cassandra/dht/RangeStreamer.java | 81 +++-
 .../io/util/ChecksummedSequentialWriter.java| 17 
 .../cassandra/service/StorageService.java   | 44 ++-
 test/conf/logback-test.xml  | 19 +
 .../stress/settings/OptionCompaction.java   | 17 
 14 files changed, 227 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8490cce/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8490cce/NEWS.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8490cce/build.xml
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8490cce/src/java/org/apache/cassandra/service/StorageService.java
--



[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986700#comment-13986700
 ] 

Benedict commented on CASSANDRA-6696:
-

I think we may be able to get a good half-way house by setting a minimum 
sstable size below which we aggregate vnodes into a single sstable, ensuring we 
always keep a whole vnode in one table (unless that vnode is larger than the 
maximum sstable size, in which case we split it, and it alone). This should be 
cost-free and tend rapidly towards separate sstables per vnode for all but the 
most recent data, which could simply ALL be copied over to any nodes we want to 
duplicate data to, as the overhead would be approximately constant regardless 
of the amount of data the node is managing. We could introduce a tool to split 
out a single token range from those files for users who want to avoid this 
fixed overhead cost.
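
The half-way house above can be sketched as a simple bucketing pass over vnodes in token order. This is purely illustrative (hypothetical names, not Cassandra's flush code): small vnodes aggregate into a shared sstable until a minimum size is reached, while a vnode that alone exceeds the threshold seals its own group.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of min-size vnode grouping: walk vnodes in token
// order, accumulating them into one sstable group until the group reaches
// the minimum sstable size, then seal the group and start a new one.
// A vnode already at or above the threshold forms a group by itself.
public class VnodeGrouper
{
    static List<List<Long>> group(long[] vnodeSizes, long minSSTableSize)
    {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : vnodeSizes)
        {
            current.add(size);
            currentSize += size;
            if (currentSize >= minSSTableSize) // group is big enough: seal it
            {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
        }
        if (!current.isEmpty())
            groups.add(current); // leftover small vnodes share a final sstable
        return groups;
    }

    public static void main(String[] args)
    {
        // min size 100: three 40s aggregate, a 150 stands alone,
        // and the trailing 10 ends up in a leftover group.
        List<List<Long>> g = group(new long[]{ 40, 40, 40, 150, 10 }, 100);
        assert g.size() == 3;
        assert g.get(0).size() == 3; // 40+40+40 = 120 >= 100
        assert g.get(1).size() == 1; // 150 alone
        assert g.get(2).size() == 1; // trailing 10
        System.out.println(g);
    }
}
```

The point of the policy is that only the most recent (smallest) data pays the aggregation cost, so the layout tends towards one sstable per vnode as data accumulates.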



[2/5] git commit: Update versions and add licenses for 2.1-beta2 release

2014-05-01 Thread slebresne
Update versions and add licenses for 2.1-beta2 release


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48727b4c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48727b4c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48727b4c

Branch: refs/heads/trunk
Commit: 48727b4ccb3930d4cd56c1fde3784f7e862a2f94
Parents: 9f60c55
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 1 16:40:15 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 1 16:40:15 2014 +0200

--
 .rat-excludes|  1 +
 build.xml|  2 +-
 conf/logback-tools.xml   | 19 +++
 conf/logback.xml | 19 +++
 debian/changelog |  6 ++
 .../io/util/ChecksummedSequentialWriter.java | 17 +
 test/conf/logback-test.xml   | 19 +++
 .../stress/settings/OptionCompaction.java| 17 +
 8 files changed, 99 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index 871f5df..0da5ab9 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -5,6 +5,7 @@ debian/**
 **/.project
 **/.pydevproject
 CHANGES.txt
+README.asc
 .git/**
 **/*.json
 **/*.patch

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/build.xml
--
diff --git a/build.xml b/build.xml
index ba29b37..36b9998 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 <property name="debuglevel" value="source,lines,vars"/>
 
 <!-- default version and SCM information -->
-<property name="base.version" value="2.1.0-beta1"/>
+<property name="base.version" value="2.1.0-beta2"/>
 <property name="scm.connection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.developerConnection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/conf/logback-tools.xml
--
diff --git a/conf/logback-tools.xml b/conf/logback-tools.xml
index c472ae4..ade6c12 100644
--- a/conf/logback-tools.xml
+++ b/conf/logback-tools.xml
@@ -1,3 +1,22 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
 <configuration>
   <appender name="STDERR" target="System.err" class="ch.qos.logback.core.ConsoleAppender">
     <encoder>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/conf/logback.xml
--
diff --git a/conf/logback.xml b/conf/logback.xml
index 2657174..61e5a13 100644
--- a/conf/logback.xml
+++ b/conf/logback.xml
@@ -1,3 +1,22 @@
+!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ License); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+--
+
 configuration scan=true
   jmxConfigurator /
   appender name=FILE 
class=ch.qos.logback.core.rolling.RollingFileAppender

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48727b4c/debian/changelog

[jira] [Created] (CASSANDRA-7130) Make sstable checksum type configurable and optional

2014-05-01 Thread Benedict (JIRA)
Benedict created CASSANDRA-7130:
---

 Summary: Make sstable checksum type configurable and optional
 Key: CASSANDRA-7130
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7130
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 3.0


A lot of our users are becoming bottlenecked on CPU rather than IO, and whilst 
Adler32 is faster than CRC, it isn't anything like as fast as xxhash (used by 
LZ4), which can push Gb/s. I propose making the checksum type configurable so 
that users who want speed can shift to xxhash, and those who want security can 
use Adler or CRC.

It's worth noting that at some point in the future (JDK8?) optimised 
implementations using the latest Intel CRC instructions will be added, though it's 
not clear from the mailing-list discussion if/when that will materialise:

http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-May/010775.html
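For a rough sense of the trade-off, the JDK's built-in checksums can be compared directly. The sketch below only covers Adler32 and CRC32 from `java.util.zip` (xxhash would come from an external library such as lz4-java, which is not shown here), and the timing loop is illustrative rather than a rigorous benchmark:

```java
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class ChecksumCompare
{
    // Compute a checksum over the whole buffer and return its value.
    static long checksumOf(Checksum c, byte[] data)
    {
        c.reset();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    public static void main(String[] args)
    {
        byte[] data = new byte[64 * 1024 * 1024]; // 64MB of zeros, purely illustrative

        for (Checksum c : new Checksum[]{ new Adler32(), new CRC32() })
        {
            long start = System.nanoTime();
            long value = checksumOf(c, data);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(c.getClass().getSimpleName() + ": 0x"
                               + Long.toHexString(value) + " in " + elapsedMs + "ms");
        }
    }
}
```

A proper comparison would use warmed-up JIT and varied data, but even this crude loop tends to show the relative gap the ticket describes.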



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6831) Updates to COMPACT STORAGE tables via cli drop CQL information

2014-05-01 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura updated CASSANDRA-6831:
---

Assignee: Sylvain Lebresne  (was: Mikhail Stepura)

 Updates to COMPACT STORAGE tables via cli drop CQL information
 --

 Key: CASSANDRA-6831
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6831
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Russell Bradberry
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.17, 2.0.8, 2.1 beta2

 Attachments: 
 0001-Properly-recompute-denseness-of-the-table-on-thrift-up.txt, 
 0002-CompoundDenseCellNameType-fix.txt, 6831-1.2.patch, 6831-2.0-v2.txt, 
 6831-2.1.patch


 If a COMPACT STORAGE table is altered using the CLI, all information about the 
 column names reverts to the initial "key, column1, column2" namings.  
 Additionally, the changes in the column names will not take effect until the 
 Cassandra service is restarted.  This means that clients using CQL will 
 continue to work properly until the service is restarted, at which time they 
 will start getting errors about non-existent columns in the table.
 When attempting to rename the columns back using ALTER TABLE an error stating 
 the column already exists will be raised.  The only way to get it back is to 
 ALTER TABLE and change the comment or something, which will bring back all 
 the original column names.
 This seems to be related to CASSANDRA-6676 and CASSANDRA-6370
 In cqlsh
 {code}
 Connected to cluster1 at 127.0.0.3:9160.
 [cqlsh 3.1.8 | Cassandra 1.2.15-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 
 19.36.2]
 Use HELP for help.
 cqlsh> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 
 'replication_factor' : 3 };
 cqlsh> USE test;
 cqlsh:test> CREATE TABLE foo (bar text, baz text, qux text, PRIMARY KEY(bar, 
 baz) ) WITH COMPACT STORAGE;
 cqlsh:test> describe table foo;
 CREATE TABLE foo (
   bar text,
   baz text,
   qux text,
   PRIMARY KEY (bar, baz)
 ) WITH COMPACT STORAGE AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};
 {code}
 Now in cli:
 {code}
   Connected to: cluster1 on 127.0.0.3/9160
 Welcome to Cassandra CLI version 1.2.15-SNAPSHOT
 Type 'help;' or '?' for help.
 Type 'quit;' or 'exit;' to quit.
 [default@unknown] use test;
 Authenticated to keyspace: test
 [default@test] UPDATE COLUMN FAMILY foo WITH comment='hey this is a comment';
 3bf5fa49-5d03-34f0-b46c-6745f7740925
 {code}
 Now back in cqlsh:
 {code}
 cqlsh:test> describe table foo;
 CREATE TABLE foo (
   bar text,
   column1 text,
   value text,
   PRIMARY KEY (bar, column1)
 ) WITH COMPACT STORAGE AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='hey this is a comment' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};
 cqlsh:test> ALTER TABLE foo WITH comment='this is a new comment';
 cqlsh:test> describe table foo;
 CREATE TABLE foo (
   bar text,
   baz text,
   qux text,
   PRIMARY KEY (bar, baz)
 ) WITH COMPACT STORAGE AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='this is a new comment' AND
   dclocal_read_repair_chance=0.00 AND
   gc_grace_seconds=864000 AND
   read_repair_chance=0.10 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'SnappyCompressor'};
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-7123:
---

Attachment: 7123-v2.txt

v2 of the patch is the same, except for the bullet point under debate:

bq. If a timestamp is not specified for each operation, then all operations 
will be applied with the same timestamp. Due to Cassandra's conflict resolution 
procedure in the case of timestamp ties, operations may be applied in an order 
that is different from the order they are listed in the @BATCH@ statement. To 
force a particular operation ordering, you must specify per-operation 
timestamps.

This hints at timestamp ties and conflict resolution being the cause for 
unexpected operation ordering, but doesn't mention deletes winning over writes, 
or the highest value winning for normal writes.  (It should be extremely rare 
for somebody to rely on this exact conflict resolution behavior for normal 
operations.)
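The tie-breaking rules mentioned here (a delete beats a write at the same timestamp, and the higher value wins between two tied writes) can be modeled in a few lines. This is a simplified sketch of the resolution order only, not Cassandra's actual cell reconciliation code; the class and field names are invented for illustration:

```java
public class ReconcileSketch
{
    // A toy cell: value == null models a tombstone (a delete).
    static final class Cell
    {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value)
        {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Pick the winner between two versions of the same cell.
    static Cell reconcile(Cell a, Cell b)
    {
        // The higher timestamp always wins.
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        // Timestamp tie: a delete beats a write.
        if (a.value == null) return a;
        if (b.value == null) return b;
        // Two writes with tied timestamps: the greater value wins.
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    public static void main(String[] args)
    {
        Cell write = new Cell(10, "hello");
        Cell delete = new Cell(10, null);
        // The tombstone wins the tie, so the surviving value is null.
        System.out.println(reconcile(write, delete).value);
    }
}
```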

 BATCH documentation should be explicit about ordering guarantees
 

 Key: CASSANDRA-7123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7123
 Project: Cassandra
  Issue Type: Task
  Components: Documentation & website
Reporter: Tyler Hobbs
Assignee: Tyler Hobbs
Priority: Minor
 Attachments: 7123-v2.txt, 7123.txt


 In the CQL3 [batch statement 
 documentation](http://cassandra.apache.org/doc/cql3/CQL.html#batchStmt) we 
 don't mention that there are no ordering guarantees, which can lead to 
 somewhat surprising behavior (CASSANDRA-6291).
 We should also mention that you could specify timestamps in order to achieve 
 a particular ordering.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6831) Updates to COMPACT STORAGE tables via cli drop CQL information

2014-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986758#comment-13986758
 ] 

Tyler Hobbs commented on CASSANDRA-6831:


bq. One of them is actually due to the fact that the CASSANDRA-6738 patch was 
incomplete so attaching a simple 2nd patch to fix that (I can create a separate 
issue for that if people prefer but well, that's a simple fix).

I created CASSANDRA-7112 a couple of days ago for that.  Want to move the patch 
and review there?

bq. The other two failures are super-columns related but I haven't yet looked 
into them (but it's likely they are not related to this ticket).

On 7112 I mentioned that the failing super column tests may be related, but if 
they're not, we can create a new ticket for those.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Alex P (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986769#comment-13986769
 ] 

Alex P commented on CASSANDRA-7123:
---

lgtm




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986770#comment-13986770
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


bq. We can, see cql_prepared_test.py (arguably our number of tests for prepared 
statement is deeply lacking, but it's possible to have some).

Ah, thanks, good to know.

bq. I'll fight to the death the concept that unit test are a lot faster to 
work with as an absolute truth.

Don't worry, I'm not going to challenge you to a duel :).  It's not an absolute 
truth, but it's easy to do things like run a unit test with the debugger on, 
which makes a big difference in some cases.

bq. fixing those issues is likely simpler than migrating all the existing tests 
back to the unit tests.

I'm definitely not suggesting moving any existing dtests to unit tests.  I'm 
just proposing that we allow some mix of unit tests and dtests for newly 
written tests.

bq.  we could have a debate I suppose (but definitively not here)

Do you mind if I start a dev ML thread?  It would be good to get input from 
other devs and QA.



 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.8


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example\:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Alex P (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986769#comment-13986769
 ] 

Alex P edited comment on CASSANDRA-7123 at 5/1/14 5:06 PM:
---

lgtm. (the fact that we hint at the cause of the ordering without giving details 
that might be misused is imo the right approach)


was (Author: alexyz):
lgtm




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-01 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986799#comment-13986799
 ] 

Marcus Eriksson commented on CASSANDRA-6696:


this special-cases compaction a bit though; we could have sstables that overlap 
with other sstables of similar size that we can't really compact together 
(which we probably shouldn't, since they overlap too little (CASSANDRA-6474)).

for LCS i guess we could align the vnode start/end to the sstables' start/end. 
I.e., in level 1 (10 sstables) each sstable would contain ~100 vnodes, in level 2 
(100 sstables) ~10, and in level 3 (1000 sstables) 1 vnode. Then we could flush 
sstables mapped to the sstables in level 1 and only compact those together.
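The level arithmetic in that sketch works out as follows, assuming a hypothetical ~1000 vnodes per node (a figure implied by the level sizes above, not taken from any actual configuration):

```java
public class VnodesPerLevel
{
    // Rough vnodes-per-sstable for an LCS level of the given size,
    // assuming vnode ranges are aligned to sstable boundaries.
    static int vnodesPerSSTable(int totalVnodes, int sstablesInLevel)
    {
        return Math.max(1, totalVnodes / sstablesInLevel);
    }

    public static void main(String[] args)
    {
        int totalVnodes = 1000; // hypothetical per-node vnode count
        for (int level = 1, size = 10; level <= 3; level++, size *= 10)
            System.out.println("L" + level + " (" + size + " sstables): ~"
                               + vnodesPerSSTable(totalVnodes, size) + " vnodes/sstable");
    }
}
```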


 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, in which case we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
 row=sankalp col=sankalp is written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is more than gc grace, it got compacted in Nodes A and B 
 since it got compacted with the actual data. So there is no trace of this row 
 column in node A and B.
 Now in node C, say the original data is in drive1 and tombstone is in drive2. 
 Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and was replaced with new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7123) BATCH documentation should be explicit about ordering guarantees

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986805#comment-13986805
 ] 

Sylvain Lebresne commented on CASSANDRA-7123:
-

If you make "timestamp ties" link to 
http://wiki.apache.org/cassandra/FAQ#clocktie (I see no reason to hide details 
provided we properly guard against misuse of such details, which this FAQ 
does), you have my +1.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986810#comment-13986810
 ] 

Sylvain Lebresne commented on CASSANDRA-6875:
-

bq. Do you mind if I start a dev ML thread?

Absolutely not.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7131) Add command line option for cqlshrc file path

2014-05-01 Thread Jeremiah Jordan (JIRA)
Jeremiah Jordan created CASSANDRA-7131:
--

 Summary: Add command line option for cqlshrc file path
 Key: CASSANDRA-7131
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7131
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jeremiah Jordan
Priority: Trivial


It would be nice if you could specify the cqlshrc file location on the command 
line, so you don't have to jump through hoops when running it from a service 
user or something.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Document lack of order guarantees for BATCH statements

2014-05-01 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 d5a9c9254 -> 427fdd476


Document lack of order guarantees for BATCH statements

Patch by Tyler Hobbs, reviewed by Sylvain Lebresne and Alex Popescu for
CASSANDRA-7123


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/427fdd47
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/427fdd47
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/427fdd47

Branch: refs/heads/cassandra-2.0
Commit: 427fdd476a631b3850ecd643f71155a5b7dd2bf7
Parents: d5a9c92
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:03:56 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:03:56 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/427fdd47/doc/cql3/CQL.textile
--
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index f6208bf..bedd189 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -614,7 +614,11 @@ The @BATCH@ statement group multiple modification 
statements (insertions/updates
 # It saves network round-trips between the client and the server (and 
sometimes between the server coordinator and the replicas) when batching 
multiple updates.
 # All updates in a @BATCH@ belonging to a given partition key are performed in 
isolation.
 # By default, all operations in the batch are performed atomically.  See the 
notes on "@UNLOGGED@":#unloggedBatch for more details.
-Note however that the @BATCH@ statement only allows @UPDATE@, @INSERT@ and 
@DELETE@ statements and is _not_ a full analogue for SQL transactions.
+
+Note that:
+* @BATCH@ statements may only contain @UPDATE@, @INSERT@ and @DELETE@ 
statements.
+* Batches are _not_ a full analogue for SQL transactions.
+* If a timestamp is not specified for each operation, then all operations will 
be applied with the same timestamp. Due to Cassandra's conflict resolution 
procedure in the case of "timestamp 
ties":http://wiki.apache.org/cassandra/FAQ#clocktie, operations may be applied 
in an order that is different from the order they are listed in the @BATCH@ 
statement. To force a particular operation ordering, you must specify 
per-operation timestamps.
 
 h4(#unloggedBatch). @UNLOGGED@
 



[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread tylerhobbs
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1491f75d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1491f75d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1491f75d

Branch: refs/heads/trunk
Commit: 1491f75dd6313b2f17450e31184947d447ef6808
Parents: e786101 427fdd4
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:05:52 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:05:52 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1491f75d/doc/cql3/CQL.textile
--



[1/2] git commit: Document lack of order guarantees for BATCH statements

2014-05-01 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 e786101eb -> 1491f75dd


Document lack of order guarantees for BATCH statements

Patch by Tyler Hobbs, reviewed by Sylvain Lebresne and Alex Popescu for
CASSANDRA-7123


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/427fdd47
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/427fdd47
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/427fdd47

Branch: refs/heads/cassandra-2.1
Commit: 427fdd476a631b3850ecd643f71155a5b7dd2bf7
Parents: d5a9c92
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:03:56 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:03:56 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/427fdd47/doc/cql3/CQL.textile
--



[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread tylerhobbs
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1491f75d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1491f75d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1491f75d

Branch: refs/heads/cassandra-2.1
Commit: 1491f75dd6313b2f17450e31184947d447ef6808
Parents: e786101 427fdd4
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:05:52 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:05:52 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1491f75d/doc/cql3/CQL.textile
--



[3/3] git commit: Merge branch 'cassandra-2.1' into trunk

2014-05-01 Thread tylerhobbs
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f74063c6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f74063c6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f74063c6

Branch: refs/heads/trunk
Commit: f74063c6eabb809bb59a2a8979d741e707d32797
Parents: d8490cc 1491f75
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:06:31 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:06:31 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--




[1/3] git commit: Document lack of order guarantees for BATCH statements

2014-05-01 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/trunk d8490ccea -> f74063c6e


Document lack of order guarantees for BATCH statements

Patch by Tyler Hobbs, reviewed by Sylvain Lebresne and Alex Popescu for
CASSANDRA-7123


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/427fdd47
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/427fdd47
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/427fdd47

Branch: refs/heads/trunk
Commit: 427fdd476a631b3850ecd643f71155a5b7dd2bf7
Parents: d5a9c92
Author: Tyler Hobbs ty...@datastax.com
Authored: Thu May 1 13:03:56 2014 -0500
Committer: Tyler Hobbs ty...@datastax.com
Committed: Thu May 1 13:03:56 2014 -0500

--
 doc/cql3/CQL.textile | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/427fdd47/doc/cql3/CQL.textile
--
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index f6208bf..bedd189 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -614,7 +614,11 @@ The @BATCH@ statement group multiple modification 
statements (insertions/updates
 # It saves network round-trips between the client and the server (and 
sometimes between the server coordinator and the replicas) when batching 
multiple updates.
 # All updates in a @BATCH@ belonging to a given partition key are performed in 
isolation.
 # By default, all operations in the batch are performed atomically.  See the 
notes on @UNLOGGED@:#unloggedBatch for more details.
-Note however that the @BATCH@ statement only allows @UPDATE@, @INSERT@ and 
@DELETE@ statements and is _not_ a full analogue for SQL transactions.
+
+Note that:
+* @BATCH@ statements may only contain @UPDATE@, @INSERT@ and @DELETE@ 
statements.
+* Batches are _not_ a full analogue for SQL transactions.
+* If a timestamp is not specified for each operation, then all operations will 
be applied with the same timestamp. Due to Cassandra's conflict resolution 
procedure in the case of timestamp 
ties:http://wiki.apache.org/cassandra/FAQ#clocktie, operations may be applied 
in an order that is different from the order they are listed in the @BATCH@ 
statement. To force a particular operation ordering, you must specify 
per-operation timestamps.
 
 h4(#unloggedBatch). @UNLOGGED@
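The tie-breaking rule the new doc paragraph references can be sketched as a
last-write-wins merge. This is a toy model, not Cassandra's implementation;
the value comparison on timestamp ties follows the FAQ linked in the patch:

```python
# Toy last-write-wins reconciliation of two versions of the same cell.
# Each version is a (timestamp, value) pair. On a timestamp tie the
# lexically greater value wins (per the linked FAQ), which is why BATCH
# listing order alone cannot force a particular outcome.
def reconcile(a, b):
    if a[0] != b[0]:
        return a if a[0] > b[0] else b   # newer timestamp wins
    return a if a[1] >= b[1] else b      # tie: greater value wins

# Two updates batched with the same timestamp: listing order is irrelevant.
tied = reconcile((5, "alpha"), (5, "beta"))
# With explicit per-operation timestamps, ordering is deterministic.
explicit = reconcile((5, "alpha"), (6, "beta"))
```

Swapping the argument order on a tie yields the same winner, which is the
point of the documentation change: only per-operation timestamps give a
guaranteed ordering.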
 



[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986845#comment-13986845
 ] 

Benedict commented on CASSANDRA-6696:
-

Or (somewhat handwavy, just to give a basic outline of the idea): we could say 
each vnode has its own LCS hierarchy - this is optimal from a read perspective 
- and perhaps have L1 switch to 1 file in size by default (L2 being 10, etc), 
and then for our flush to L0 we write files equivalent in size to one L1 file, 
grouping however many vnodes fit in the flush, and then only merge with the 
individual L1s once the density of the relevant portion of L0 is > ~0.5 per 
vnode

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
 row=sankalp col=sankalp is written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is more than gc grace, it got compacted in Nodes A and B 
 since it got compacted with the actual data. So there is no trace of this row 
 column in node A and B.
 Now in node C, say the original data is in drive1 and tombstone is in drive2. 
 Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and is replaced with a new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
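 The failure sequence above can be reduced to a toy merge model (illustrative 
 only; each node's state is a dict of column -> (kind, timestamp)):

```python
# Toy model of the resurrection scenario: nodes A and B compacted away both
# the data and its tombstone; node C lost only the tombstone with the bad
# drive, leaving the old data cell behind.
def repair(nodes):
    """Anti-entropy as a naive merge: newest cell per column wins everywhere."""
    merged = {}
    for node in nodes:
        for col, (kind, ts) in node.items():
            if col not in merged or ts > merged[col][1]:
                merged[col] = (kind, ts)
    for node in nodes:
        node.update(merged)

node_a, node_b = {}, {}                # data + tombstone compacted away
node_c = {"sankalp": ("data", 10)}     # tombstone (ts=15) lost with drive2
repair([node_a, node_b, node_c])
# The deleted cell is live again on all three replicas after repair.
```

 Running repair every gc grace does not help, because by then A and B hold no 
 tombstone to win the merge.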
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6575) By default, Cassandra should refuse to start if JNA can't be initialized properly

2014-05-01 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6575:


Attachment: 6575.txt

Simple patch to remove dependency on JNA Native class for creating a ByteBuffer 
from a memory address

 By default, Cassandra should refuse to start if JNA can't be initialized 
 properly
 -

 Key: CASSANDRA-6575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6575
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
Assignee: Clément Lardeur
Priority: Minor
  Labels: lhf
 Fix For: 2.1 beta1

 Attachments: 6575.txt, trunk-6575-v2.patch, trunk-6575-v3.patch, 
 trunk-6575-v4.patch, trunk-6575.patch


 Failure to have JNA working properly is such a common undetected problem that 
 it would be far preferable to have Cassandra refuse to startup unless JNA is 
 initialized. In theory, this should be much less of a problem with Cassandra 
 2.1 due to CASSANDRA-5872, but even there, it might fail due to native lib 
 problems, or might otherwise be misconfigured. A yaml override, such as 
 boot_without_jna would allow the deliberate overriding of this policy.





[jira] [Updated] (CASSANDRA-7126) relocate doesn't remove tokens from system.local when done

2014-05-01 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7126:


Priority: Blocker  (was: Major)

 relocate doesn't remove tokens from system.local when done
 --

 Key: CASSANDRA-7126
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7126
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: Brandon Williams
Priority: Blocker
 Fix For: 2.0.8


 While testing CASSANDRA-2434 I noticed the tokens being relocated aren't 
 being removed from the source node.
 Here is a dtest https://github.com/tjake/cassandra-dtest/tree/taketoken





[jira] [Commented] (CASSANDRA-7126) relocate doesn't remove tokens from system.local when done

2014-05-01 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986879#comment-13986879
 ] 

Brandon Williams commented on CASSANDRA-7126:
-

I went and checked if shuffle has the same problem (it should) and it's broken 
in a similar way: it never deletes things from the 'schedule' even though it 
moves them.  I do see the system.local table flushing though, so I suspect 
these are related.  Ring output also doesn't reflect any changes even though 
the log indicates all the tokens moved.

 relocate doesn't remove tokens from system.local when done
 --

 Key: CASSANDRA-7126
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7126
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: Brandon Williams
 Fix For: 2.0.8


 While testing CASSANDRA-2434 I noticed the tokens being relocated aren't 
 being removed from the source node.
 Here is a dtest https://github.com/tjake/cassandra-dtest/tree/taketoken





[jira] [Created] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brian Lynch (JIRA)
Brian Lynch created CASSANDRA-7132:
--

 Summary: Add a new Snitch for Google Cloud Platform
 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
on the OS.
Reporter: Brian Lynch


In order to correctly identify the rack and datacenter, the snitch needs to 
query the metadata from the host.  I will be attaching a diff to this issue 
shortly with the new snitch file.  
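The metadata lookup described above can be sketched as follows, assuming GCE
reports the zone as a path ending in `<region>-<zone letter>` (e.g.
`.../zones/us-central1-a`); the function name and splitting convention here
are illustrative, not the patch's API:

```python
def dc_and_rack_from_zone(metadata_zone):
    """Derive a (datacenter, rack) pair from a GCE metadata zone path.

    Assumes the last path segment is the availability zone, whose trailing
    "-<letter>" suffix names the rack and whose prefix names the region.
    """
    az = metadata_zone.split("/")[-1]       # e.g. "us-central1-a"
    region, _, rack = az.rpartition("-")    # ("us-central1", "-", "a")
    return region, rack

dc, rack = dc_and_rack_from_zone("projects/1234/zones/us-central1-a")
```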





[jira] [Commented] (CASSANDRA-7126) relocate doesn't remove tokens from system.local when done

2014-05-01 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986886#comment-13986886
 ] 

Brandon Williams commented on CASSANDRA-7126:
-

At least if you do writes and then restart, everything still works, like 
shuffle didn't actually happen.

 relocate doesn't remove tokens from system.local when done
 --

 Key: CASSANDRA-7126
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7126
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: Brandon Williams
Priority: Blocker
 Fix For: 2.0.8


 While testing CASSANDRA-2434 I noticed the tokens being relocated aren't 
 being removed from the source node.
 Here is a dtest https://github.com/tjake/cassandra-dtest/tree/taketoken





[jira] [Commented] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-05-01 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986889#comment-13986889
 ] 

Tyler Hobbs commented on CASSANDRA-6875:


After thinking more closely about splitting up SelectStatement into two 
subclasses for single-column and multi-column restrictions, I'm not 100% 
convinced that's the best path.

For example, suppose you have a query like {{SELECT * FROM foo WHERE key=0 AND 
c1 > 0 AND (c1, c2) < (2, 3)}}.  We could a) require it to be written like 
{{(c1) > (0) AND (c1, c2) < (2, 3)}}, or b) accept that syntax and correctly 
reduce the expressions to a single multi-column slice restriction.  I'm not 
sure that option (b) would be clearer than keeping the restrictions separate.

I can also imagine us supporting something like {{SELECT ... WHERE key=0 AND 
c1=0 AND (c2, c3) > (1, 2)}} in the future.  Of course, we could also require 
this to be written differently ({{(c1, c2, c3) > (0, 1, 2) AND (c1) = (0)}}) or 
reduce it to a single multi-column slice restriction.  I'm just pointing out 
that this may become less clear than simply improving the bounds-building code 
(which I agree is needed).
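The intended semantics of mixing single- and multi-column restrictions can be
checked with plain tuple comparison, which mirrors clustering order (a Python
sketch, not SelectStatement's bounds-building code):

```python
# Rows as (c1, c2) clustering tuples within one partition (key=0).
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3)]

# key=0 AND c1 > 0 AND (c1, c2) < (2, 3): the single-column and the
# multi-column restrictions select one contiguous slice, so reducing them
# to a single multi-column restriction (option b) is semantically sound,
# whatever the readability trade-off in the code.
matches = [r for r in rows if r[0] > 0 and r < (2, 3)]
```

Python's lexicographic tuple comparison stands in for the comparator on
composite clustering keys.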



 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.0.8


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example\:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.
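 The requested semantics amount to a membership test on clustering-key tuples 
 (a sketch of the read, not the engine's implementation):

```python
# Rows of table test as (k, c1, c2) tuples; the partition is k=0.
rows = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]

# SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1, 1))
wanted = {(0, 0), (1, 1)}
result = [r for r in rows if r[0] == 0 and (r[1], r[2]) in wanted]
```

 A single partition read returning both rows is what preserves the 
 partition-level isolation mentioned above.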





[jira] [Updated] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brian Lynch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Lynch updated CASSANDRA-7132:
---

Attachment: trunk-7132.txt

Patch for the new Google Cloud Snitch. 

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
  Labels: patch
 Fix For: 2.0.6, 2.1 beta2

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  





[jira] [Updated] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brian Lynch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Lynch updated CASSANDRA-7132:
---

Priority: Minor  (was: Major)

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
Priority: Minor
  Labels: patch
 Fix For: 2.0.6, 2.1 beta2

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  





[jira] [Created] (CASSANDRA-7133) yield in SlabAllocator$Region.allocate could cause starvation

2014-05-01 Thread Jeremiah Jordan (JIRA)
Jeremiah Jordan created CASSANDRA-7133:
--

 Summary: yield in SlabAllocator$Region.allocate could cause 
starvation
 Key: CASSANDRA-7133
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7133
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremiah Jordan
Assignee: Benedict
Priority: Minor


If a low priority thread inserts data into a table, it is possible the yield in 
org.apache.cassandra.utils.SlabAllocator$Region.allocate() could starve.  We 
already changed this in CASSANDRA-5549 to not yield any longer.  It might be 
good to back port some of those changes to 2.0 if they aren't too large.





[jira] [Updated] (CASSANDRA-7133) yield in SlabAllocator$Region.allocate could cause starvation

2014-05-01 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-7133:


Attachment: 7133.20.txt
7133.12.txt

Patches for 2.0 and 1.2 attached

 yield in SlabAllocator$Region.allocate could cause starvation
 -

 Key: CASSANDRA-7133
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7133
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremiah Jordan
Assignee: Benedict
Priority: Minor
 Attachments: 7133.12.txt, 7133.20.txt


 If a low priority thread inserts data into a table, it is possible the yield 
 in org.apache.cassandra.utils.SlabAllocator$Region.allocate() could starve.  
 We already changed this in CASSANDRA-5549 to not yield any longer.  It might 
 be good to back port some of those changes to 2.0 if they aren't too large.





[jira] [Updated] (CASSANDRA-7133) yield in SlabAllocator$Region.allocate could cause starvation

2014-05-01 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-7133:


Fix Version/s: 2.0.8
   1.2.17

 yield in SlabAllocator$Region.allocate could cause starvation
 -

 Key: CASSANDRA-7133
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7133
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremiah Jordan
Assignee: Benedict
Priority: Minor
 Fix For: 1.2.17, 2.0.8

 Attachments: 7133.12.txt, 7133.20.txt


 If a low priority thread inserts data into a table, it is possible the yield 
 in org.apache.cassandra.utils.SlabAllocator$Region.allocate() could starve.  
 We already changed this in CASSANDRA-5549 to not yield any longer.  It might 
 be good to back port some of those changes to 2.0 if they aren't too large.





[jira] [Updated] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7132:


Reviewer: Brandon Williams

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
Priority: Minor
  Labels: patch
 Fix For: 2.0.6, 2.1 beta2

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  





[4/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5bd6a756
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5bd6a756
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5bd6a756

Branch: refs/heads/trunk
Commit: 5bd6a756e10b4e869f1babb3bb6f9dfb223bd75e
Parents: 1491f75 f2bbd6f
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:14:34 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:14:34 2014 -0500

--
 CHANGES.txt |   4 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 240 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5bd6a756/CHANGES.txt
--
diff --cc CHANGES.txt
index be72ad1,827003b..16fcdfc
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,63 -1,5 +1,67 @@@
 -2.0.8
++2.1.0-rc1
++Merged from 2.0:
+  * Add Google Compute Engine snitch (CASSANDRA-7132)
++
 +2.1.0-beta2
 + * Increase default CL space to 8GB (CASSANDRA-7031)
 + * Add range tombstones to read repair digests (CASSANDRA-6863)
 + * Fix BTree.clear for large updates (CASSANDRA-6943)
 + * Fail write instead of logging a warning when unable to append to CL
 +   (CASSANDRA-6764)
 + * Eliminate possibility of CL segment appearing twice in active list 
 +   (CASSANDRA-6557)
 + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759)
 + * Switch CRC component to Adler and include it for compressed sstables 
 +   (CASSANDRA-4165)
 + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451)
 + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899)
 + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897)
 + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573)
 + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692)
 + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660)
 + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742)
 + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705)
 + * Scrub should not always clear out repaired status (CASSANDRA-5351)
 + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446)
 + * Fix ClassCastException for compact table with composites (CASSANDRA-6738)
 + * Fix potentially repairing with wrong nodes (CASSANDRA-6808)
 + * Change caching option syntax (CASSANDRA-6745)
 + * Fix stress to do proper counter reads (CASSANDRA-6835)
 + * Fix help message for stress counter_write (CASSANDRA-6824)
 + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848)
 + * Add logging levels (minimal, normal or verbose) to stress tool 
(CASSANDRA-6849)
 + * Fix race condition in Batch CLE (CASSANDRA-6860)
 + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774)
 + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781)
 + * Proper compare function for CollectionType (CASSANDRA-6783)
 + * Update native server to Netty 4 (CASSANDRA-6236)
 + * Fix off-by-one error in stress (CASSANDRA-6883)
 + * Make OpOrder AutoCloseable (CASSANDRA-6901)
 + * Remove sync repair JMX interface (CASSANDRA-6900)
 + * Add multiple memory allocation options for memtables (CASSANDRA-6689, 6694)
 + * Remove adjusted op rate from stress output (CASSANDRA-6921)
 + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941)
 + * Serialize batchlog mutations with the version of the target node
 +   (CASSANDRA-6931)
 + * Optimize CounterColumn#reconcile() (CASSANDRA-6953)
 + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869)
 + * Lock counter cells, not partitions (CASSANDRA-6880)
 + * Track presence of legacy counter shards in sstables (CASSANDRA-6888)
 + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912)
 + * Add failure handler to async callback (CASSANDRA-6747)
 + * Fix AE when closing SSTable without releasing reference (CASSANDRA-7000)
 + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924)
 + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024)
 + * Require nodetool rebuild_index to specify index names (CASSANDRA-7038)
 + * fix cassandra stress errors on reads with native protocol (CASSANDRA-7033)
 + * Use OpOrder to guard sstable references for reads (CASSANDRA-6919)
 + * Preemptive opening of compaction result (CASSANDRA-6916)
 + * Multi-threaded scrub/cleanup/upgradesstables (CASSANDRA-5547)
 + * Optimize cellname comparison (CASSANDRA-6934)
 + * Native protocol v3 (CASSANDRA-6855)
 + * Optimize Cell liveness 

[5/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5bd6a756
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5bd6a756
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5bd6a756

Branch: refs/heads/cassandra-2.1
Commit: 5bd6a756e10b4e869f1babb3bb6f9dfb223bd75e
Parents: 1491f75 f2bbd6f
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:14:34 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:14:34 2014 -0500

--
 CHANGES.txt |   4 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 240 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5bd6a756/CHANGES.txt
--
diff --cc CHANGES.txt
index be72ad1,827003b..16fcdfc
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,63 -1,5 +1,67 @@@
 -2.0.8
++2.1.0-rc1
++Merged from 2.0:
+  * Add Google Compute Engine snitch (CASSANDRA-7132)
++
 +2.1.0-beta2
 + * Increase default CL space to 8GB (CASSANDRA-7031)
 + * Add range tombstones to read repair digests (CASSANDRA-6863)
 + * Fix BTree.clear for large updates (CASSANDRA-6943)
 + * Fail write instead of logging a warning when unable to append to CL
 +   (CASSANDRA-6764)
 + * Eliminate possibility of CL segment appearing twice in active list 
 +   (CASSANDRA-6557)
 + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759)
 + * Switch CRC component to Adler and include it for compressed sstables 
 +   (CASSANDRA-4165)
 + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451)
 + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899)
 + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897)
 + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573)
 + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692)
 + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660)
 + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742)
 + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705)
 + * Scrub should not always clear out repaired status (CASSANDRA-5351)
 + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446)
 + * Fix ClassCastException for compact table with composites (CASSANDRA-6738)
 + * Fix potentially repairing with wrong nodes (CASSANDRA-6808)
 + * Change caching option syntax (CASSANDRA-6745)
 + * Fix stress to do proper counter reads (CASSANDRA-6835)
 + * Fix help message for stress counter_write (CASSANDRA-6824)
 + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848)
 + * Add logging levels (minimal, normal or verbose) to stress tool 
(CASSANDRA-6849)
 + * Fix race condition in Batch CLE (CASSANDRA-6860)
 + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774)
 + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781)
 + * Proper compare function for CollectionType (CASSANDRA-6783)
 + * Update native server to Netty 4 (CASSANDRA-6236)
 + * Fix off-by-one error in stress (CASSANDRA-6883)
 + * Make OpOrder AutoCloseable (CASSANDRA-6901)
 + * Remove sync repair JMX interface (CASSANDRA-6900)
 + * Add multiple memory allocation options for memtables (CASSANDRA-6689, 6694)
 + * Remove adjusted op rate from stress output (CASSANDRA-6921)
 + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941)
 + * Serialize batchlog mutations with the version of the target node
 +   (CASSANDRA-6931)
 + * Optimize CounterColumn#reconcile() (CASSANDRA-6953)
 + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869)
 + * Lock counter cells, not partitions (CASSANDRA-6880)
 + * Track presence of legacy counter shards in sstables (CASSANDRA-6888)
 + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912)
 + * Add failure handler to async callback (CASSANDRA-6747)
 + * Fix AE when closing SSTable without releasing reference (CASSANDRA-7000)
 + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924)
 + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024)
 + * Require nodetool rebuild_index to specify index names (CASSANDRA-7038)
 + * fix cassandra stress errors on reads with native protocol (CASSANDRA-7033)
 + * Use OpOrder to guard sstable references for reads (CASSANDRA-6919)
 + * Preemptive opening of compaction result (CASSANDRA-6916)
 + * Multi-threaded scrub/cleanup/upgradesstables (CASSANDRA-5547)
 + * Optimize cellname comparison (CASSANDRA-6934)
 + * Native protocol v3 (CASSANDRA-6855)
 + * Optimize Cell 

[6/6] git commit: Merge branch 'cassandra-2.1' into trunk

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk

Conflicts:
CHANGES.txt


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2579b92
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2579b92
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2579b92

Branch: refs/heads/trunk
Commit: c2579b92bf2e721c099720a27b5d9e56be66e49c
Parents: f74063c 5bd6a75
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:19:15 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:19:15 2014 -0500

--
 CHANGES.txt |   3 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 239 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2579b92/CHANGES.txt
--
diff --cc CHANGES.txt
index d98174b,16fcdfc..0c32f3c
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,11 -1,6 +1,14 @@@
 +3.0
 + * Move sstable RandomAccessReader to nio2, which allows using the
 +   FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050)
 + * Remove CQL2 (CASSANDRA-5918)
 + * Add Thrift get_multi_slice call (CASSANDRA-6757)
 + * Optimize fetching multiple cells by name (CASSANDRA-6933)
 + * Allow compilation in java 8 (CASSANDRA-7208)
 +
+ 2.1.0-rc1
+ Merged from 2.0:
+  * Add Google Compute Engine snitch (CASSANDRA-7132)
  
  2.1.0-beta2
   * Increase default CL space to 8GB (CASSANDRA-7031)



[2/6] git commit: Add Google Compute Engine snitch.

2014-05-01 Thread brandonwilliams
Add Google Compute Engine snitch.

Patch by Brian Lynch, reviewed by brandonwilliams for CASSANDRA-7132


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f2bbd6fc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f2bbd6fc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f2bbd6fc

Branch: refs/heads/cassandra-2.1
Commit: f2bbd6fcc670b9cb2eecbe0d2964d7e4b785e543
Parents: 427fdd4
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:12:55 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:12:55 2014 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 237 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e25e71f..827003b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.8
+ * Add Google Compute Engine snitch (CASSANDRA-7132)
  * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
  * Set JMX RMI port to 7199 (CASSANDRA-7087)
  * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java 
b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
new file mode 100644
index 000..05fbea2
--- /dev/null
+++ b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.locator;
+
+import java.io.DataInputStream;
+import java.io.FilterInputStream;
+import java.io.IOException;
+import java.net.HttpURLConnection;
+import java.net.InetAddress;
+import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.EndpointState;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.io.util.FileUtils;
+import org.apache.cassandra.utils.FBUtilities;
+
+/**
+ * A snitch that assumes a GCE region is a DC and a GCE availability_zone
+ * is a rack. This information is available in the config for the node.
+ */
+public class GoogleCloudSnitch extends AbstractNetworkTopologySnitch
+{
+    protected static final Logger logger = LoggerFactory.getLogger(GoogleCloudSnitch.class);
+    protected static final String ZONE_NAME_QUERY_URL = "http://metadata.google.internal/computeMetadata/v1/instance/zone";
+    private static final String DEFAULT_DC = "UNKNOWN-DC";
+    private static final String DEFAULT_RACK = "UNKNOWN-RACK";
+    private Map<InetAddress, Map<String, String>> savedEndpoints;
+    protected String gceZone;
+    protected String gceRegion;
+
+    public GoogleCloudSnitch() throws IOException, ConfigurationException
+    {
+        String response = gceApiCall(ZONE_NAME_QUERY_URL);
+        String[] splits = response.split("/");
+        String az = splits[splits.length - 1];
+
+        // Split "us-central1-a" or "asia-east1-a" into "us-central1"/"a" and "asia-east1"/"a".
+        splits = az.split("-");
+        gceZone = splits[splits.length - 1];
+
+        int lastRegionIndex = az.lastIndexOf("-");
+        gceRegion = az.substring(0, lastRegionIndex);
+
+        String datacenterSuffix = (new SnitchProperties()).get("dc_suffix", "");
+        gceRegion = gceRegion.concat(datacenterSuffix);
+        logger.info("GCESnitch using region: {}, zone: {}.", gceRegion, gceZone);

[jira] [Updated] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7132:


Fix Version/s: (was: 2.1 beta2)
   (was: 2.0.6)
   2.1 rc1
   2.0.8
 Assignee: Brian Lynch
   Labels:   (was: patch)

Committed, thanks!

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
Assignee: Brian Lynch
Priority: Minor
 Fix For: 2.0.8, 2.1 rc1

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[3/6] git commit: Add Google Compute Engine snitch.

2014-05-01 Thread brandonwilliams
Add Google Compute Engine snitch.

Patch by Brian Lynch, reviewed by brandonwilliams for CASSANDRA-7132


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f2bbd6fc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f2bbd6fc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f2bbd6fc

Branch: refs/heads/trunk
Commit: f2bbd6fcc670b9cb2eecbe0d2964d7e4b785e543
Parents: 427fdd4
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:12:55 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:12:55 2014 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 237 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e25e71f..827003b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.8
+ * Add Google Compute Engine snitch (CASSANDRA-7132)
  * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
  * Set JMX RMI port to 7199 (CASSANDRA-7087)
  * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java 
b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
new file mode 100644
index 000..05fbea2
--- /dev/null
+++ b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.locator;
+
+import java.io.DataInputStream;
+import java.io.FilterInputStream;
+import java.io.IOException;
+import java.net.HttpURLConnection;
+import java.net.InetAddress;
+import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.EndpointState;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.io.util.FileUtils;
+import org.apache.cassandra.utils.FBUtilities;
+
+/**
+ * A snitch that assumes a GCE region is a DC and a GCE availability_zone
+ * is a rack. This information is available in the config for the node.
+ */
+public class GoogleCloudSnitch extends AbstractNetworkTopologySnitch
+{
+    protected static final Logger logger = LoggerFactory.getLogger(GoogleCloudSnitch.class);
+    protected static final String ZONE_NAME_QUERY_URL = "http://metadata.google.internal/computeMetadata/v1/instance/zone";
+    private static final String DEFAULT_DC = "UNKNOWN-DC";
+    private static final String DEFAULT_RACK = "UNKNOWN-RACK";
+    private Map<InetAddress, Map<String, String>> savedEndpoints;
+    protected String gceZone;
+    protected String gceRegion;
+
+    public GoogleCloudSnitch() throws IOException, ConfigurationException
+    {
+        String response = gceApiCall(ZONE_NAME_QUERY_URL);
+        String[] splits = response.split("/");
+        String az = splits[splits.length - 1];
+
+        // Split "us-central1-a" or "asia-east1-a" into "us-central1"/"a" and "asia-east1"/"a".
+        splits = az.split("-");
+        gceZone = splits[splits.length - 1];
+
+        int lastRegionIndex = az.lastIndexOf("-");
+        gceRegion = az.substring(0, lastRegionIndex);
+
+        String datacenterSuffix = (new SnitchProperties()).get("dc_suffix", "");
+        gceRegion = gceRegion.concat(datacenterSuffix);
+        logger.info("GCESnitch using region: {}, zone: {}.", gceRegion, gceZone);
+    }
+
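The string handling in the constructor above can be sketched in isolation. This is a Python illustration of the parsing only, not Cassandra code; the `projects/.../zones/...` response shape is the GCE metadata format the snitch assumes:

```python
def parse_gce_zone(metadata_response):
    """Mimic GoogleCloudSnitch's parsing of the GCE metadata 'zone' value.

    The metadata server returns a path like 'projects/1234/zones/us-central1-a';
    the last path segment is the availability zone, whose final '-' separates
    the region (used as the DC) from the zone letter (used as the rack).
    """
    az = metadata_response.split("/")[-1]   # e.g. "us-central1-a"
    zone = az.split("-")[-1]                # "a"  -> rack
    region = az[:az.rindex("-")]            # "us-central1" -> DC
    return region, zone

print(parse_gce_zone("projects/1234/zones/us-central1-a"))  # ('us-central1', 'a')
```

Note that a `dc_suffix` from SnitchProperties, when present, is simply concatenated onto the region string afterwards.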

[1/6] git commit: Add Google Compute Engine snitch.

2014-05-01 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 427fdd476 - f2bbd6fcc
  refs/heads/cassandra-2.1 1491f75dd - 5bd6a756e
  refs/heads/trunk f74063c6e - c2579b92b


Add Google Compute Engine snitch.

Patch by Brian Lynch, reviewed by brandonwilliams for CASSANDRA-7132


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f2bbd6fc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f2bbd6fc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f2bbd6fc

Branch: refs/heads/cassandra-2.0
Commit: f2bbd6fcc670b9cb2eecbe0d2964d7e4b785e543
Parents: 427fdd4
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 15:12:55 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 15:12:55 2014 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/locator/GoogleCloudSnitch.java| 128 +++
 .../locator/GoogleCloudSnitchTest.java  | 108 
 3 files changed, 237 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e25e71f..827003b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.8
+ * Add Google Compute Engine snitch (CASSANDRA-7132)
  * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
  * Set JMX RMI port to 7199 (CASSANDRA-7087)
  * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f2bbd6fc/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java 
b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
new file mode 100644
index 000..05fbea2
--- /dev/null
+++ b/src/java/org/apache/cassandra/locator/GoogleCloudSnitch.java
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.locator;
+
+import java.io.DataInputStream;
+import java.io.FilterInputStream;
+import java.io.IOException;
+import java.net.HttpURLConnection;
+import java.net.InetAddress;
+import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.apache.cassandra.db.SystemKeyspace;
+import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.EndpointState;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.io.util.FileUtils;
+import org.apache.cassandra.utils.FBUtilities;
+
+/**
+ * A snitch that assumes a GCE region is a DC and a GCE availability_zone
+ * is a rack. This information is available in the config for the node.
+ */
+public class GoogleCloudSnitch extends AbstractNetworkTopologySnitch
+{
+    protected static final Logger logger = LoggerFactory.getLogger(GoogleCloudSnitch.class);
+    protected static final String ZONE_NAME_QUERY_URL = "http://metadata.google.internal/computeMetadata/v1/instance/zone";
+    private static final String DEFAULT_DC = "UNKNOWN-DC";
+    private static final String DEFAULT_RACK = "UNKNOWN-RACK";
+    private Map<InetAddress, Map<String, String>> savedEndpoints;
+    protected String gceZone;
+    protected String gceRegion;
+
+    public GoogleCloudSnitch() throws IOException, ConfigurationException
+    {
+        String response = gceApiCall(ZONE_NAME_QUERY_URL);
+        String[] splits = response.split("/");
+        String az = splits[splits.length - 1];
+
+        // Split "us-central1-a" or "asia-east1-a" into "us-central1"/"a" and "asia-east1"/"a".
+        splits = az.split("-");
+        gceZone = splits[splits.length - 1];
+
+        int lastRegionIndex = az.lastIndexOf("-");
+        gceRegion = az.substring(0, lastRegionIndex);
+
+        String datacenterSuffix = (new SnitchProperties()).get("dc_suffix", "");

[jira] [Updated] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-7132:


Reproduced In:   (was: 2.0.6, 2.1 beta2)

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
Priority: Minor
 Fix For: 2.0.8, 2.1 rc1

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  





[jira] [Commented] (CASSANDRA-7132) Add a new Snitch for Google Cloud Platform

2014-05-01 Thread Brian Lynch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13986955#comment-13986955
 ] 

Brian Lynch commented on CASSANDRA-7132:


Fantastic!  Thanks, Brandon!

 Add a new Snitch for Google Cloud Platform
 --

 Key: CASSANDRA-7132
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7132
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Virtual Machine on Google Cloud Platform.  Not dependent 
 on the OS.
Reporter: Brian Lynch
Assignee: Brian Lynch
Priority: Minor
 Fix For: 2.0.8, 2.1 rc1

 Attachments: trunk-7132.txt


 In order to correctly identify the rack and datacenter, the snitch needs to 
 query the metadata from the host.  I will be attaching a diff to this issue 
 shortly with the new snitch file.  





[jira] [Commented] (CASSANDRA-6861) Optimise our Netty 4 integration

2014-05-01 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987002#comment-13987002
 ] 

Benedict commented on CASSANDRA-6861:
-

It looks like we're still allocating a lot of unpooled buffers, even though 
they are now (mostly) being released. We'd like to allocate them from a pool if 
possible.

It seems CBUtil.readValue() can also still allocate unpooled buffers that won't 
have release called on them (though likely that code path isn't being exercised 
right now). Since this one just returns an NIO buffer, it's probably easiest 
to simply allocate a ByteBuffer and read the bytes into it.
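The pooling idea being discussed can be illustrated abstractly. This is a toy Python sketch of acquire/release reuse, not Netty's actual ByteBuf/PooledByteBufAllocator API:

```python
import collections

class BufferPool:
    """Toy buffer pool: reuse fixed-size bytearrays instead of allocating a
    fresh one per message (the per-message garbage a pooled allocator avoids).
    A buffer must be released back to the pool once its message completes."""

    def __init__(self, size=4096):
        self.size = size
        self.free = collections.deque()

    def acquire(self):
        # Reuse a released buffer if one is available, else allocate.
        return self.free.popleft() if self.free else bytearray(self.size)

    def release(self, buf):
        self.free.append(buf)

pool = BufferPool()
a = pool.acquire()
pool.release(a)       # message done: hand the buffer back
b = pool.acquire()
print(a is b)         # True: the buffer was reused, not reallocated
```

A buffer that never has release() called on it (the CBUtil.readValue() case above) is exactly what defeats this scheme: it falls out of the pool and becomes garbage.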

 Optimise our Netty 4 integration
 

 Key: CASSANDRA-6861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6861
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta2


 Now we've upgraded to Netty 4, we're generating a lot of garbage that could 
 be avoided, so we should probably stop that. Should be reasonably easy to 
 hook into Netty's pooled buffers, returning them to the pool once a given 
 message is completed.





[jira] [Updated] (CASSANDRA-6861) Optimise our Netty 4 integration

2014-05-01 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6861:


Reviewer: Benedict

 Optimise our Netty 4 integration
 

 Key: CASSANDRA-6861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6861
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: T Jake Luciani
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta2


 Now we've upgraded to Netty 4, we're generating a lot of garbage that could 
 be avoided, so we should probably stop that. Should be reasonably easy to 
 hook into Netty's pooled buffers, returning them to the pool once a given 
 message is completed.





[jira] [Commented] (CASSANDRA-3569) Failure detector downs should not break streams

2014-05-01 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987023#comment-13987023
 ] 

Joshua McKenzie commented on CASSANDRA-3569:


NIO doesn't give us the integer file descriptor for the underlying sockets 
without cracking open some internals and using reflection to rip out private 
member variables.  I don't think this is the right way to go - it'll be messy 
(reflecting out 2 private members from SocketAdapter down), java 
platform-dependent, and brittle.

As for using sysctl or modifying the registry (Windows) on cassandra start - 
that isn't the least surprising thing we could do as there would be side 
effects to other processes running on these machines.  Do we have a precedent 
at this time for changing global system configuration settings on startup of 
the daemon or during rpm install?

Maybe we could add an optional parameter in the yaml for tcp_keepalive_interval 
and selectively set it if users opt in.  That still doesn't address our need 
for a default state with improved behavior, though.
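For concreteness, an opt-in per-socket keep-alive (as opposed to changing sysctls globally) looks roughly like this. A hedged Python sketch, not a proposed patch; the `TCP_KEEP*` options are Linux-specific, hence the `hasattr` guards:

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keep-alive for one socket, overriding the (very high)
    system-wide defaults without touching net.ipv4.tcp_keepalive_* sysctls."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-only fine tuning; silently skipped on platforms lacking the option.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock

s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))  # nonzero once enabled
s.close()
```

Per-socket options like these avoid the side effects on other processes that a global sysctl or registry change would have.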



 Failure detector downs should not break streams
 ---

 Key: CASSANDRA-3569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3569
 Project: Cassandra
  Issue Type: New Feature
Reporter: Peter Schuller
Assignee: Joshua McKenzie
 Fix For: 2.1 rc1

 Attachments: 3569-2.0.txt


 CASSANDRA-2433 introduced this behavior just to keep repairs from sitting 
 there waiting forever. In my opinion the correct fix to that problem is to 
 use TCP keep-alive. Unfortunately the TCP keep-alive period is insanely high 
 by default on a modern Linux, so just doing that is not entirely good either.
 But using the failure detector seems nonsensical to me. The TCP transport is a 
 communication method that we know is used for long-running processes that you 
 don't want to be incorrectly killed for no good reason, yet we are using a 
 failure detector, tuned for deciding when not to send latency-sensitive 
 requests to nodes, to actively kill a working connection.
 So, rather than add complexity with protocol-based ping/pongs and such, I 
 propose that we simply use TCP keep-alive for streaming connections and 
 instruct operators of production clusters to tweak 
 net.ipv4.tcp_keepalive_{probes,intvl} as appropriate (or whatever the 
 equivalent is on their OS).
 I can submit the patch. Awaiting opinions.





[jira] [Commented] (CASSANDRA-5964) cqlsh raises a ValueError when connecting to Cassandra running in Eclipse

2014-05-01 Thread Alexey Filippov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987021#comment-13987021
 ] 

Alexey Filippov commented on CASSANDRA-5964:


That's interesting: the HEAD at the moment of writing 
(c2579b92bf2e721c099720a27b5d9e56be66e49c):
{code}
$ ./cqlsh
Traceback (most recent call last):
  File ./cqlsh, line 1855, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 1841, in main
ssl=options.ssl)
  File ./cqlsh, line 490, in __init__
self.get_connection_versions()
  File ./cqlsh, line 578, in get_connection_versions
self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
1)[0].split('.')[:3]))
ValueError: invalid literal for int() with base 10: 'Unknown'
{code}

So it does not appear to be fixed in HEAD or, it seems, 2.1.
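The failing expression can be exercised in isolation. An illustrative sketch of the parse and an obvious guard, not the actual cqlsh fix:

```python
def parse_cass_version(build):
    """Parse a Cassandra build string like '2.1.0-beta2' into an int tuple,
    falling back to None for unparseable values such as 'Unknown' (which is
    what system.local reports when Cassandra is run from Eclipse)."""
    try:
        return tuple(map(int, build.split('-', 1)[0].split('.')[:3]))
    except ValueError:
        return None

print(parse_cass_version('2.1.0-beta2'))  # (2, 1, 0)
print(parse_cass_version('Unknown'))      # None
```

The unguarded version of that same expression is what raises the ValueError in the traceback above.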

 cqlsh raises a ValueError when connecting to Cassandra running in Eclipse
 -

 Key: CASSANDRA-5964
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5964
 Project: Cassandra
  Issue Type: Bug
Reporter: Greg DeAngelis
Assignee: Dave Brosius
Priority: Minor
 Fix For: 2.0.1

 Attachments: 5964.txt


 The release_version is set to 'Unknown' in system.local so the version 
 parsing logic fails.
 Traceback (most recent call last):
   File ./cqlsh, line 2027, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2013, in main
 display_float_precision=options.float_precision)
   File ./cqlsh, line 486, in __init__
 self.get_connection_versions()
   File ./cqlsh, line 580, in get_connection_versions
 self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
 1)[0].split('.')[:3]))
 ValueError: invalid literal for int() with base 10: 'Unknown'





[jira] [Commented] (CASSANDRA-3569) Failure detector downs should not break streams

2014-05-01 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987027#comment-13987027
 ] 

Brandon Williams commented on CASSANDRA-3569:
-

bq. Do we have a precedent at this time for changing global system 
configuration settings on startup of the daemon or during rpm install?

We don't ship rpms any longer, but on debian we already modify sysctl for other 
things in debian/cassandra-sysctl.conf

 Failure detector downs should not break streams
 ---

 Key: CASSANDRA-3569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3569
 Project: Cassandra
  Issue Type: New Feature
Reporter: Peter Schuller
Assignee: Joshua McKenzie
 Fix For: 2.1 rc1

 Attachments: 3569-2.0.txt


 CASSANDRA-2433 introduced this behavior just to keep repairs from sitting 
 there waiting forever. In my opinion the correct fix to that problem is to 
 use TCP keep-alive. Unfortunately the TCP keep-alive period is insanely high 
 by default on a modern Linux, so just doing that is not entirely good either.
 But using the failure detector seems nonsensical to me. The TCP transport is a 
 communication method that we know is used for long-running processes that you 
 don't want to be incorrectly killed for no good reason, yet we are using a 
 failure detector, tuned for deciding when not to send latency-sensitive 
 requests to nodes, to actively kill a working connection.
 So, rather than add complexity with protocol-based ping/pongs and such, I 
 propose that we simply use TCP keep-alive for streaming connections and 
 instruct operators of production clusters to tweak 
 net.ipv4.tcp_keepalive_{probes,intvl} as appropriate (or whatever the 
 equivalent is on their OS).
 I can submit the patch. Awaiting opinions.





[jira] [Comment Edited] (CASSANDRA-5964) cqlsh raises a ValueError when connecting to Cassandra running in Eclipse

2014-05-01 Thread Alexey Filippov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987021#comment-13987021
 ] 

Alexey Filippov edited comment on CASSANDRA-5964 at 5/1/14 9:42 PM:


That's interesting, the HEAD at the moment of writing 
([c2579b92|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=c2579b92bf2e721c099720a27b5d9e56be66e49c]):
{code}
$ ./cqlsh
Traceback (most recent call last):
  File ./cqlsh, line 1855, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 1841, in main
ssl=options.ssl)
  File ./cqlsh, line 490, in __init__
self.get_connection_versions()
  File ./cqlsh, line 578, in get_connection_versions
self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
1)[0].split('.')[:3]))
ValueError: invalid literal for int() with base 10: 'Unknown'
{code}

So it does not appear to be fixed in HEAD or, it seems, 2.1.


was (Author: alf239):
That's interesting, the HEAD at the moment of writing 
(c2579b92bf2e721c099720a27b5d9e56be66e49c):
{code}
$ ./cqlsh
Traceback (most recent call last):
  File ./cqlsh, line 1855, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 1841, in main
ssl=options.ssl)
  File ./cqlsh, line 490, in __init__
self.get_connection_versions()
  File ./cqlsh, line 578, in get_connection_versions
self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
1)[0].split('.')[:3]))
ValueError: invalid literal for int() with base 10: 'Unknown'
{code}

So it does not appear to be fixed in HEAD or, it seems, 2.1.

 cqlsh raises a ValueError when connecting to Cassandra running in Eclipse
 -

 Key: CASSANDRA-5964
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5964
 Project: Cassandra
  Issue Type: Bug
Reporter: Greg DeAngelis
Assignee: Dave Brosius
Priority: Minor
 Fix For: 2.0.1

 Attachments: 5964.txt


 The release_version is set to 'Unknown' in system.local so the version 
 parsing logic fails.
 Traceback (most recent call last):
   File ./cqlsh, line 2027, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2013, in main
 display_float_precision=options.float_precision)
   File ./cqlsh, line 486, in __init__
 self.get_connection_versions()
   File ./cqlsh, line 580, in get_connection_versions
 self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
 1)[0].split('.')[:3]))
 ValueError: invalid literal for int() with base 10: 'Unknown'





[jira] [Comment Edited] (CASSANDRA-5964) cqlsh raises a ValueError when connecting to Cassandra running in Eclipse

2014-05-01 Thread Alexey Filippov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987021#comment-13987021
 ] 

Alexey Filippov edited comment on CASSANDRA-5964 at 5/1/14 9:42 PM:


That's interesting: the HEAD at the moment of writing 
([c2579b92|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=c2579b92bf2e721c099720a27b5d9e56be66e49c])
 seems to fail exactly as described in the ticket, the only difference being 
that the line numbering has changed:
{code}
$ ./cqlsh
Traceback (most recent call last):
  File ./cqlsh, line 1855, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 1841, in main
ssl=options.ssl)
  File ./cqlsh, line 490, in __init__
self.get_connection_versions()
  File ./cqlsh, line 578, in get_connection_versions
self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
1)[0].split('.')[:3]))
ValueError: invalid literal for int() with base 10: 'Unknown'
{code}

So it does not appear to be fixed in HEAD or, it seems, 2.1.


was (Author: alf239):
That's interesting, the HEAD at the moment of writing 
([c2579b92|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=c2579b92bf2e721c099720a27b5d9e56be66e49c]):
{code}
$ ./cqlsh
Traceback (most recent call last):
  File ./cqlsh, line 1855, in module
main(*read_options(sys.argv[1:], os.environ))
  File ./cqlsh, line 1841, in main
ssl=options.ssl)
  File ./cqlsh, line 490, in __init__
self.get_connection_versions()
  File ./cqlsh, line 578, in get_connection_versions
self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
1)[0].split('.')[:3]))
ValueError: invalid literal for int() with base 10: 'Unknown'
{code}

So it does not appear to be fixed in HEAD or, it seems, 2.1.

 cqlsh raises a ValueError when connecting to Cassandra running in Eclipse
 -

 Key: CASSANDRA-5964
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5964
 Project: Cassandra
  Issue Type: Bug
Reporter: Greg DeAngelis
Assignee: Dave Brosius
Priority: Minor
 Fix For: 2.0.1

 Attachments: 5964.txt


 The release_version is set to 'Unknown' in system.local so the version 
 parsing logic fails.
 Traceback (most recent call last):
   File ./cqlsh, line 2027, in module
 main(*read_options(sys.argv[1:], os.environ))
   File ./cqlsh, line 2013, in main
 display_float_precision=options.float_precision)
   File ./cqlsh, line 486, in __init__
 self.get_connection_versions()
   File ./cqlsh, line 580, in get_connection_versions
 self.cass_ver_tuple = tuple(map(int, vers['build'].split('-', 
 1)[0].split('.')[:3]))
 ValueError: invalid literal for int() with base 10: 'Unknown'





[4/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d66a70e1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d66a70e1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d66a70e1

Branch: refs/heads/trunk
Commit: d66a70e1f3be1a4f50aa33215d8640234197c547
Parents: 5bd6a75 7426828
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:28 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:28 2014 -0500

--

--




[5/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d66a70e1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d66a70e1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d66a70e1

Branch: refs/heads/cassandra-2.1
Commit: d66a70e1f3be1a4f50aa33215d8640234197c547
Parents: 5bd6a75 7426828
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:28 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:28 2014 -0500

--

--




[1/6] git commit: fix GCS test for 2.0

2014-05-01 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 f2bbd6fcc - 742682833
  refs/heads/cassandra-2.1 5bd6a756e - d66a70e1f
  refs/heads/trunk c2579b92b - 36448a3bb


fix GCS test for 2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/74268283
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/74268283
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/74268283

Branch: refs/heads/cassandra-2.0
Commit: 742682833259e0bd48243211253dd6a2469e95e2
Parents: f2bbd6f
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:17 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:17 2014 -0500

--
 test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/74268283/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
--
diff --git a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
index 70080a8..09f96db 100644
--- a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
@@ -52,7 +52,6 @@ public class GoogleCloudSnitchTest
 {
 SchemaLoader.mkdirs();
 SchemaLoader.cleanup();
-Keyspace.setInitialized();
 StorageService.instance.initServer(0);
 }
 



[2/6] git commit: fix GCS test for 2.0

2014-05-01 Thread brandonwilliams
fix GCS test for 2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/74268283
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/74268283
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/74268283

Branch: refs/heads/cassandra-2.1
Commit: 742682833259e0bd48243211253dd6a2469e95e2
Parents: f2bbd6f
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:17 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:17 2014 -0500

--
 test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/74268283/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
--
diff --git a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
index 70080a8..09f96db 100644
--- a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
@@ -52,7 +52,6 @@ public class GoogleCloudSnitchTest
 {
 SchemaLoader.mkdirs();
 SchemaLoader.cleanup();
-Keyspace.setInitialized();
 StorageService.instance.initServer(0);
 }
 



[6/6] git commit: Merge branch 'cassandra-2.1' into trunk

2014-05-01 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/36448a3b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/36448a3b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/36448a3b

Branch: refs/heads/trunk
Commit: 36448a3bb0ed493c884fac75644f7815795b3864
Parents: c2579b9 d66a70e
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:35 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:35 2014 -0500

--

--




[3/6] git commit: fix GCS test for 2.0

2014-05-01 Thread brandonwilliams
fix GCS test for 2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/74268283
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/74268283
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/74268283

Branch: refs/heads/trunk
Commit: 742682833259e0bd48243211253dd6a2469e95e2
Parents: f2bbd6f
Author: Brandon Williams brandonwilli...@apache.org
Authored: Thu May 1 16:40:17 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Thu May 1 16:40:17 2014 -0500

--
 test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/74268283/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
--
diff --git a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
index 70080a8..09f96db 100644
--- a/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/GoogleCloudSnitchTest.java
@@ -52,7 +52,6 @@ public class GoogleCloudSnitchTest
 {
 SchemaLoader.mkdirs();
 SchemaLoader.cleanup();
-Keyspace.setInitialized();
 StorageService.instance.initServer(0);
 }
 



[jira] [Commented] (CASSANDRA-3569) Failure detector downs should not break streams

2014-05-01 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987044#comment-13987044
 ] 

Joshua McKenzie commented on CASSANDRA-3569:


Sounds like a winner.  It should be trivial to replicate the same with the new 
launch scripts on Windows (CASSANDRA-7001), and I'll pull the FD from repairs 
except during merkle tree computation.

 Failure detector downs should not break streams
 ---

 Key: CASSANDRA-3569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3569
 Project: Cassandra
  Issue Type: New Feature
Reporter: Peter Schuller
Assignee: Joshua McKenzie
 Fix For: 2.1 rc1

 Attachments: 3569-2.0.txt


 CASSANDRA-2433 introduced this behavior just to keep repairs from sitting 
 there waiting forever. In my opinion the correct fix to that problem is to 
 use TCP keep-alive. Unfortunately the TCP keep-alive period is insanely high 
 by default on a modern Linux, so just doing that is not entirely good either.
 But using the failure detector here seems nonsensical to me. We have a 
 communication channel, the TCP transport, that we know is used for 
 long-running processes that you don't want incorrectly killed for no good 
 reason, and yet we are using a failure detector, tuned for detecting when not 
 to send latency-sensitive requests to nodes, to actively kill a working 
 connection.
 So, rather than add complexity with protocol-based ping/pongs and such, I 
 propose that we simply use TCP keep-alive for streaming connections and 
 instruct operators of production clusters to tweak 
 net.ipv4.tcp_keepalive_{probes,intvl} as appropriate (or whatever the 
 equivalent is on their OS).
 I can submit the patch. Awaiting opinions.
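The keep-alive proposal above boils down to setting SO_KEEPALIVE on the streaming socket and letting the OS-level sysctls control probe timing. A minimal standalone sketch of the socket option (illustrative only, not the actual Cassandra patch):

```java
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveSketch
{
    public static void main(String[] args) throws Exception
    {
        // Bind an ephemeral-port server so the example is self-contained.
        try (ServerSocket server = new ServerSocket(0);
             Socket stream = new Socket("localhost", server.getLocalPort()))
        {
            // SO_KEEPALIVE makes the kernel send periodic probes on an idle
            // connection; probe timing comes from OS settings such as
            // net.ipv4.tcp_keepalive_{time,probes,intvl} on Linux.
            stream.setKeepAlive(true);
            System.out.println(stream.getKeepAlive());
        }
    }
}
```

With the socket flag set, a dead peer is eventually detected by the kernel itself, so no application-level failure detector needs to be consulted for the stream.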



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6546) disablethrift results in unclosed file descriptors

2014-05-01 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987046#comment-13987046
 ] 

Brandon Williams commented on CASSANDRA-6546:
-

+1, but we shouldn't call it 0.3.4 unless it's actually an 0.3.4 release.

 disablethrift results in unclosed file descriptors
 --

 Key: CASSANDRA-6546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6546
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Harvey
Assignee: Mikhail Stepura
Priority: Minor
 Fix For: 1.2.17, 2.0.8, 2.1 beta2

 Attachments: CASSANDRA-6546-1.2.patch, CASSANDRA-6546-2.x.patch


 Disabling thrift results in unclosed thrift sockets being left around.
 Steps to reproduce and observe:
 1. Have a handful of clients connect via thrift.
 2. Disable thrift.
 3. Enable thrift, have the clients reconnect.
 4. Observe netstat or lsof, and you'll find a lot of thrift sockets in 
 CLOSE_WAIT state, and they'll never go away.
   * Also verifiable from 
 org.apache.cassandra.metrics:type=Client,name=connectedThriftClients MBean.
 What's extra fun about this is that the leaked sockets still count towards 
 your maximum RPC thread count. As a result, toggling thrift enough times will 
 result in rpc_max_threads sockets stuck in CLOSE_WAIT state, with no new 
 clients able to connect.
 This was reproduced with HSHA. I haven't tried it in sync yet.





[jira] [Updated] (CASSANDRA-6546) disablethrift results in unclosed file descriptors

2014-05-01 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6546:


Fix Version/s: (was: 2.1 beta2)
   2.1 rc1

 disablethrift results in unclosed file descriptors
 --

 Key: CASSANDRA-6546
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6546
 Project: Cassandra
  Issue Type: Bug
Reporter: Jason Harvey
Assignee: Mikhail Stepura
Priority: Minor
 Fix For: 1.2.17, 2.0.8, 2.1 rc1

 Attachments: CASSANDRA-6546-1.2.patch, CASSANDRA-6546-2.x.patch


 Disabling thrift results in unclosed thrift sockets being left around.
 Steps to reproduce and observe:
 1. Have a handful of clients connect via thrift.
 2. Disable thrift.
 3. Enable thrift, have the clients reconnect.
 4. Observe netstat or lsof, and you'll find a lot of thrift sockets in 
 CLOSE_WAIT state, and they'll never go away.
   * Also verifiable from 
 org.apache.cassandra.metrics:type=Client,name=connectedThriftClients MBean.
 What's extra fun about this is that the leaked sockets still count towards 
 your maximum RPC thread count. As a result, toggling thrift enough times will 
 result in rpc_max_threads sockets stuck in CLOSE_WAIT state, with no new 
 clients able to connect.
 This was reproduced with HSHA. I haven't tried it in sync yet.





[jira] [Commented] (CASSANDRA-2527) Add ability to snapshot data as input to hadoop jobs

2014-05-01 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987067#comment-13987067
 ] 

Jeremy Hanna commented on CASSANDRA-2527:
-

Another instance: https://twitter.com/davidjacot/status/461942635381284864

 Add ability to snapshot data as input to hadoop jobs
 

 Key: CASSANDRA-2527
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2527
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jeremy Hanna
Assignee: Joshua McKenzie
Priority: Minor
  Labels: hadoop
 Fix For: 2.1 beta2


 It is desirable to have immutable inputs to hadoop jobs for the duration of 
 the job.  That way re-execution of individual tasks does not alter the output.  
 One way to accomplish this would be to snapshot the data that is used as 
 input to a job.





[04/13] git commit: Cleanup selector's keys.

2014-05-01 Thread mishail
Cleanup selector's keys.

patch by Mikhail Stepura; reviewed by Brandon Williams for CASSANDRA-6546


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b573d0fb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b573d0fb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b573d0fb

Branch: refs/heads/cassandra-2.1
Commit: b573d0fb51a91b053d11d5693eae5a397019d288
Parents: 1052749
Author: Mikhail Stepura mish...@apache.org
Authored: Wed Apr 30 13:39:02 2014 -0700
Committer: Mikhail Stepura mish...@apache.org
Committed: Thu May 1 15:18:00 2014 -0700

--
 src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java | 5 +
 1 file changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b573d0fb/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
--
diff --git a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java 
b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
index 2e2287d..076652f 100644
--- a/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
+++ b/src/java/org/apache/cassandra/thrift/CustomTHsHaServer.java
@@ -195,6 +195,11 @@ public class CustomTHsHaServer extends TNonblockingServer
 {
 try
 {
+//CASSANDRA-6546
+for (SelectionKey key: selector.keys())
+{
+cleanupSelectionkey(key);
+}
 selector.close(); // CASSANDRA-3867
 }
 catch (IOException e)


