[jira] [Commented] (CASSANDRA-6506) counters++ split counter context shards into separate cells

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944813#comment-13944813
 ] 

Sylvain Lebresne commented on CASSANDRA-6506:
-

Are we talking about 6506: Clean up CFMetaData and 6506: Clean up 
Cell/OnDiskAtom? Asking because I can't seem to match the commit hash in your 
comment to the 2nd one in particular (nor was git able to find said commit hash 
on your branch from above). Anyway, if those are the ones, definitely no 
objections to the first one. On the second one, shouldn't we continue passing 
the allocator down to CounterContext until we do this properly? (But I'm good 
with the min/maxTimestamp refactoring part.)

 counters++ split counter context shards into separate cells
 ---

 Key: CASSANDRA-6506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 2.1 beta2


 This change is related to, but somewhat orthogonal to, CASSANDRA-6504.
 Currently all the shard tuples for a given counter cell are packed, in 
 sorted order, in one binary blob. Thus reconciling N counter cells requires 
 allocating, N-1 times, a new byte buffer capable of holding the union of two 
 contexts' shards.
 For writes, in a post-CASSANDRA-6504 world, it also means reading more data 
 than we have to (the complete context, when all we need is the local node's 
 global shard).
 Splitting the context into separate cells, one cell per shard, will help to 
 improve this. We did a similar thing with super columns for CASSANDRA-3237. 
 Incidentally, doing this split is now possible thanks to CASSANDRA-3237.
 Doing this would also simplify counter reconciliation logic. Getting rid of 
 old contexts altogether can be done trivially with upgradesstables.
 In fact, we should be able to put the logical clock into the cell's 
 timestamp, and use regular Cell-s and regular Cell reconcile() logic for the 
 shards, especially once we get rid of the local/remote shards some time in 
 the future (until then we still have to differentiate between 
 global/remote/local shards and their priority rules).
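
 To make the proposal concrete, here is a minimal, illustrative sketch (not 
 the eventual patch) of what reconcile could reduce to once each shard is its 
 own cell and the logical clock lives in the cell's timestamp:
{code}
// Illustrative only: with one shard per cell, reconcile becomes ordinary
// last-write-wins on the logical clock, with no byte-buffer merges and no
// N-1 reallocations of the union of two contexts.
final class ShardCell
{
    final long clock; // logical clock, stored where a regular cell keeps its timestamp
    final long value;

    ShardCell(long clock, long value) { this.clock = clock; this.value = value; }

    static ShardCell reconcile(ShardCell a, ShardCell b)
    {
        return a.clock >= b.clock ? a : b;
    }
}
{code}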



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6575) By default, Cassandra should refuse to start if JNA can't be initialized properly

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944817#comment-13944817
 ] 

Benedict commented on CASSANDRA-6575:
-

Note that this dependency is not on a *functioning* JNA, but only on the JNA 
jar itself, for internal Java functionality. This dependency is removed anyway 
once we get CASSANDRA-6694, so I don't think it is an issue; and if it is, we 
will hopefully fix it shortly regardless.

 By default, Cassandra should refuse to start if JNA can't be initialized 
 properly
 -

 Key: CASSANDRA-6575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6575
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
Assignee: Clément Lardeur
Priority: Minor
  Labels: lhf
 Fix For: 2.1 beta1

 Attachments: trunk-6575-v2.patch, trunk-6575-v3.patch, 
 trunk-6575-v4.patch, trunk-6575.patch


 Failure to have JNA working properly is such a common undetected problem 
 that it would be far preferable to have Cassandra refuse to start unless JNA 
 is initialized. In theory, this should be much less of a problem with 
 Cassandra 2.1 due to CASSANDRA-5872, but even there, it might fail due to 
 native lib problems, or might otherwise be misconfigured. A yaml override, 
 such as boot_without_jna, would allow deliberately overriding this policy.
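
 A minimal sketch of the proposed startup check, assuming a hypothetical 
 boot_without_jna yaml flag surfaced here as a boolean (names are 
 illustrative, not the attached patches):
{code}
// Refuse to start unless JNA initialized, unless the operator deliberately opted out.
final class JnaStartupCheck
{
    static void check(boolean jnaInitialized, boolean bootWithoutJna)
    {
        if (jnaInitialized)
            return;
        if (!bootWithoutJna)
            throw new IllegalStateException("JNA failed to initialize; set boot_without_jna: true to start anyway");
        System.err.println("WARN: JNA unavailable; continuing because boot_without_jna is set");
    }
}
{code}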



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944821#comment-13944821
 ] 

Benedict commented on CASSANDRA-6746:
-

@enigmacurry looks kosher.

It won't eliminate any drop at the start; it just moves the timing of the 
drops (and makes them shorter). I think we should rename this ticket to 
"compaction destroys page cache" and split out a new ticket for [~xedin]'s 
changes ("page cache population is suboptimal"), which may be sensible in 
principle. In practice, moving the WILLNEED into the getSegment() call is 
dangerous, as the segment is used past the initial 64KB, and if we rely only 
on ourselves for read-ahead this could result in very substandard performance 
for larger rows. We also probably want to WILLNEED only the actual size of 
the buffer we expect to read for compressed files. But this was only a 
proof-of-concept, and in principle the idea is probably sound.
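
As a hedged sketch of the "WILLNEED only what we expect to read" idea: the 
native binding below is a hypothetical JNA-style stub (not Cassandra's API), 
POSIX_FADV_WILLNEED is the Linux constant, and linking the native library is 
assumed.
{code}
// Illustrative only: hint the kernel about just the bytes we expect to read,
// instead of WILLNEED-ing a whole 64KB+ segment in getSegment().
final class ReadAheadHint
{
    static final int POSIX_FADV_WILLNEED = 3; // from <fcntl.h> on Linux

    // hypothetical native binding (e.g. registered via JNA); not a real Cassandra method
    static native int posix_fadvise(int fd, long offset, long length, int advice);

    static void hintCompressedChunk(int fd, long chunkOffset, long expectedLength)
    {
        // advise only the compressed buffer we are about to read
        posix_fadvise(fd, chunkOffset, expectedLength, POSIX_FADV_WILLNEED);
    }
}
{code}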

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four-node cluster I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.jsonmetric=interval_op_rateoperation=stress-readsmoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6912) SSTableReader.isReplaced does not allow for safe resource cleanup

2014-03-24 Thread Benedict (JIRA)
Benedict created CASSANDRA-6912:
---

 Summary: SSTableReader.isReplaced does not allow for safe resource 
cleanup
 Key: CASSANDRA-6912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6912
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1 beta2


There are a number of possible race conditions on resource cleanup arising 
from the use of cloneWithNewSummarySamplingLevel, because the replacement 
sstable can itself be replaced/obsoleted while the prior sstable is still 
referenced (this is actually quite easy to hit with compaction, but can happen 
less commonly in other circumstances).
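
An illustrative sketch (hypothetical names, not the linked patch) of why a 
chain of replacements needs reference counting rather than a lone isReplaced 
flag:
{code}
import java.util.concurrent.atomic.AtomicInteger;

// Each reader holds one reference on its replacement; resources are released
// only when every reader in the replacement chain is unreferenced, so an
// intermediate reader being replaced again cannot strand or double-free resources.
final class ReaderRef
{
    final AtomicInteger refs = new AtomicInteger(1);
    volatile ReaderRef replacedBy;

    ReaderRef replaceWith(ReaderRef next)
    {
        next.refs.incrementAndGet(); // the old reader keeps the new one alive
        replacedBy = next;
        return next;
    }

    void unreference()
    {
        if (refs.decrementAndGet() == 0 && replacedBy != null)
            replacedBy.unreference(); // drop our hold on the replacement, chain-safe
    }
}
{code}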



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6912) SSTableReader.isReplaced does not allow for safe resource cleanup

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944873#comment-13944873
 ] 

Benedict commented on CASSANDRA-6912:
-

Patch available 
[here|https://github.com/belliottsmith/cassandra/tree/6912-fix.sstablereader.cloneWith]

 SSTableReader.isReplaced does not allow for safe resource cleanup
 -

 Key: CASSANDRA-6912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6912
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1 beta2


 There are a number of possible race conditions on resource cleanup arising 
 from the use of cloneWithNewSummarySamplingLevel, because the replacement 
 sstable can itself be replaced/obsoleted while the prior sstable is still 
 referenced (this is actually quite easy to hit with compaction, but can 
 happen less commonly in other circumstances).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6912) SSTableReader.isReplaced does not allow for safe resource cleanup

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944873#comment-13944873
 ] 

Benedict edited comment on CASSANDRA-6912 at 3/24/14 9:54 AM:
--

Patch available 
[here|https://github.com/belliottsmith/cassandra/tree/6912-fix.sstablereader.cloneWith]

Also fixes size maintenance in DataTracker, which was almost certainly not 
accounting for the reduction in disk utilisation, as the calculation looks at 
files on disk that have already been replaced by then.


was (Author: benedict):
Patch available 
[here|https://github.com/belliottsmith/cassandra/tree/6912-fix.sstablereader.cloneWith]

 SSTableReader.isReplaced does not allow for safe resource cleanup
 -

 Key: CASSANDRA-6912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6912
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1 beta2


 There are a number of possible race conditions on resource cleanup arising 
 from the use of cloneWithNewSummarySamplingLevel, because the replacement 
 sstable can itself be replaced/obsoleted while the prior sstable is still 
 referenced (this is actually quite easy to hit with compaction, but can 
 happen less commonly in other circumstances).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944876#comment-13944876
 ] 

Sylvain Lebresne commented on CASSANDRA-6911:
-

Oh, that's because the java driver is still on Netty 3. I do plan on migrating 
it to Netty 4 too, but haven't got to it. That being said, since Netty 
completely changed its package name between 3 and 4, I suspect just dropping 
the Netty 3 jar back into the stress lib dir should be good enough. Can you 
give that a shot [~enigmacurry]?

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict

 I compiled stress fresh from cassandra-2.1 and running this command:
 {code}
 cassandra-stress write n=1900 -rate threads=50 -node bdplab
 {code}
 I get the following traceback:
 {code}
 Exception in thread "Thread-49" java.lang.NoClassDefFoundError: org/jboss/netty/channel/ChannelFactory
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:941)
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:889)
 at com.datastax.driver.core.Cluster.init(Cluster.java:88)
 at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:144)
 at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:854)
 at org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:74)
 at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:155)
 at org.apache.cassandra.stress.settings.StressSettings.getSmartThriftClient(StressSettings.java:70)
 at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:275)
 Caused by: java.lang.ClassNotFoundException: org.jboss.netty.channel.ChannelFactory
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 9 more
 {code}
 It seems this was introduced with an updated netty jar in 
 cbf304ebd0436a321753e81231545b705aa8dd23



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6913) Compaction of system keyspaces during startup can cause early loading of non-system keyspaces

2014-03-24 Thread Benedict (JIRA)
Benedict created CASSANDRA-6913:
---

 Summary: Compaction of system keyspaces during startup can cause 
early loading of non-system keyspaces
 Key: CASSANDRA-6913
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6913
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1 beta2


This can then result in an inconsistent CFS state, as cleanup of e.g. 
compaction leftovers does not get reflected in DataTracker. It happens because 
StorageService.getLoad() iterates over and opens every CFS, and it is called 
during compaction.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944885#comment-13944885
 ] 

Sylvain Lebresne commented on CASSANDRA-6694:
-

bq. I see a 25% throughput improvement using offheap_objects as the allocator 
type vs either on/off heap buffers.

That's definitely good to know, but does that suggest that without this, there 
isn't much notable performance difference between on- and off-heap buffers? 
Because if that's the case, I'm still of the opinion that it could be worth 
moving this to 3.0, on the argument that we've moved enough stuff into 2.1 at 
the last minute as it is.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off-Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16 bytes per Cell, plus 4-6 
 bytes per cell on average for btree overhead, for a total of around 20-22 
 bytes. This breaks down as: an 8-byte object header; a 4-byte address (we 
 will do alignment tricks like the VM to address a reasonably large memory 
 space, although this trick is unlikely to last us forever, at which point we 
 will have to bite the bullet and accept a 24-byte per-cell overhead); and a 
 4-byte object reference for maintaining our internal list of allocations, 
 which is unfortunately necessary since we cannot otherwise safely (and 
 cheaply) walk the object graph we allocate, as is necessary for 
 (allocation-)compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.
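
 The arithmetic above, restated as a small self-contained tally (constant 
 names are illustrative only):
{code}
// Back-of-envelope check of the per-cell overhead target described above.
public final class CellOverheadEstimate
{
    static final int OBJECT_HEADER = 8;        // JVM object header
    static final int COMPRESSED_ADDRESS = 4;   // 4-byte off-heap address via alignment tricks
    static final int ALLOCATION_LIST_REF = 4;  // reference kept in the internal allocation list
    static final int BTREE_MIN = 4, BTREE_MAX = 6; // amortised btree overhead per cell

    public static void main(String[] args)
    {
        int fixed = OBJECT_HEADER + COMPRESSED_ADDRESS + ALLOCATION_LIST_REF; // = 16
        System.out.printf("per-cell overhead: %d fixed + %d-%d btree = %d-%d bytes%n",
                          fixed, BTREE_MIN, BTREE_MAX, fixed + BTREE_MIN, fixed + BTREE_MAX);
    }
}
{code}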



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6694) Slightly More Off-Heap Memtables

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944889#comment-13944889
 ] 

Benedict commented on CASSANDRA-6694:
-

This specific patch only permits larger amounts of data to be retained in 
memtables; the only speed-related performance implications are the ones stated 
here, i.e. improved write throughput through reduced write amplification and 
the writing of larger files.

For this workload there's basically no difference between on- and off-heap 
(CASSANDRA-6689) ByteBuffer-backed storage, if that's what you're asking, 
because the on-heap overhead still heavily outweighs the off-heap utilisation. 
This would not be true for workloads with large per-column payloads.

 Slightly More Off-Heap Memtables
 

 Key: CASSANDRA-6694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6694
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2


 The Off-Heap memtables introduced in CASSANDRA-6689 don't go far enough, as 
 the on-heap overhead is still very large. It should not be tremendously 
 difficult to extend these changes so that we allocate entire Cells off-heap, 
 instead of multiple BBs per Cell (with all their associated overhead).
 The goal (if possible) is to reach an overhead of 16 bytes per Cell, plus 4-6 
 bytes per cell on average for btree overhead, for a total of around 20-22 
 bytes. This breaks down as: an 8-byte object header; a 4-byte address (we 
 will do alignment tricks like the VM to address a reasonably large memory 
 space, although this trick is unlikely to last us forever, at which point we 
 will have to bite the bullet and accept a 24-byte per-cell overhead); and a 
 4-byte object reference for maintaining our internal list of allocations, 
 which is unfortunately necessary since we cannot otherwise safely (and 
 cheaply) walk the object graph we allocate, as is necessary for 
 (allocation-)compaction and pointer rewriting.
 The ugliest thing here is going to be implementing the various CellName 
 instances so that they may be backed by native memory OR heap memory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6781) ByteBuffer write() methods for serializing sstables

2014-03-24 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6781:


Attachment: 6781.removeavro.txt

Attaching a simple patch that removes the avro dependency from DataOutputTest.

 ByteBuffer write() methods for serializing sstables
 ---

 Key: CASSANDRA-6781
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6781
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1 beta2

 Attachments: 6781.removeavro.txt


 As mentioned in CASSANDRA-6689, there may be some performance issues with 
 writing sstables from offheap memtables. This is most plausibly caused by 
 the single-byte-at-a-time write path for ByteBuffers, as we use DataOutput, 
 which only accepts byte[].
 I propose extending DataOutput to include ByteBuffer methods, and to use 
 this extended interface for serializing sstables instead.
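
 A minimal sketch of the proposed extension (the interface name is 
 illustrative; whatever lands in the codebase may differ):
{code}
import java.io.DataOutput;
import java.io.IOException;
import java.nio.ByteBuffer;

// Accept whole ByteBuffers so off-heap memtable data can bypass the
// single-byte-at-a-time DataOutput path when serializing sstables.
interface ByteBufferDataOutput extends DataOutput
{
    // write buffer contents from position() to limit(), without copying to byte[]
    void write(ByteBuffer buffer) throws IOException;
}
{code}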



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2014-03-24 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius reopened CASSANDRA-6311:
-


 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.7

 Attachments: 6311-v10.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 
 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom thrift 
 paging.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2014-03-24 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944922#comment-13944922
 ] 

Dave Brosius commented on CASSANDRA-6311:
-

CqlRecordReader.next() doesn't appear to be correct. It assigns values to 
parameters as if that does something.
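
For reference, a self-contained example of why assigning to a parameter has no 
visible effect in Java (references are passed by value):
{code}
final class ParamAssignDemo
{
    static void fill(String key, String value)
    {
        key = "k";   // reassigns the local copy only
        value = "v"; // the caller never sees this
    }

    public static void main(String[] args)
    {
        String k = null, v = null;
        fill(k, v);
        System.out.println(k + ", " + v); // prints "null, null"
    }
}
{code}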

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.7

 Attachments: 6311-v10.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 
 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom thrift 
 paging.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2014-03-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944925#comment-13944925
 ] 

Piotr Kołaczkowski commented on CASSANDRA-6311:
---

Indeed. 

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.7

 Attachments: 6311-v10.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 
 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 Since the latest CQL pagination is done and should be more efficient, we 
 need to update CqlPagingRecordReader to use it instead of the custom thrift 
 paging.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Remove avro usage in DataOutputTest

2014-03-24 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 fbc112d4b -> 5baa72f7f


Remove avro usage in DataOutputTest


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5baa72f7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5baa72f7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5baa72f7

Branch: refs/heads/cassandra-2.1
Commit: 5baa72f7f299b4ec190ddb30b897d5519a6a2a75
Parents: fbc112d
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Mar 24 12:36:05 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Mar 24 12:36:05 2014 +0100

--
 test/unit/org/apache/cassandra/io/util/DataOutputTest.java | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5baa72f7/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
--
diff --git a/test/unit/org/apache/cassandra/io/util/DataOutputTest.java b/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
index 4eeec4d..2a8c7a9 100644
--- a/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
+++ b/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
@@ -31,14 +31,12 @@ import java.io.IOException;
 import java.io.RandomAccessFile;
 import java.nio.ByteBuffer;
 import java.nio.channels.Channels;
-import java.util.Arrays;
 import java.util.Random;
 import java.util.concurrent.ThreadLocalRandom;
 
 import org.junit.Assert;
 import org.junit.Test;
 
-import org.apache.avro.util.ByteBufferInputStream;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
 public class DataOutputTest
@@ -79,7 +77,7 @@ public class DataOutputTest
 ByteBuffer buf = wrap(new byte[345], true);
 DataOutputByteBuffer write = new DataOutputByteBuffer(buf.duplicate());
 DataInput canon = testWrite(write);
-DataInput test = new DataInputStream(new ByteBufferInputStream(Arrays.asList(buf)));
+DataInput test = new DataInputStream(new ByteArrayInputStream(ByteBufferUtil.getArray(buf)));
 testRead(test, canon);
 }
 
@@ -89,7 +87,7 @@ public class DataOutputTest
 ByteBuffer buf = wrap(new byte[345], false);
 DataOutputByteBuffer write = new DataOutputByteBuffer(buf.duplicate());
 DataInput canon = testWrite(write);
-DataInput test = new DataInputStream(new ByteBufferInputStream(Arrays.asList(buf)));
+DataInput test = new DataInputStream(new ByteArrayInputStream(ByteBufferUtil.getArray(buf)));
 testRead(test, canon);
 }
 



[1/2] git commit: Remove avro usage in DataOutputTest

2014-03-24 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 6f24097f6 -> 6838790f8


Remove avro usage in DataOutputTest


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5baa72f7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5baa72f7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5baa72f7

Branch: refs/heads/trunk
Commit: 5baa72f7f299b4ec190ddb30b897d5519a6a2a75
Parents: fbc112d
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Mar 24 12:36:05 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Mar 24 12:36:05 2014 +0100

--
 test/unit/org/apache/cassandra/io/util/DataOutputTest.java | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5baa72f7/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
--
diff --git a/test/unit/org/apache/cassandra/io/util/DataOutputTest.java b/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
index 4eeec4d..2a8c7a9 100644
--- a/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
+++ b/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
@@ -31,14 +31,12 @@ import java.io.IOException;
 import java.io.RandomAccessFile;
 import java.nio.ByteBuffer;
 import java.nio.channels.Channels;
-import java.util.Arrays;
 import java.util.Random;
 import java.util.concurrent.ThreadLocalRandom;
 
 import org.junit.Assert;
 import org.junit.Test;
 
-import org.apache.avro.util.ByteBufferInputStream;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
 public class DataOutputTest
@@ -79,7 +77,7 @@ public class DataOutputTest
 ByteBuffer buf = wrap(new byte[345], true);
 DataOutputByteBuffer write = new DataOutputByteBuffer(buf.duplicate());
 DataInput canon = testWrite(write);
-DataInput test = new DataInputStream(new ByteBufferInputStream(Arrays.asList(buf)));
+DataInput test = new DataInputStream(new ByteArrayInputStream(ByteBufferUtil.getArray(buf)));
 testRead(test, canon);
 }
 
@@ -89,7 +87,7 @@ public class DataOutputTest
 ByteBuffer buf = wrap(new byte[345], false);
 DataOutputByteBuffer write = new DataOutputByteBuffer(buf.duplicate());
 DataInput canon = testWrite(write);
-DataInput test = new DataInputStream(new ByteBufferInputStream(Arrays.asList(buf)));
+DataInput test = new DataInputStream(new ByteArrayInputStream(ByteBufferUtil.getArray(buf)));
 testRead(test, canon);
 }
 



[2/2] git commit: Merge branch 'cassandra-2.1' into trunk

2014-03-24 Thread marcuse
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6838790f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6838790f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6838790f

Branch: refs/heads/trunk
Commit: 6838790f869dfcd773c83fd0165c699e6984540d
Parents: 6f24097 5baa72f
Author: Marcus Eriksson marc...@apache.org
Authored: Mon Mar 24 12:36:20 2014 +0100
Committer: Marcus Eriksson marc...@apache.org
Committed: Mon Mar 24 12:36:20 2014 +0100

--
 test/unit/org/apache/cassandra/io/util/DataOutputTest.java | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
--




[jira] [Commented] (CASSANDRA-6781) ByteBuffer write() methods for serializing sstables

2014-03-24 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944941#comment-13944941
 ] 

Marcus Eriksson commented on CASSANDRA-6781:


committed, thanks

 ByteBuffer write() methods for serializing sstables
 ---

 Key: CASSANDRA-6781
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6781
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1 beta2

 Attachments: 6781.removeavro.txt


 As mentioned in CASSANDRA-6689, there may be some performance issues with 
 writing sstables from offheap memtables. This is most plausibly caused by 
 the single-byte-at-a-time write path for ByteBuffers, as we use DataOutput, 
 which only accepts byte[].
 I propose extending DataOutput to include ByteBuffer methods, and to use 
 this extended interface for serializing sstables instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6913) Compaction of system keyspaces during startup can cause early loading of non-system keyspaces

2014-03-24 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6913:


Attachment: 6913.txt

Simple patch that uses getKeyspaceInstance() in getLoad() so that compaction 
can happen without opening any keyspaces, and introduces assertions that 
prevent non-system keyspaces from being loaded during startup.

An alternative to the getLoad() change would be to disable compaction of 
system keyspaces during startup, but that's probably not necessary, and this 
should be sufficient. The assertions are there to catch such problems earlier 
in the future.
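
A hedged sketch of the getLoad() shape described above, with stub types 
standing in for Keyspace and ColumnFamilyStore (illustrative only, not the 
attached patch):
{code}
import java.util.List;
import java.util.function.Function;

final class LoadCalc
{
    interface Store { long liveDiskSpaceUsed(); }
    interface Ks { List<Store> stores(); }

    // openOnly stands in for a Keyspace.getKeyspaceInstance()-style lookup that
    // returns null for keyspaces that were never opened, rather than opening them.
    static long getLoad(List<String> names, Function<String, Ks> openOnly)
    {
        long bytes = 0;
        for (String name : names)
        {
            Ks ks = openOnly.apply(name);
            if (ks == null)
                continue; // startup compaction no longer forces non-system keyspaces open
            for (Store s : ks.stores())
                bytes += s.liveDiskSpaceUsed();
        }
        return bytes;
    }
}
{code}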

 Compaction of system keyspaces during startup can cause early loading of 
 non-system keyspaces
 -

 Key: CASSANDRA-6913
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6913
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1 beta2

 Attachments: 6913.txt


 This can then result in an inconsistent CFS state, as cleanup of e.g. 
 compaction leftovers does not get reflected in DataTracker. It happens 
 because StorageService.getLoad() iterates over and opens every CFS, and it 
 is called during compaction.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6858) Strange Exception on cassandra node restart

2014-03-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clément Lardeur updated CASSANDRA-6858:
---

Description: 
Strange exception on Cassandra restart. It seems harmless, but any exception 
is not good.
{code}
 WARN [MutationStage:169] 2014-03-14 21:14:52,723 JmxReporter.java (line 397) Error processing org.apache.cassandra.metrics:type=Connection,scope=95.163.80.90,name=CommandPendingTasks
javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=95.163.80.90,name=CommandPendingTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.init(ConnectionMetrics.java:71)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:55)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:639)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:613)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}

  was:
Strange exception on Cassandra restart. It seems harmless, but any exception 
is not good.

 WARN [MutationStage:169] 2014-03-14 21:14:52,723 JmxReporter.java (line 397) Error processing org.apache.cassandra.metrics:type=Connection,scope=95.163.80.90,name=CommandPendingTasks
javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=95.163.80.90,name=CommandPendingTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.init(ConnectionMetrics.java:71)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.init(OutboundTcpConnectionPool.java:55)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:639)
   

git commit: Fix SSTable not released if stream fails before it starts

2014-03-24 Thread yukim
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-1.2 b7bb2fb20 -> 35d4b5de8


Fix SSTable not released if stream fails before it starts

patch by yukim; reviewed by Richard Low for CASSANDRA-6818


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/35d4b5de
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/35d4b5de
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/35d4b5de

Branch: refs/heads/cassandra-1.2
Commit: 35d4b5de8f3ee18ec98b01f3aa0951df0e11e8d2
Parents: b7bb2fb
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:44:19 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:44:19 2014 -0500

--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/streaming/AbstractStreamSession.java   | 2 --
 src/java/org/apache/cassandra/streaming/StreamInSession.java| 5 +
 src/java/org/apache/cassandra/streaming/StreamOutSession.java   | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 960b0e9..fa46c2e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -23,6 +23,7 @@
 * Avoid NPEs when receiving table changes for an unknown keyspace (CASSANDRA-5631)
 * Fix bootstrapping when there is no schema (CASSANDRA-6685)
 * Fix truncating compression metadata (CASSANDRA-6791)
+ * Fix SSTable not released if stream session fails before starts (CASSANDRA-6818)
 
 
 1.2.15

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
index 89fbf5f..f8de827 100644
--- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
@@ -44,8 +44,6 @@ public abstract class AbstractStreamSession implements IEndpointStateChangeSubsc
 this.sessionId = sessionId;
 this.table = table;
 this.callback = callback;
-Gossiper.instance.register(this);
-FailureDetector.instance.registerFailureDetectionEventListener(this);
 }
 
 public UUID getSessionId()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamInSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamInSession.java b/src/java/org/apache/cassandra/streaming/StreamInSession.java
index e83a5b6..f9cdc31 100644
--- a/src/java/org/apache/cassandra/streaming/StreamInSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamInSession.java
@@ -24,6 +24,7 @@ import java.net.Socket;
 import java.util.*;
 import java.util.concurrent.ConcurrentMap;
 
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.io.sstable.SSTableWriter;
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
 import org.cliffc.high_scale_lib.NonBlockingHashSet;
@@ -61,6 +62,8 @@ public class StreamInSession extends AbstractStreamSession
 public static StreamInSession create(InetAddress host, IStreamCallback callback)
 {
 StreamInSession session = new StreamInSession(host, UUIDGen.getTimeUUID(), callback);
+Gossiper.instance.register(session);
+FailureDetector.instance.registerFailureDetectionEventListener(session);
 sessions.put(session.getSessionId(), session);
 return session;
 }
@@ -71,6 +74,8 @@ public class StreamInSession extends AbstractStreamSession
 if (session == null)
 {
 StreamInSession possibleNew = new StreamInSession(host, sessionId, null);
+Gossiper.instance.register(possibleNew);
+FailureDetector.instance.registerFailureDetectionEventListener(possibleNew);
 if ((session = sessions.putIfAbsent(sessionId, possibleNew)) == null)
 session = possibleNew;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamOutSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamOutSession.java b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
index edc07ca..c4d7695 100644
--- a/src/java/org/apache/cassandra/streaming/StreamOutSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
@@ -25,6 

[jira] [Commented] (CASSANDRA-6818) SSTable references not released if stream session fails before it starts

2014-03-24 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944974#comment-13944974
 ] 

Yuki Morishita commented on CASSANDRA-6818:
---

Committed the 1.2 version; it will be released in 1.2.16.

 SSTable references not released if stream session fails before it starts
 

 Key: CASSANDRA-6818
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6818
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Richard Low
Assignee: Yuki Morishita
 Fix For: 1.2.16, 2.0.7, 2.1 beta2

 Attachments: 6818-1.2.txt, 6818-2.0-v2.txt, 6818-2.0.txt


 I observed a large number of 'orphan' SSTables - SSTables that are in the 
 data directory but not loaded by Cassandra - on a 1.1.12 node that had a 
 large stream fail before it started. These orphan files are particularly 
 dangerous because if the node is restarted and picks up these SSTables it 
 could bring data back to life if tombstones have been GCed. To confirm the 
 SSTables are orphan, I created a snapshot and it didn't contain these files. 
 I can see in the logs that they have been compacted so should have been 
 deleted.
 The log entries for the stream are:
 {{INFO [StreamStage:1] 2014-02-21 19:41:48,742 StreamOut.java (line 115) 
 Beginning transfer to /10.0.0.1}}
 {{INFO [StreamStage:1] 2014-02-21 19:41:48,743 StreamOut.java (line 96) 
 Flushing memtables for [CFS(Keyspace='ks', ColumnFamily='cf1'), 
 CFS(Keyspace='ks', ColumnFamily='cf2')]...}}
 {{ERROR [GossipTasks:1] 2014-02-21 19:41:49,239 AbstractStreamSession.java 
 (line 113) Stream failed because /10.0.0.1 died or was restarted/removed 
 (streams may still be active in background, but further streams won't be 
 started)}}
 {{INFO [StreamStage:1] 2014-02-21 19:41:51,783 StreamOut.java (line 161) 
 Stream context metadata [...] 2267 sstables.}}
 {{INFO [StreamStage:1] 2014-02-21 19:41:51,789 StreamOutSession.java (line 
 182) Streaming to /10.0.0.1}}
 {{INFO [Streaming to /10.0.0.1:1] 2014-02-21 19:42:02,218 FileStreamTask.java 
 (line 99) Found no stream out session at end of file stream task - this is 
 expected if the receiver went down}}
 After digging in the code, here's what I think the issue is:
 1. StreamOutSession.transferRanges() creates a streaming session, which is 
 registered with the failure detector in AbstractStreamSession's constructor.
 2. Memtables are flushed, potentially taking a long time.
 3. The remote node fails, convict() is called and the StreamOutSession is 
 closed. However, at this time StreamOutSession.files is empty because it's 
 still waiting for the memtables to flush.
 4. Memtables finish flushing, references are obtained to SSTables to be 
 streamed and the PendingFiles are added to StreamOutSession.files.
 5. The first stream fails but the StreamOutSession isn't found so is never 
 closed and the references are never released.
 This code is more or less the same on 1.2 so I would expect it to reproduce 
 there. I looked at 2.0 and can't even see where SSTable references are 
 released when the stream fails.
 Some possible fixes for 1.1/1.2:
 1. Don't register with the failure detector until after the PendingFiles are 
 set up. I think this is the behaviour in 2.0 but I don't know if it was done 
 like this to avoid this issue.
 2. Detect the above case in (e.g.) StreamOutSession.begin() by noticing the 
 session has been closed with care to avoid double frees.
 3. Add some synchronization so closeInternal() doesn't race with setting up 
 the session.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6914) Map element is not allowed in CAS condition with DELETE/UPDATE query

2014-03-24 Thread Dmitriy Ukhlov (JIRA)
Dmitriy Ukhlov created CASSANDRA-6914:
-

 Summary: Map element is not allowed in CAS condition with 
DELETE/UPDATE query
 Key: CASSANDRA-6914
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6914
 Project: Cassandra
  Issue Type: Bug
Reporter: Dmitriy Ukhlov


CREATE TABLE test (id int, data map<text,text>, PRIMARY KEY(id));

INSERT INTO test (id, data) VALUES (1,{'a':'1'});

DELETE FROM test WHERE id=1 IF data['a']=null;
Bad Request: line 1:40 missing EOF at '='

UPDATE test SET data['b']='2' WHERE id=1 IF data['a']='1';
Bad Request: line 1:53 missing EOF at '='

These queries were successfully executed with Cassandra 2.0.5, but don't work 
in the 2.0.6 release.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[9/9] git commit: Merge branch 'cassandra-2.1' into trunk

2014-03-24 Thread yukim
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5f6e780d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5f6e780d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5f6e780d

Branch: refs/heads/trunk
Commit: 5f6e780d8bde7c88960e05e8f96192762137bc4c
Parents: 6838790 874a341
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:55:11 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:55:11 2014 -0500

--

--




[1/9] git commit: Fix SSTable not released if stream fails before it starts

2014-03-24 Thread yukim
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 b7231ff8a -> e6c8034b1
  refs/heads/cassandra-2.1 5baa72f7f -> 874a34174
  refs/heads/trunk 6838790f8 -> 5f6e780d8


Fix SSTable not released if stream fails before it starts

patch by yukim; reviewed by Richard Low for CASSANDRA-6818


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/35d4b5de
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/35d4b5de
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/35d4b5de

Branch: refs/heads/cassandra-2.0
Commit: 35d4b5de8f3ee18ec98b01f3aa0951df0e11e8d2
Parents: b7bb2fb
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:44:19 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:44:19 2014 -0500

--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/streaming/AbstractStreamSession.java   | 2 --
 src/java/org/apache/cassandra/streaming/StreamInSession.java| 5 +
 src/java/org/apache/cassandra/streaming/StreamOutSession.java   | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 960b0e9..fa46c2e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -23,6 +23,7 @@
 * Avoid NPEs when receiving table changes for an unknown keyspace (CASSANDRA-5631)
 * Fix bootstrapping when there is no schema (CASSANDRA-6685)
 * Fix truncating compression metadata (CASSANDRA-6791)
+ * Fix SSTable not released if stream session fails before starts (CASSANDRA-6818)
 
 
 1.2.15

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
index 89fbf5f..f8de827 100644
--- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
@@ -44,8 +44,6 @@ public abstract class AbstractStreamSession implements IEndpointStateChangeSubsc
 this.sessionId = sessionId;
 this.table = table;
 this.callback = callback;
-Gossiper.instance.register(this);
-FailureDetector.instance.registerFailureDetectionEventListener(this);
 }
 
 public UUID getSessionId()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamInSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamInSession.java b/src/java/org/apache/cassandra/streaming/StreamInSession.java
index e83a5b6..f9cdc31 100644
--- a/src/java/org/apache/cassandra/streaming/StreamInSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamInSession.java
@@ -24,6 +24,7 @@ import java.net.Socket;
 import java.util.*;
 import java.util.concurrent.ConcurrentMap;
 
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.io.sstable.SSTableWriter;
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
 import org.cliffc.high_scale_lib.NonBlockingHashSet;
@@ -61,6 +62,8 @@ public class StreamInSession extends AbstractStreamSession
 public static StreamInSession create(InetAddress host, IStreamCallback callback)
 {
 StreamInSession session = new StreamInSession(host, UUIDGen.getTimeUUID(), callback);
+Gossiper.instance.register(session);
+FailureDetector.instance.registerFailureDetectionEventListener(session);
 sessions.put(session.getSessionId(), session);
 return session;
 }
@@ -71,6 +74,8 @@ public class StreamInSession extends AbstractStreamSession
 if (session == null)
 {
 StreamInSession possibleNew = new StreamInSession(host, sessionId, null);
+Gossiper.instance.register(possibleNew);
+FailureDetector.instance.registerFailureDetectionEventListener(possibleNew);
 if ((session = sessions.putIfAbsent(sessionId, possibleNew)) == null)
 session = possibleNew;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamOutSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamOutSession.java b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
index edc07ca..c4d7695 100644
--- 

[5/9] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2014-03-24 Thread yukim
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6c8034b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6c8034b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6c8034b

Branch: refs/heads/cassandra-2.1
Commit: e6c8034b186e4091927b7b234dae086cd47009be
Parents: b7231ff 35d4b5d
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:54:27 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:54:27 2014 -0500

--

--




[2/9] git commit: Fix SSTable not released if stream fails before it starts

2014-03-24 Thread yukim
Fix SSTable not released if stream fails before it starts

patch by yukim; reviewed by Richard Low for CASSANDRA-6818


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/35d4b5de
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/35d4b5de
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/35d4b5de

Branch: refs/heads/cassandra-2.1
Commit: 35d4b5de8f3ee18ec98b01f3aa0951df0e11e8d2
Parents: b7bb2fb
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:44:19 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:44:19 2014 -0500

--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/streaming/AbstractStreamSession.java   | 2 --
 src/java/org/apache/cassandra/streaming/StreamInSession.java| 5 +
 src/java/org/apache/cassandra/streaming/StreamOutSession.java   | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 960b0e9..fa46c2e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -23,6 +23,7 @@
 * Avoid NPEs when receiving table changes for an unknown keyspace (CASSANDRA-5631)
 * Fix bootstrapping when there is no schema (CASSANDRA-6685)
 * Fix truncating compression metadata (CASSANDRA-6791)
+ * Fix SSTable not released if stream session fails before starts (CASSANDRA-6818)
 
 
 1.2.15

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
index 89fbf5f..f8de827 100644
--- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
@@ -44,8 +44,6 @@ public abstract class AbstractStreamSession implements IEndpointStateChangeSubsc
 this.sessionId = sessionId;
 this.table = table;
 this.callback = callback;
-Gossiper.instance.register(this);
-FailureDetector.instance.registerFailureDetectionEventListener(this);
 }
 
 public UUID getSessionId()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamInSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamInSession.java b/src/java/org/apache/cassandra/streaming/StreamInSession.java
index e83a5b6..f9cdc31 100644
--- a/src/java/org/apache/cassandra/streaming/StreamInSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamInSession.java
@@ -24,6 +24,7 @@ import java.net.Socket;
 import java.util.*;
 import java.util.concurrent.ConcurrentMap;
 
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.io.sstable.SSTableWriter;
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
 import org.cliffc.high_scale_lib.NonBlockingHashSet;
@@ -61,6 +62,8 @@ public class StreamInSession extends AbstractStreamSession
 public static StreamInSession create(InetAddress host, IStreamCallback callback)
 {
 StreamInSession session = new StreamInSession(host, UUIDGen.getTimeUUID(), callback);
+Gossiper.instance.register(session);
+FailureDetector.instance.registerFailureDetectionEventListener(session);
 sessions.put(session.getSessionId(), session);
 return session;
 }
@@ -71,6 +74,8 @@ public class StreamInSession extends AbstractStreamSession
 if (session == null)
 {
 StreamInSession possibleNew = new StreamInSession(host, sessionId, null);
+Gossiper.instance.register(possibleNew);
+FailureDetector.instance.registerFailureDetectionEventListener(possibleNew);
 if ((session = sessions.putIfAbsent(sessionId, possibleNew)) == null)
 session = possibleNew;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamOutSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamOutSession.java b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
index edc07ca..c4d7695 100644
--- a/src/java/org/apache/cassandra/streaming/StreamOutSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
@@ -25,6 +25,8 @@ import org.apache.commons.lang.StringUtils;
 import org.slf4j.Logger;
 import 

[7/9] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-03-24 Thread yukim
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/874a3417
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/874a3417
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/874a3417

Branch: refs/heads/cassandra-2.1
Commit: 874a34174a521c8b02ebe89ee91511beb994bd3b
Parents: 5baa72f e6c8034
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:54:50 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:54:50 2014 -0500

--

--




[8/9] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2014-03-24 Thread yukim
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/874a3417
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/874a3417
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/874a3417

Branch: refs/heads/trunk
Commit: 874a34174a521c8b02ebe89ee91511beb994bd3b
Parents: 5baa72f e6c8034
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:54:50 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:54:50 2014 -0500

--

--




[3/9] git commit: Fix SSTable not released if stream fails before it starts

2014-03-24 Thread yukim
Fix SSTable not released if stream fails before it starts

patch by yukim; reviewed by Richard Low for CASSANDRA-6818


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/35d4b5de
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/35d4b5de
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/35d4b5de

Branch: refs/heads/trunk
Commit: 35d4b5de8f3ee18ec98b01f3aa0951df0e11e8d2
Parents: b7bb2fb
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:44:19 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:44:19 2014 -0500

--
 CHANGES.txt | 1 +
 .../org/apache/cassandra/streaming/AbstractStreamSession.java   | 2 --
 src/java/org/apache/cassandra/streaming/StreamInSession.java| 5 +
 src/java/org/apache/cassandra/streaming/StreamOutSession.java   | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 960b0e9..fa46c2e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -23,6 +23,7 @@
  * Avoid NPEs when receiving table changes for an unknown keyspace 
(CASSANDRA-5631)
  * Fix bootstrapping when there is no schema (CASSANDRA-6685)
  * Fix truncating compression metadata (CASSANDRA-6791)
+ * Fix SSTable not released if stream session fails before starts 
(CASSANDRA-6818)
 
 
 1.2.15

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java 
b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
index 89fbf5f..f8de827 100644
--- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java
@@ -44,8 +44,6 @@ public abstract class AbstractStreamSession implements 
IEndpointStateChangeSubsc
 this.sessionId = sessionId;
 this.table = table;
 this.callback = callback;
-Gossiper.instance.register(this);
-FailureDetector.instance.registerFailureDetectionEventListener(this);
 }
 
 public UUID getSessionId()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamInSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamInSession.java 
b/src/java/org/apache/cassandra/streaming/StreamInSession.java
index e83a5b6..f9cdc31 100644
--- a/src/java/org/apache/cassandra/streaming/StreamInSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamInSession.java
@@ -24,6 +24,7 @@ import java.net.Socket;
 import java.util.*;
 import java.util.concurrent.ConcurrentMap;
 
+import org.apache.cassandra.gms.Gossiper;
 import org.apache.cassandra.io.sstable.SSTableWriter;
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
 import org.cliffc.high_scale_lib.NonBlockingHashSet;
@@ -61,6 +62,8 @@ public class StreamInSession extends AbstractStreamSession
 public static StreamInSession create(InetAddress host, IStreamCallback 
callback)
 {
 StreamInSession session = new StreamInSession(host, 
UUIDGen.getTimeUUID(), callback);
+Gossiper.instance.register(session);
+
FailureDetector.instance.registerFailureDetectionEventListener(session);
 sessions.put(session.getSessionId(), session);
 return session;
 }
@@ -71,6 +74,8 @@ public class StreamInSession extends AbstractStreamSession
 if (session == null)
 {
 StreamInSession possibleNew = new StreamInSession(host, sessionId, 
null);
+Gossiper.instance.register(possibleNew);
+
FailureDetector.instance.registerFailureDetectionEventListener(possibleNew);
 if ((session = sessions.putIfAbsent(sessionId, possibleNew)) == 
null)
 session = possibleNew;
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35d4b5de/src/java/org/apache/cassandra/streaming/StreamOutSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamOutSession.java 
b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
index edc07ca..c4d7695 100644
--- a/src/java/org/apache/cassandra/streaming/StreamOutSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamOutSession.java
@@ -25,6 +25,8 @@ import org.apache.commons.lang.StringUtils;
 import org.slf4j.Logger;
 import 

[4/9] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2014-03-24 Thread yukim
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6c8034b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6c8034b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6c8034b

Branch: refs/heads/cassandra-2.0
Commit: e6c8034b186e4091927b7b234dae086cd47009be
Parents: b7231ff 35d4b5d
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:54:27 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:54:27 2014 -0500

--

--




[6/9] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2014-03-24 Thread yukim
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6c8034b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6c8034b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6c8034b

Branch: refs/heads/trunk
Commit: e6c8034b186e4091927b7b234dae086cd47009be
Parents: b7231ff 35d4b5d
Author: Yuki Morishita yu...@apache.org
Authored: Mon Mar 24 07:54:27 2014 -0500
Committer: Yuki Morishita yu...@apache.org
Committed: Mon Mar 24 07:54:27 2014 -0500

--

--




[jira] [Created] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Robbie Strickland (JIRA)
Robbie Strickland created CASSANDRA-6915:


 Summary: Show storage rows in cqlsh
 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland


In Cassandra it's super important to understand how your CQL schema translates 
to the underlying storage rows.  Right now the only way to see this is to 
create the schema in cqlsh, write some data, then query it using the CLI.  
Obviously we don't want to be encouraging people to use the CLI when it's 
supposed to be deprecated.  So I'd like to see a function in cqlsh to do this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6506) counters++ split counter context shards into separate cells

2014-03-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944997#comment-13944997
 ] 

Aleksey Yeschenko commented on CASSANDRA-6506:
--

Yeah, those. We don't need to pass it, because we don't really want to use a 
non-HeapAllocator for merging counter contexts - these objects are extremely 
short-lived. The important bit is the localCopy() allocator, and there we do 
use the configured memtable allocator.

 counters++ split counter context shards into separate cells
 ---

 Key: CASSANDRA-6506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 2.1 beta2





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6506) counters++ split counter context shards into separate cells

2014-03-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945099#comment-13945099
 ] 

Aleksey Yeschenko commented on CASSANDRA-6506:
--

That's one of the differences between 2.0 and 2.1 - in 2.0 we localCopy() 
first, then reconcile(), and thus have to use the same allocator all the way, 
but in 2.1 we reconcile() first, and localCopy() the result.
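
For readers skimming the thread, a minimal sketch of that ordering difference 
(the types below are simplified stand-ins invented for this example, not 
Cassandra's actual Cell/allocator classes):

{code}
import java.nio.ByteBuffer;

// Simplified stand-ins for illustration only - not Cassandra's actual classes.
interface Allocator { ByteBuffer allocate(int size); }

// Plain-heap allocation: fine for extremely short-lived merge results.
final class HeapAllocator implements Allocator
{
    public ByteBuffer allocate(int size) { return ByteBuffer.allocate(size); }
}

final class Cell
{
    final ByteBuffer value;
    final long timestamp;

    Cell(ByteBuffer value, long timestamp) { this.value = value; this.timestamp = timestamp; }

    // Copy the cell's value into buffers owned by the given allocator
    // (e.g. the configured memtable allocator).
    Cell localCopy(Allocator allocator)
    {
        ByteBuffer copy = allocator.allocate(value.remaining());
        copy.put(value.duplicate());
        copy.flip();
        return new Cell(copy, timestamp);
    }

    // Toy reconcile: keep the cell with the higher timestamp, materializing
    // the winner with whatever allocator the caller hands down.
    static Cell reconcile(Cell a, Cell b, Allocator allocator)
    {
        Cell winner = a.timestamp >= b.timestamp ? a : b;
        return winner.localCopy(allocator);
    }
}

final class ReconcileFlows
{
    // 2.0-style: localCopy() first, then reconcile() - the memtable allocator
    // has to be threaded all the way down into the merge.
    static Cell writeV20(Cell incoming, Cell existing, Allocator memtable)
    {
        return Cell.reconcile(incoming.localCopy(memtable), existing, memtable);
    }

    // 2.1-style: reconcile() first on the plain heap, then localCopy() only
    // the final result with the memtable allocator.
    static Cell writeV21(Cell incoming, Cell existing, Allocator memtable)
    {
        return Cell.reconcile(incoming, existing, new HeapAllocator()).localCopy(memtable);
    }
}
{code}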

 counters++ split counter context shards into separate cells
 ---

 Key: CASSANDRA-6506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 2.1 beta2





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[1/2] git commit: Clean up CFMetaData

2014-03-24 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 874a34174 -> 8a2a0c3d4


Clean up CFMetaData

patch by Aleksey Yeschenko; reviewed by Sylvain Lebresne for
CASSANDRA-6506


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69bfca06
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69bfca06
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69bfca06

Branch: refs/heads/cassandra-2.1
Commit: 69bfca06f2b048c43b0dc4c3423227946b7f6523
Parents: 874a341
Author: Aleksey Yeschenko alek...@apache.org
Authored: Mon Mar 24 16:52:38 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Mon Mar 24 16:52:38 2014 +0300

--
 src/java/org/apache/cassandra/auth/Auth.java|  2 +-
 .../org/apache/cassandra/config/CFMetaData.java | 94 +++-
 .../cassandra/cql/AlterTableStatement.java  |  2 +-
 .../cassandra/cql/DropIndexStatement.java   |  2 +-
 .../apache/cassandra/cql/QueryProcessor.java|  2 +-
 .../cql3/statements/AlterTableStatement.java|  2 +-
 .../cql3/statements/AlterTypeStatement.java |  2 +-
 .../cql3/statements/CreateIndexStatement.java   |  2 +-
 .../cql3/statements/CreateTriggerStatement.java |  2 +-
 .../cql3/statements/DropIndexStatement.java |  2 +-
 .../cql3/statements/DropTriggerStatement.java   |  2 +-
 .../apache/cassandra/config/CFMetaDataTest.java |  2 +-
 .../org/apache/cassandra/config/DefsTest.java   |  8 +-
 .../cassandra/thrift/ThriftValidationTest.java  |  2 +-
 .../cassandra/triggers/TriggersSchemaTest.java  |  6 +-
 15 files changed, 49 insertions(+), 83 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/69bfca06/src/java/org/apache/cassandra/auth/Auth.java
--
diff --git a/src/java/org/apache/cassandra/auth/Auth.java 
b/src/java/org/apache/cassandra/auth/Auth.java
index 90b1215..237fc99 100644
--- a/src/java/org/apache/cassandra/auth/Auth.java
+++ b/src/java/org/apache/cassandra/auth/Auth.java
@@ -205,7 +205,7 @@ public class Auth
 CFStatement parsed = 
(CFStatement)QueryProcessor.parseStatement(cql);
 parsed.prepareKeyspace(AUTH_KS);
 CreateTableStatement statement = (CreateTableStatement) 
parsed.prepare().statement;
-CFMetaData cfm = 
statement.getCFMetaData().clone(CFMetaData.generateLegacyCfId(AUTH_KS, name));
+CFMetaData cfm = 
statement.getCFMetaData().copy(CFMetaData.generateLegacyCfId(AUTH_KS, name));
 assert cfm.cfName.equals(name);
 MigrationManager.announceNewColumnFamily(cfm);
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69bfca06/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java 
b/src/java/org/apache/cassandra/config/CFMetaData.java
index f38dd5e..9c8ceaf 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -20,26 +20,23 @@ package org.apache.cassandra.config;
 import java.io.DataInput;
 import java.lang.reflect.Constructor;
 import java.lang.reflect.InvocationTargetException;
-import java.lang.reflect.Method;
 import java.nio.ByteBuffer;
 import java.util.*;
 
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Objects;
+import com.google.common.base.Strings;
 import com.google.common.collect.AbstractIterator;
 import com.google.common.collect.Iterables;
 import com.google.common.collect.MapDifference;
 import com.google.common.collect.Maps;
-
-import org.apache.cassandra.cache.CachingOptions;
-import org.apache.cassandra.db.composites.*;
-
 import org.apache.commons.lang3.ArrayUtils;
 import org.apache.commons.lang3.builder.HashCodeBuilder;
 import org.apache.commons.lang3.builder.ToStringBuilder;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import org.apache.cassandra.cache.CachingOptions;
 import org.apache.cassandra.cql3.*;
 import org.apache.cassandra.cql3.statements.CFStatement;
 import org.apache.cassandra.cql3.statements.CreateTableStatement;
@@ -47,6 +44,7 @@ import org.apache.cassandra.db.*;
 import org.apache.cassandra.db.compaction.AbstractCompactionStrategy;
 import org.apache.cassandra.db.compaction.LeveledCompactionStrategy;
 import org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy;
+import org.apache.cassandra.db.composites.*;
 import org.apache.cassandra.db.index.SecondaryIndex;
 import org.apache.cassandra.db.marshal.*;
 import org.apache.cassandra.exceptions.ConfigurationException;
@@ -66,14 +64,11 @@ import org.apache.cassandra.utils.UUIDGen;
 
 import static 

[1/3] git commit: Clean up CFMetaData

2014-03-24 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 5f6e780d8 -> e5314641a


Clean up CFMetaData

patch by Aleksey Yeschenko; reviewed by Sylvain Lebresne for
CASSANDRA-6506


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69bfca06
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69bfca06
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69bfca06

Branch: refs/heads/trunk
Commit: 69bfca06f2b048c43b0dc4c3423227946b7f6523
Parents: 874a341
Author: Aleksey Yeschenko alek...@apache.org
Authored: Mon Mar 24 16:52:38 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Mon Mar 24 16:52:38 2014 +0300

--
 src/java/org/apache/cassandra/auth/Auth.java|  2 +-
 .../org/apache/cassandra/config/CFMetaData.java | 94 +++-
 .../cassandra/cql/AlterTableStatement.java  |  2 +-
 .../cassandra/cql/DropIndexStatement.java   |  2 +-
 .../apache/cassandra/cql/QueryProcessor.java|  2 +-
 .../cql3/statements/AlterTableStatement.java|  2 +-
 .../cql3/statements/AlterTypeStatement.java |  2 +-
 .../cql3/statements/CreateIndexStatement.java   |  2 +-
 .../cql3/statements/CreateTriggerStatement.java |  2 +-
 .../cql3/statements/DropIndexStatement.java |  2 +-
 .../cql3/statements/DropTriggerStatement.java   |  2 +-
 .../apache/cassandra/config/CFMetaDataTest.java |  2 +-
 .../org/apache/cassandra/config/DefsTest.java   |  8 +-
 .../cassandra/thrift/ThriftValidationTest.java  |  2 +-
 .../cassandra/triggers/TriggersSchemaTest.java  |  6 +-
 15 files changed, 49 insertions(+), 83 deletions(-)
--


[3/3] git commit: Merge branch 'cassandra-2.1' into trunk

2014-03-24 Thread aleksey
Merge branch 'cassandra-2.1' into trunk

Conflicts:
src/java/org/apache/cassandra/cql/AlterTableStatement.java
src/java/org/apache/cassandra/cql/DropIndexStatement.java
src/java/org/apache/cassandra/cql/QueryProcessor.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e5314641
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e5314641
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e5314641

Branch: refs/heads/trunk
Commit: e5314641a41b9464d04ed156e46465b3c213abe8
Parents: 5f6e780 8a2a0c3
Author: Aleksey Yeschenko alek...@apache.org
Authored: Mon Mar 24 17:00:00 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Mon Mar 24 17:00:00 2014 +0300

--
 src/java/org/apache/cassandra/auth/Auth.java|   2 +-
 .../org/apache/cassandra/config/CFMetaData.java |  94 -
 .../cql3/statements/AlterTableStatement.java|   2 +-
 .../cql3/statements/AlterTypeStatement.java |   2 +-
 .../cql3/statements/CreateIndexStatement.java   |   2 +-
 .../cql3/statements/CreateTriggerStatement.java |   2 +-
 .../cql3/statements/DropIndexStatement.java |   2 +-
 .../cql3/statements/DropTriggerStatement.java   |   2 +-
 .../apache/cassandra/db/AtomicBTreeColumns.java |  14 +-
 src/java/org/apache/cassandra/db/Cell.java  |  23 +---
 .../org/apache/cassandra/db/ColumnFamily.java   |   6 +-
 .../org/apache/cassandra/db/CounterCell.java|   9 +-
 .../apache/cassandra/db/CounterMutation.java|   4 +-
 .../apache/cassandra/db/CounterUpdateCell.java  |   4 +-
 .../org/apache/cassandra/db/DeletedCell.java|   8 +-
 .../org/apache/cassandra/db/DeletionTime.java   |   2 +-
 .../org/apache/cassandra/db/ExpiringCell.java   |   2 +-
 .../cassandra/db/HintedHandOffManager.java  |   8 +-
 src/java/org/apache/cassandra/db/Memtable.java  |   4 +-
 .../org/apache/cassandra/db/OnDiskAtom.java |   3 +-
 .../org/apache/cassandra/db/RangeTombstone.java |   9 +-
 .../db/compaction/LazilyCompactedRow.java   |   6 +-
 .../cassandra/db/context/CounterContext.java|  22 ++-
 .../apache/cassandra/db/filter/QueryFilter.java |   3 +-
 .../io/sstable/AbstractSSTableSimpleWriter.java |   3 +-
 .../cassandra/io/sstable/SSTableWriter.java |   4 +-
 .../utils/memory/ContextAllocator.java  |  11 +-
 .../cassandra/utils/memory/PoolAllocator.java   |  11 +-
 .../apache/cassandra/config/CFMetaDataTest.java |   2 +-
 .../org/apache/cassandra/config/DefsTest.java   |   8 +-
 .../apache/cassandra/db/CounterCellTest.java|  39 +++---
 .../db/context/CounterContextTest.java  | 138 +++
 .../streaming/StreamingTransferTest.java|   3 +-
 .../cassandra/thrift/ThriftValidationTest.java  |   2 +-
 .../cassandra/triggers/TriggersSchemaTest.java  |   6 +-
 35 files changed, 170 insertions(+), 292 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e5314641/src/java/org/apache/cassandra/config/CFMetaData.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e5314641/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e5314641/test/unit/org/apache/cassandra/config/DefsTest.java
--
diff --cc test/unit/org/apache/cassandra/config/DefsTest.java
index 6c06648,2e1876f..fd24822
--- a/test/unit/org/apache/cassandra/config/DefsTest.java
+++ b/test/unit/org/apache/cassandra/config/DefsTest.java
@@@ -69,9 -68,9 +69,9 @@@ public class DefsTest extends SchemaLoa
 .maxCompactionThreshold(500);
  
  // we'll be adding this one later. make sure it's not already there.
 -assert cfm.getColumnDefinition(ByteBuffer.wrap(new byte[] { 5 })) == 
null;
 +Assert.assertNull(cfm.getColumnDefinition(ByteBuffer.wrap(new byte[] 
{ 5 })));
  
- CFMetaData cfNew = cfm.clone();
+ CFMetaData cfNew = cfm.copy();
  
  // add one.
  ColumnDefinition addIndexDef = ColumnDefinition.regularDef(cfm, 
ByteBuffer.wrap(new byte[] { 5 }), BytesType.instance, null)
@@@ -407,12 -406,12 +407,12 @@@
  KSMetaData ksm = KSMetaData.testMetadata(cf.ksName, 
SimpleStrategy.class, KSMetaData.optsWithRF(1), cf);
  MigrationManager.announceNewKeyspace(ksm);
  
 -assert Schema.instance.getKSMetaData(cf.ksName) != null;
 -assert Schema.instance.getKSMetaData(cf.ksName).equals(ksm);
 -assert Schema.instance.getCFMetaData(cf.ksName, cf.cfName) != null;
 +Assert.assertNotNull(Schema.instance.getKSMetaData(cf.ksName));
 +

[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945116#comment-13945116
 ] 

Ryan McGuire commented on CASSANDRA-6911:
-

It won't start up with the old jar:

{code}
ERROR [main] 2014-03-24 09:58:21,156 CassandraDaemon.java:471 - Exception 
encountered during startup
java.lang.NoClassDefFoundError: 
io/netty/util/internal/logging/InternalLoggerFactory
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:374) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:454) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:543) 
[main/:na]
Caused by: java.lang.ClassNotFoundException: 
io.netty.util.internal.logging.InternalLoggerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[na:1.7.0_51]
at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[na:1.7.0_51]
at java.security.AccessController.doPrivileged(Native Method) 
~[na:1.7.0_51]
at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
~[na:1.7.0_51]
at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[na:1.7.0_51]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) 
~[na:1.7.0_51]
at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[na:1.7.0_51]
... 3 common frames omitted
INFO  [StorageServiceShutdownHook] 2014-03-24 09:59:27,298 Gossiper.java:1269 - 
Announcing shutdown
{code}


 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict

 I compiled stress fresh from cassandra-2.1 and running this command:
 {code}
 cassandra-stress write n=1900 -rate threads=50 -node bdplab
 {code}
 I get the following traceback:
 {code}
 Exception in thread "Thread-49" java.lang.NoClassDefFoundError: 
 org/jboss/netty/channel/ChannelFactory
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:941)
 at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:889)
 at com.datastax.driver.core.Cluster.init(Cluster.java:88)
 at com.datastax.driver.core.Cluster.buildFrom(Cluster.java:144)
 at com.datastax.driver.core.Cluster$Builder.build(Cluster.java:854)
 at 
 org.apache.cassandra.stress.util.JavaDriverClient.connect(JavaDriverClient.java:74)
 at 
 org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:155)
 at 
 org.apache.cassandra.stress.settings.StressSettings.getSmartThriftClient(StressSettings.java:70)
 at 
 org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:275)
 Caused by: java.lang.ClassNotFoundException: 
 org.jboss.netty.channel.ChannelFactory
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 ... 9 more
 {code}
 It seems this was introduced with an updated netty jar in 
 cbf304ebd0436a321753e81231545b705aa8dd23



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6506) counters++ split counter context shards into separate cells

2014-03-24 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-6506:
-

Fix Version/s: (was: 2.1 beta2)
   3.0

Anyway, committed the first two, changed fixver to 3.0.

 counters++ split counter context shards into separate cells
 ---

 Key: CASSANDRA-6506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 3.0





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945124#comment-13945124
 ] 

Jonathan Ellis commented on CASSANDRA-6915:
---

Maybe reasonable: group partitions with whitespace in between, for compound 
primary keys.  OTOH I don't see that this adds a whole ton of value, 
particularly since most queries are single-partition.

Not reasonable: a function in cqlsh to emulate the cli

In the end I think it's a pipe dream to save people from reading the docs.  I 
also reject the contention that it's super important for most users to 
understand all the storage-level details of (for instance) WITH COMPACT STORAGE.

So unless you have a better idea for information to show in cqlsh that is both 
relevant and unintrusive I think we should Notaproblem this.

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6914) Map element is not allowed in CAS condition with DELETE/UPDATE query

2014-03-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6914:
--

Fix Version/s: 2.0.7
 Assignee: Sylvain Lebresne

 Map element is not allowed in CAS condition with DELETE/UPDATE query
 

 Key: CASSANDRA-6914
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6914
 Project: Cassandra
  Issue Type: Bug
Reporter: Dmitriy Ukhlov
Assignee: Sylvain Lebresne
 Fix For: 2.0.7


 CREATE TABLE test (id int, data map<text,text>, PRIMARY KEY(id));
 INSERT INTO test (id, data) VALUES (1,{'a':'1'});
 DELETE FROM test WHERE id=1 IF data['a']=null;
 Bad Request: line 1:40 missing EOF at '='
 UPDATE test SET data['b']='2' WHERE id=1 IF data['a']='1';
 Bad Request: line 1:53 missing EOF at '='
 These queries were successfully executed with cassandra 2.0.5, but don't work 
 in the 2.0.6 release



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945128#comment-13945128
 ] 

Benedict commented on CASSANDRA-6911:
-

Did you remove the new jar from the regular lib dir? That looks like C* is not 
starting because netty-4 is missing, which has nothing to do with the stress lib 
dir.

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945130#comment-13945130
 ] 

Ryan McGuire commented on CASSANDRA-6911:
-

Never mind, I missed what you meant, Sylvain; it does work if I put the old 
netty jar in the *stress* lib dir.

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945132#comment-13945132
 ] 

Ryan McGuire commented on CASSANDRA-6911:
-

No, I don't have to remove the new one from the main lib dir.

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945133#comment-13945133
 ] 

Benedict commented on CASSANDRA-6911:
-

bq. No, I don't have to remove the new one from the main lib dir.

I know; I meant maybe you had, because modifying the stress lib dir wouldn't 
cause this. But never mind, it looks like it's fixed now :-)

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6911) Netty dependency update broke stress

2014-03-24 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945136#comment-13945136
 ] 

Ryan McGuire commented on CASSANDRA-6911:
-

yep, fixed (on my machine, anyway) :)

 Netty dependency update broke stress
 

 Key: CASSANDRA-6911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6911
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ryan McGuire
Assignee: Benedict




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6908) Dynamic endpoint snitch destabilizes cluster under heavy load

2014-03-24 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945156#comment-13945156
 ] 

Brandon Williams commented on CASSANDRA-6908:
-

CASSANDRA-6465 is what I was thinking of.

 Dynamic endpoint snitch destabilizes cluster under heavy load
 -

 Key: CASSANDRA-6908
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6908
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Bartłomiej Romański

 We observe that with dynamic snitch disabled our cluster is much more stable 
 than with dynamic snitch enabled.
 We've got a 15 nodes cluster with pretty strong machines (2xE5-2620, 64 GB 
 RAM, 2x480 GB SSD). We mostly do reads (about 300k/s).
 We use Astyanax on the client side with the TOKEN_AWARE option enabled. It 
 automatically directs read queries to one of the nodes responsible for the 
 given token.
 In that case, with dynamic snitch disabled, Cassandra always handles reads 
 locally. With dynamic snitch enabled, Cassandra very often decides to proxy 
 the read to some other node. This causes much higher CPU usage and produces 
 much more garbage, which results in more frequent GC pauses (the young 
 generation fills up quicker). By "much higher" and "much more" I mean 1.5-2x.
 I'm aware that higher dynamic_snitch_badness_threshold value should solve 
 that issue. The default value is 0.1. I've looked at scores exposed in JMX 
 and the problem is that our values seemed to be completely random. They are 
 between usually 0.5 and 2.0, but changes randomly every time I hit refresh.
 Of course, I can set dynamic_snitch_badness_threshold to 5.0 or something 
 like that, but the result will be similar to simply disabling the dynamic 
 snitch altogether (that's what we've done).
 I've tried to understand the logic behind these scores and I'm not sure I 
 get the idea...
 It's a sum (without any multipliers) of two components:
 - the ratio of the given node's recent latency to the recent average node 
 latency
 - something called 'severity', which, if I analyzed the code correctly, is 
 the result of BackgroundActivityMonitor.getIOWait() - the ratio of iowait 
 CPU time to total CPU time as reported in /proc/stat (the ratio is 
 multiplied by 100)
 In our case the second value is something around 0-2% but varies quite 
 heavily every second.
 What's the idea behind simply adding these two values without any multipliers 
 (e.g. the second one is a percentage while the first one is not)? Are we sure 
 this is the best possible way of calculating the final score?
 Is there a way to force Cassandra to use (much) longer samples? In our case 
 we probably need that to get stable values. The 'severity' is calculated for 
 each second. The mean latency is calculated based on some magic, hardcoded 
 values (ALPHA = 0.75, WINDOW_SIZE = 100). 
 Am I right that there's no way to tune that without hacking the code?
 I'm aware that there's dynamic_snitch_update_interval_in_ms property in the 
 config file, but that only determines how often the scores are recalculated 
 not how long samples are taken. Is that correct?
 To sum up, it would be really nice to have more control over dynamic snitch 
 behavior or at least have the official option to disable it described in the 
 default config file (it took me some time to discover that we can just 
 disable it instead of hacking with dynamic_snitch_badness_threshold=1000).
 Currently for some scenarios (like ours - optimized cluster, token aware 
 client, heavy load) it causes more harm than good.
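
As an aside for readers, a minimal sketch of the scoring rule as the reporter 
describes it (illustrative names only, not Cassandra's actual 
DynamicEndpointSnitch; a plain EWMA stands in for the real sampling):

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative only - not Cassandra's actual DynamicEndpointSnitch.
final class SnitchScoreSketch
{
    private static final double ALPHA = 0.75; // the hardcoded smoothing factor

    private final Map<String, Double> latency = new HashMap<>();

    // Exponentially weighted latency per endpoint; recent samples dominate.
    void addSample(String endpoint, double latencyMillis)
    {
        latency.merge(endpoint, latencyMillis,
                      (old, fresh) -> ALPHA * fresh + (1 - ALPHA) * old);
    }

    // score = latency ratio (unitless, usually near 1.0) + severity (iowait
    // as a percentage). The two addends live on different scales, which is
    // exactly the reporter's complaint: a 0-2 severity swamps a ~1.0 ratio.
    double score(String endpoint, double severityPercent)
    {
        double mean = latency.values().stream()
                             .mapToDouble(Double::doubleValue)
                             .average().orElse(1.0);
        double ratio = latency.getOrDefault(endpoint, mean) / mean;
        return ratio + severityPercent;
    }
}
{code}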



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945169#comment-13945169
 ] 

Sylvain Lebresne commented on CASSANDRA-6915:
-

bq. group partitions with whitespace in between, for compound primary keys

I think that giving more clues about partitioning and clustering is not a bad 
idea in itself, but that's kind of covered by CASSANDRA-6910 imo (I like the 
idea there of using some color code in the header a bit better than adding 
empty lines between partition, though really one doesn't exclude the other).

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945172#comment-13945172
 ] 

Robbie Strickland commented on CASSANDRA-6915:
--

It's the compound key (whether composite partition key or composite column) 
case that makes this useful--and I would still argue really important.  Yes you 
can read the documentation to understand the mapping, but I think this remains 
one of the most misunderstood concepts in CQL.  I would argue that it's 
important to understand the storage layer difference between PRIMARY KEY ((id, 
timestamp), event) and PRIMARY KEY (id, timestamp, event), and that the best 
way to see the difference is to visualize it.  People still don't seem to get 
the difference between partition keys and composite column names, and this 
obviously has huge implications for what sorts of queries you can run and how 
wide your rows will get.  

Perhaps something along the lines of:

CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY (id, timestamp, event)
);

EXPLAIN MyTable;

Partition Key: id (uuid)
Columns: 
timestamp:event:details (int:string:string)
timestamp:event:userId (int:string:string)

CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY ((id, timestamp), event)
);

EXPLAIN MyTable;

Partition Key: id:timestamp (uuid:int)
Columns: 
event:details (string)
event:userId (string)

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945172#comment-13945172
 ] 

Robbie Strickland edited comment on CASSANDRA-6915 at 3/24/14 2:42 PM:
---

It's the compound key (whether composite partition key or composite column) 
case that makes this useful--and I would still argue really important.  Yes you 
can read the documentation to understand the mapping, but I think this remains 
one of the most misunderstood concepts in CQL.  I would argue that it's 
important to understand the storage layer difference between PRIMARY KEY ((id, 
timestamp), event) and PRIMARY KEY (id, timestamp, event), and that the best 
way to see the difference is to visualize it.  People still don't seem to get 
the difference between partition keys and composite column names, and this 
obviously has huge implications for what sorts of queries you can run and how 
wide your rows will get.  

Perhaps something along the lines of:

{code}
CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY (id, timestamp, event)
);

EXPLAIN MyTable;

Partition Key: id (uuid)
Columns: 
timestamp:event:details (int:string:string)
timestamp:event:userId (int:string:string)

CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY ((id, timestamp), event)
);

EXPLAIN MyTable;

Partition Key: id:timestamp (uuid:int)
Columns: 
event:details (string)
event:userId (string)
{code}



 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-03-24 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945173#comment-13945173
 ] 

Marcus Eriksson commented on CASSANDRA-6696:


Been poking this, wip-patch pushed here: 
https://github.com/krummas/cassandra/commits/marcuse/6696

it does the following:
* Extract an interface out of SSTableWriter (imaginatively called 
SSTableWriterInterface), start using this interface everywhere
* Create DiskAwareSSTableWriter, which knows about the disk layout, and start 
using it instead of the standard SSTW
* Ranges of tokens are assigned to the disks; this way we only need to check: 
is the key we are appending larger than the boundary token for the current 
disk? If so, create a new SSTableWriter for that disk (see the sketch after 
this list)
* Breaks unit tests

todo:
* fix unit tests, general cleanups
* I kind of want to name the interface SSTableWriter and call the old SSTW 
class something else, but I guess SSTW is the class that most external people 
depend on, so maybe not
* Take disk size into consideration when splitting the ranges over disks, this 
needs to be deterministic though, so we have to use total disk size instead of 
free disk space.
* Make other partitioners than M3P work
* Fix keycache

Rebalancing of data is simply running upgradesstables or scrub; if we lose a 
disk, we will take writes on the other disks

Comments on this approach?
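
A minimal sketch of the boundary-token idea from the list above, under stated 
assumptions (Writer, Disk and plain long tokens are hypothetical stand-ins for 
SSTableWriter, the data directories and real tokens, not the wip patch's 
actual classes):

{code}
import java.util.List;

// Illustrative only - not the wip patch's actual DiskAwareSSTableWriter.
final class DiskAwareWriterSketch
{
    interface Writer { void append(long token, byte[] row); void close(); }
    interface Disk { Writer newWriter(); }

    private final List<Disk> disks;
    private final long[] upperBoundary; // upperBoundary[i] = last token owned by disks[i]
    private int current = 0;
    private Writer writer;

    DiskAwareWriterSketch(List<Disk> disks, long[] upperBoundary)
    {
        this.disks = disks;
        this.upperBoundary = upperBoundary;
        this.writer = disks.get(0).newWriter();
    }

    // Keys arrive in token order, so a single forward scan over the boundaries
    // suffices: once a key passes the current disk's upper boundary, close the
    // current writer and open one on the next disk.
    void append(long token, byte[] row)
    {
        while (current < disks.size() - 1 && token > upperBoundary[current])
        {
            writer.close();
            current++;
            writer = disks.get(current).newWriter();
        }
        writer.append(token, row);
    }

    void close() { writer.close(); }
}
{code}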

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. Also this is true for 
 corrupt sstables, in which case we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
 row=sankalp col=sankalp is written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is more than gc grace, it got compacted in Nodes A and B 
 since it got compacted with the actual data. So there is no trace of this row 
 column in node A and B.
 Now in node C, say the original data is in drive1 and tombstone is in drive2. 
 Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and was replaced with new empty drive. 
 Due to the replacement, the tombstone in now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945172#comment-13945172
 ] 

Robbie Strickland edited comment on CASSANDRA-6915 at 3/24/14 2:44 PM:
---

It's the compound key (whether composite partition key or composite column) 
case that makes this useful--and I would still argue really important.  Yes you 
can read the documentation to understand the mapping, but I think this remains 
one of the most misunderstood concepts in CQL.  I would argue that it's 
important to understand the storage layer difference between PRIMARY KEY ((id, 
timestamp), event) and PRIMARY KEY (id, timestamp, event), and that the best 
way to see the difference is to visualize it.  People still don't seem to get 
the difference between partition keys and composite column names, and this 
obviously has huge implications for what sorts of queries you can run and how 
wide your rows will get.  

Perhaps something along the lines of:

{code}
CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY (id, timestamp, event)
);

EXPLAIN MyTable;

Partition Key: id (uuid)
Columns: 
timestamp:event:details (int:string:string)
timestamp:event:userId (int:string:string)

CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY ((id, timestamp), event)
);

EXPLAIN MyTable;

Partition Key: id:timestamp (uuid:int)
Columns: 
event:details (string:string)
event:userId (string:string)
{code}


was (Author: rstrickland):
It's the compound key (whether composite partition key or composite column) 
case that makes this useful--and I would still argue really important.  Yes you 
can read the documentation to understand the mapping, but I think this remains 
one of the most misunderstood concepts in CQL.  I would argue that it's 
important to understand the storage layer difference between PRIMARY KEY ((id, 
timestamp), event) and PRIMARY KEY (id, timestamp, event), and that the best 
way to see the difference is to visualize it.  People still don't seem to get 
the difference between partition keys and composite column names, and this 
obviously has huge implications for what sorts of queries you can run and how 
wide your rows will get.  

Perhaps something along the lines of:

{code}
CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY (id, timestamp, event)
);

EXPLAIN MyTable;

Partition Key: id (uuid)
Columns: 
timestamp:event:details (int:string:string)
timestamp:event:userId (int:string:string)

CREATE TABLE MyTable (
id uuid,
timestamp int,
event string,
details string,
userId string,
PRIMARY KEY ((id, timestamp), event)
);

EXPLAIN MyTable;

Partition Key: id:timestamp (uuid:int)
Columns: 
event:details (string)
event:userId (string)
{code}

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh

 In Cassandra it's super important to understand how your CQL schema 
 translates to the underlying storage rows.  Right now the only way to see 
 this is to create the schema in cqlsh, write some data, then query it using 
 the CLI.  Obviously we don't want to be encouraging people to use the CLI 
 when it's supposed to be deprecated.  So I'd like to see a function in cqlsh 
 to do this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945179#comment-13945179
 ] 

Sylvain Lebresne commented on CASSANDRA-6915:
-

bq.  I would argue that it's important to understand the storage layer 
difference between PRIMARY KEY ((id, timestamp), event) and PRIMARY KEY (id, 
timestamp, event)

It's important to understand what your partition key is, and that the partition 
key decides how your table rows will be distributed on the cluster. But you 
don't need to delve into storage layer representation to explain that. Again, 
I think the approach of CASSANDRA-6910 of using separate colors in the 
resultset header (colors that we could reuse in DESC) is simple and efficient. 

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh

 In Cassandra it's super important to understand how your CQL schema 
 translates to the underlying storage rows.  Right now the only way to see 
 this is to create the schema in cqlsh, write some data, then query it using 
 the CLI.  Obviously we don't want to be encouraging people to use the CLI 
 when it's supposed to be deprecated.  So I'd like to see a function in cqlsh 
 to do this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6915) Show storage rows in cqlsh

2014-03-24 Thread Robbie Strickland (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945182#comment-13945182
 ] 

Robbie Strickland commented on CASSANDRA-6915:
--

I agree that CASSANDRA-6910 is a good step, but I'd still like to see something 
along the lines of the EXPLAIN I demonstrate above.  Maybe I'm just hanging 
onto the past, but I think a lot of people would appreciate the overt 
explanation.

 Show storage rows in cqlsh
 --

 Key: CASSANDRA-6915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6915
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robbie Strickland
  Labels: cqlsh

 In Cassandra it's super important to understand how your CQL schema 
 translates to the underlying storage rows.  Right now the only way to see 
 this is to create the schema in cqlsh, write some data, then query it using 
 the CLI.  Obviously we don't want to be encouraging people to use the CLI 
 when it's supposed to be deprecated.  So I'd like to see a function in cqlsh 
 to do this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945194#comment-13945194
 ] 

Jonathan Ellis commented on CASSANDRA-6696:
---

Can we drop BOP/OPP in 3.0?

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A, B and C, with RF=3 and GC grace=10 days. 
 row=sankalp col=sankalp was written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than GC grace, it got compacted away in nodes A 
 and B together with the actual data. So there is no trace of this row column 
 in nodes A and B.
 Now in node C, say the original data is in drive1 and the tombstone is in 
 drive2. Compaction has not yet reclaimed the data and tombstone. 
 Drive2 becomes corrupt and is replaced with a new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6357) Flush memtables to separate directory

2014-03-24 Thread dan jatnieks (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dan jatnieks updated CASSANDRA-6357:


Attachment: c6357-2.1-stress-write-adj-ops-sec.png
c6357-2.1-stress-write-latency-median.png
c6357-2.1-stress-write-latency-99th.png

 Flush memtables to separate directory
 -

 Key: CASSANDRA-6357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6357
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Patrick McFadin
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta1

 Attachments: 6357-v2.txt, 6357.txt, 
 c6357-2.1-stress-write-adj-ops-sec.png, 
 c6357-2.1-stress-write-latency-99th.png, 
 c6357-2.1-stress-write-latency-median.png, 
 c6357-stress-write-latency-99th-1.png


 Flush writers are a critical element for keeping a node healthy. When several 
 compactions run on systems with low performing data directories, IO becomes a 
 premium. Once the disk subsystem is saturated, write IO is blocked which will 
 cause flush writer threads to backup. Since memtables are large blocks of 
 memory in the JVM, too much blocking can cause excessive GC over time 
 degrading performance. In the worst case causing an OOM.
 Since compaction is running on the data directories. My proposal is to create 
 a separate directory for flushing memtables. Potentially we can use the same 
 methodology of keeping the commit log separate and minimize disk contention 
 against the critical function of the flushwriter. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6357) Flush memtables to separate directory

2014-03-24 Thread dan jatnieks (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945219#comment-13945219
 ] 

dan jatnieks commented on CASSANDRA-6357:
-


Retested this using trunk (as of Mar 13). I switched to different hardware with 
7200rpm disks because the slow 5400rpm disks on the other system just couldn't 
keep up without throttling stress op/s.

The machine this time was a Dell with two quad-core hyper-threaded Intel Xeon 
E5620 CPUs, 32GB memory, and 8 disks (500GB, 7200rpm).

The two scenarios were the same as last time and used the same stress 
parameters.

The results were much less dramatic than the 2.0 test.

The base test results (data and flush on the same device) :
{noformat}
real op rate  : 9433
adjusted op rate  : 9435
adjusted op rate stderr   : 0
key rate  : 9433
latency mean  : 5.3
latency median: 1.4
latency 95th percentile   : 5.4
latency 99th percentile   : 12.9
latency 99.9th percentile : 259.4
latency max   : 20873.9
Total operation time  : 00:17:40
{noformat}

The flush test results (data and flush on separate devices):
{noformat}
real op rate  : 10391
adjusted op rate  : 10391
adjusted op rate stderr   : 0
key rate  : 10391
latency mean  : 4.8
latency median: 1.4
latency 95th percentile   : 5.4
latency 99th percentile   : 14.2
latency 99.9th percentile : 245.0
latency max   : 17035.2
Total operation time  : 00:16:02
{noformat}


See attached graphs: 
[2.1 Stress Write Latency 99.9th 
Percentile|^c6357-2.1-stress-write-latency-99th.png]
[2.1 Stress Write Median Latency|^c6357-2.1-stress-write-latency-median.png]
[2.1 Stress Write Adjusted Ops/sec|^c6357-2.1-stress-write-adj-ops-sec.png]

 Flush memtables to separate directory
 -

 Key: CASSANDRA-6357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6357
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Patrick McFadin
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta1

 Attachments: 6357-v2.txt, 6357.txt, 
 c6357-2.1-stress-write-adj-ops-sec.png, 
 c6357-2.1-stress-write-latency-99th.png, 
 c6357-2.1-stress-write-latency-median.png, 
 c6357-stress-write-latency-99th-1.png


 Flush writers are a critical element for keeping a node healthy. When several 
 compactions run on systems with low performing data directories, IO becomes a 
 premium. Once the disk subsystem is saturated, write IO is blocked which will 
 cause flush writer threads to backup. Since memtables are large blocks of 
 memory in the JVM, too much blocking can cause excessive GC over time 
 degrading performance. In the worst case causing an OOM.
 Since compaction is running on the data directories. My proposal is to create 
 a separate directory for flushing memtables. Potentially we can use the same 
 methodology of keeping the commit log separate and minimize disk contention 
 against the critical function of the flushwriter. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6357) Flush memtables to separate directory

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945234#comment-13945234
 ] 

Jonathan Ellis commented on CASSANDRA-6357:
---

Hmm.  If the answer is "as long as you're not on 5400rpm disks, this doesn't do 
anything for you," then I'd be inclined to back it out.  Do we need to test a 
different scenario [~pmcfadin]?

 Flush memtables to separate directory
 -

 Key: CASSANDRA-6357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6357
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Patrick McFadin
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta1

 Attachments: 6357-v2.txt, 6357.txt, 
 c6357-2.1-stress-write-adj-ops-sec.png, 
 c6357-2.1-stress-write-latency-99th.png, 
 c6357-2.1-stress-write-latency-median.png, 
 c6357-stress-write-latency-99th-1.png


 Flush writers are a critical element for keeping a node healthy. When several 
 compactions run on systems with low performing data directories, IO becomes a 
 premium. Once the disk subsystem is saturated, write IO is blocked which will 
 cause flush writer threads to backup. Since memtables are large blocks of 
 memory in the JVM, too much blocking can cause excessive GC over time 
 degrading performance. In the worst case causing an OOM.
 Since compaction is running on the data directories. My proposal is to create 
 a separate directory for flushing memtables. Potentially we can use the same 
 methodology of keeping the commit log separate and minimize disk contention 
 against the critical function of the flushwriter. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6916) Preemptive re-open of compaction result

2014-03-24 Thread Benedict (JIRA)
Benedict created CASSANDRA-6916:
---

 Summary: Preemptive re-open of compaction result
 Key: CASSANDRA-6916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1


Related to CASSANDRA-6812, but a little simpler: when compacting, we mess quite 
badly with the page cache. One thing we can do to mitigate this problem is to 
use the sstable we're writing before we've finished writing it, and to drop the 
regions from the old sstables from the page cache as soon as the new sstables 
have them (even if they're only written to the page cache). This should 
minimise any page cache churn, as the old sstables must be larger than the new 
sstable, and since both will be in memory, dropping the old sstables is at 
least as good as dropping the new.

The approach is quite straightforward (a rough sketch follows below). Every X MB written:
# grab flushed length of index file;
# grab second to last index summary record, after excluding those that point to 
positions after the flushed length;
# open index file, and check that our last record doesn't occur outside of the 
flushed length of the data file (pretty unlikely)
# Open the sstable with the calculated upper bound

Some complications:
# must keep a running copy of the compression metadata to reopen with
# we need to be able to replace an sstable with itself but a different lower 
bound
# we need to drop the old page cache only when readers have finished
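
For illustration, the bound calculation might look roughly like this; every 
name below (indexFile, dataFile, summary, IndexSummaryEntry, openEarly, 
replaceReader) is an assumption for the sketch, not the actual patch:

{code}
// Rough sketch only: pick a safe point up to which the partially-written
// sstable can be opened for reads.
long indexFlushed = indexFile.getFlushedLength();
long dataFlushed = dataFile.getFlushedLength();

// keep the second-to-last summary entry whose index position is already on disk
IndexSummaryEntry bound = null, last = null;
for (IndexSummaryEntry e : summary.entries())
{
    if (e.indexPosition >= indexFlushed)
        break;
    bound = last;
    last = e;
}

// also verify the chosen record's data position falls within the flushed data file
if (bound != null && bound.dataPosition <= dataFlushed)
    replaceReader(SSTableReader.openEarly(descriptor, bound));
{code}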





--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Fix typo in DeletionInfo

2014-03-24 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-1.2 35d4b5de8 -> 91130373f


Fix typo in DeletionInfo


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/91130373
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/91130373
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/91130373

Branch: refs/heads/cassandra-1.2
Commit: 91130373f474c8a8d8f5100044507553d2a9b872
Parents: 35d4b5d
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon Mar 24 17:06:01 2014 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Mar 24 17:06:01 2014 +0100

--
 src/java/org/apache/cassandra/db/DeletionInfo.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/91130373/src/java/org/apache/cassandra/db/DeletionInfo.java
--
diff --git a/src/java/org/apache/cassandra/db/DeletionInfo.java 
b/src/java/org/apache/cassandra/db/DeletionInfo.java
index 91af9fd..ce683d1 100644
--- a/src/java/org/apache/cassandra/db/DeletionInfo.java
+++ b/src/java/org/apache/cassandra/db/DeletionInfo.java
@@ -227,7 +227,7 @@ public class DeletionInfo
 public boolean mayModify(DeletionInfo delInfo)
 {
 return topLevel.markedForDeleteAt > delInfo.topLevel.markedForDeleteAt
-|| ranges == null;
+|| ranges != null;
 }
 
 @Override



[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945283#comment-13945283
 ] 

Benedict commented on CASSANDRA-6746:
-

CASSANDRA-6916 is my proposed solution to this problem, which should provide 
pretty much optimal behaviour on this front, regardless of OS, and with fewer 
parameters to tweak.

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four node cluster I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.jsonmetric=interval_op_rateoperation=stress-readsmoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6916) Preemptive opening of compaction result

2014-03-24 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-6916:


Summary: Preemptive opening of compaction result  (was: Preemptive re-open 
of compaction result)

 Preemptive opening of compaction result
 ---

 Key: CASSANDRA-6916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1


 Related to CASSANDRA-6812, but a little simpler: when compacting, we mess 
 quite badly with the page cache. One thing we can do to mitigate this problem 
 is to use the sstable we're writing before we've finished writing it, and to 
 drop the regions from the old sstables from the page cache as soon as the new 
 sstables have them (even if they're only written to the page cache). This 
 should minimise any page cache churn, as the old sstables must be larger than 
 the new sstable, and since both will be in memory, dropping the old sstables 
 is at least as good as dropping the new.
 The approach is quite straightforward. Every X MB written:
 # grab flushed length of index file;
 # grab second to last index summary record, after excluding those that point 
 to positions after the flushed length;
 # open index file, and check that our last record doesn't occur outside of 
 the flushed length of the data file (pretty unlikely)
 # Open the sstable with the calculated upper bound
 Some complications:
 # must keep a running copy of the compression metadata to reopen with
 # we need to be able to replace an sstable with itself but a different lower 
 bound
 # we need to drop the old page cache only when readers have finished



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6916) Preemptive re-open of compaction result

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945281#comment-13945281
 ] 

Benedict commented on CASSANDRA-6916:
-

[Here|https://github.com/belliottsmith/cassandra/tree/6916-preempive-open-compact]
 is a patch that adds this functionality, and also drops support for preheating 
page cache or dropping the page cache on writes, since they are no longer 
likely to provide any benefit above the standard behaviour with this patch.

 Preemptive re-open of compaction result
 ---

 Key: CASSANDRA-6916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1


 Related to CASSANDRA-6812, but a little simpler: when compacting, we mess 
 quite badly with the page cache. One thing we can do to mitigate this problem 
 is to use the sstable we're writing before we've finished writing it, and to 
 drop the regions from the old sstables from the page cache as soon as the new 
 sstables have them (even if they're only written to the page cache). This 
 should minimise any page cache churn, as the old sstables must be larger than 
 the new sstable, and since both will be in memory, dropping the old sstables 
 is at least as good as dropping the new.
 The approach is quite straightforward. Every X MB written:
 # grab flushed length of index file;
 # grab second to last index summary record, after excluding those that point 
 to positions after the flushed length;
 # open index file, and check that our last record doesn't occur outside of 
 the flushed length of the data file (pretty unlikely)
 # Open the sstable with the calculated upper bound
 Some complications:
 # must keep a running copy of the compression metadata to reopen with
 # we need to be able to replace an sstable with itself but a different lower 
 bound
 # we need to drop the old page cache only when readers have finished



--
This message was sent by Atlassian JIRA
(v6.2#6252)


git commit: Update versions and NEWS for 1.2.16 release

2014-03-24 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-1.2 91130373f -> 05fcfa2be


Update versions and NEWS for 1.2.16 release


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05fcfa2b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05fcfa2b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05fcfa2b

Branch: refs/heads/cassandra-1.2
Commit: 05fcfa2be4eba2cd6daeee62d943f48c45f42668
Parents: 9113037
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Mon Mar 24 17:16:15 2014 +0100
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Mon Mar 24 17:16:15 2014 +0100

--
 NEWS.txt | 9 +
 build.xml| 2 +-
 debian/changelog | 6 ++
 3 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/05fcfa2b/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 771536d..f297634 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -14,6 +14,15 @@ restore snapshots created with the previous major version 
using the
 using the provided 'sstableupgrade' tool.
 
 
+1.2.16
+==
+
+Upgrading
+-
+- Nothing specific to this release, but please see 1.2.15 if you are upgrading
+  from a previous version.
+
+
 1.2.15
 ==
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05fcfa2b/build.xml
--
diff --git a/build.xml b/build.xml
index eaf35b5..5db0a6a 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 <property name="debuglevel" value="source,lines,vars"/>
 
 <!-- default version and SCM information -->
-<property name="base.version" value="1.2.15"/>
+<property name="base.version" value="1.2.16"/>
 <property name="scm.connection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.developerConnection" value="scm:git://git.apache.org/cassandra.git"/>
 <property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05fcfa2b/debian/changelog
--
diff --git a/debian/changelog b/debian/changelog
index bb8ecf2..50318c8 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (1.2.16) unstable; urgency=low
+
+  * New release
+
+ -- Sylvain Lebresne slebre...@apache.org  Mon, 24 Mar 2014 17:15:34 +0100
+
 cassandra (1.2.15) unstable; urgency=low
 
   * New release



[jira] [Commented] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945305#comment-13945305
 ] 

Joshua McKenzie commented on CASSANDRA-6907:


Went with protection in the startup .bat file to force addition of -par on 
repair if it's not there, and log to stdout.  Lighter touch, and it isolates 
changes to the Windows environment only.

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-6907:
---

Attachment: CASSANDRA-6907_v1.patch

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-6907:
---

Attachment: CASSANDRA-6907_v2.patch

fixed ticket # listed in comment

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945315#comment-13945315
 ] 

Jonathan Ellis commented on CASSANDRA-6907:
---

Wouldn't addressing it in code be more robust for invocations via JMX as well 
as .bat?

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Git Push Summary

2014-03-24 Thread slebresne
Repository: cassandra
Updated Tags:  refs/tags/1.2.16-tentative [created] 05fcfa2be


[jira] [Commented] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945329#comment-13945329
 ] 

Joshua McKenzie commented on CASSANDRA-6907:


Would be more robust, yes, but it would also represent more invasive changes 
for a temporary OS-specific workaround.  If we want to err on the side of 
robustness, that's a trivial change.

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6311) Add CqlRecordReader to take advantage of native CQL pagination

2014-03-24 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945345#comment-13945345
 ] 

Alex Liu commented on CASSANDRA-6311:
-

The key is a Long, which is the row count number. The value is a Row, which is 
backed by ArrayBackedRow, a protected class. We need to make it a public class.
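
As a quick illustration of how a Hadoop job could consume that key/value pair 
once Row is public (a hedged sketch; the 'id' column name and the exact 
generics are assumptions, not the actual reader contract):

{code}
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import com.datastax.driver.core.Row;

// Hedged sketch: assumes CqlRecordReader emits (Long rowCount, Row row)
// as described above.
public class RowCountMapper extends Mapper<Long, Row, Text, LongWritable>
{
    @Override
    protected void map(Long count, Row row, Context context)
            throws IOException, InterruptedException
    {
        // 'id' is a hypothetical column name used only for illustration
        context.write(new Text(row.getString("id")), new LongWritable(count));
    }
}
{code}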

 Add CqlRecordReader to take advantage of native CQL pagination
 --

 Key: CASSANDRA-6311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6311
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Alex Liu
Assignee: Alex Liu
 Fix For: 2.0.7

 Attachments: 6311-v10.txt, 6311-v3-2.0-branch.txt, 6311-v4.txt, 
 6311-v5-2.0-branch.txt, 6311-v6-2.0-branch.txt, 6311-v7.txt, 6311-v8.txt, 
 6311-v9.txt, 6331-2.0-branch.txt, 6331-v2-2.0-branch.txt


 The latest CQL pagination is done and should be more efficient, so we need to 
 update CqlPagingRecordReader to use it instead of the custom thrift paging.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945357#comment-13945357
 ] 

Jonathan Ellis commented on CASSANDRA-6907:
---

Let's go ahead and do that; we can just leave it out of 3.0 when we merge, so 
there's no need to worry about it outliving its usefulness.

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-6907:
---

Attachment: CASSANDRA-6907_v3.patch

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch, 
 CASSANDRA-6907_v3.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6914) Map element is not allowed in CAS condition with DELETE/UPDATE query

2014-03-24 Thread Mikhail Stepura (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Stepura updated CASSANDRA-6914:
---

Description: 
{code}
CREATE TABLE test (id int, data map<text,text>, PRIMARY KEY(id));

INSERT INTO test (id, data) VALUES (1,{'a':'1'});

DELETE FROM test WHERE id=1 IF data['a']=null;
Bad Request: line 1:40 missing EOF at '='

UPDATE test SET data['b']='2' WHERE id=1 IF data['a']='1';
Bad Request: line 1:53 missing EOF at '='
{code}
These queries were successfully executed with Cassandra 2.0.5, but don't work 
in the 2.0.6 release

  was:
CREATE TABLE test (id int, data map<text,text>, PRIMARY KEY(id));

INSERT INTO test (id, data) VALUES (1,{'a':'1'});

DELETE FROM test WHERE id=1 IF data['a']=null;
Bad Request: line 1:40 missing EOF at '='

UPDATE test SET data['b']='2' WHERE id=1 IF data['a']='1';
Bad Request: line 1:53 missing EOF at '='

These queries were successfully executed with Cassandra 2.0.5, but don't work 
in the 2.0.6 release


 Map element is not allowed in CAS condition with DELETE/UPDATE query
 

 Key: CASSANDRA-6914
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6914
 Project: Cassandra
  Issue Type: Bug
Reporter: Dmitriy Ukhlov
Assignee: Sylvain Lebresne
 Fix For: 2.0.7


 {code}
 CREATE TABLE test (id int, data map<text,text>, PRIMARY KEY(id));
 INSERT INTO test (id, data) VALUES (1,{'a':'1'});
 DELETE FROM test WHERE id=1 IF data['a']=null;
 Bad Request: line 1:40 missing EOF at '='
 UPDATE test SET data['b']='2' WHERE id=1 IF data['a']='1';
 Bad Request: line 1:53 missing EOF at '='
 {code}
 These queries were successfully executed with Cassandra 2.0.5, but don't work 
 in the 2.0.6 release



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6907) ignore snapshot repair flag on Windows

2014-03-24 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945419#comment-13945419
 ] 

Joshua McKenzie commented on CASSANDRA-6907:


new patch attached.

 ignore snapshot repair flag on Windows
 --

 Key: CASSANDRA-6907
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6907
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Joshua McKenzie
 Fix For: 2.0.7

 Attachments: CASSANDRA-6907_v1.patch, CASSANDRA-6907_v2.patch, 
 CASSANDRA-6907_v3.patch


 Per discussion in CASSANDRA-4050, we should ignore the snapshot repair flag 
 on windows, and log a warning while proceeding to do non-snapshot repair.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945471#comment-13945471
 ] 

Benedict commented on CASSANDRA-6696:
-

It seems to me it _might_ also be simpler, once this change is made, to just 
split the range of the memtable and call subMap(lb, ub) and spawn a separate 
flush writer for each range, which might avoid the need for an 
SSTableWriterInterface... Might also be a good time to introduce a separate 
flush executor for each disk.
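
Roughly, the subMap idea might look like this, assuming hypothetical names 
throughout (getSortedRows, boundaries, flushExecutors, writeSSTableForSlice 
are placeholders for the sketch):

{code}
// Sketch only: slice the memtable's sorted map at the per-disk boundary keys
// and hand each slice to that disk's own flush executor.
ConcurrentNavigableMap<DecoratedKey, ColumnFamily> rows = memtable.getSortedRows();
DecoratedKey lb = rows.isEmpty() ? null : rows.firstKey();
for (int disk = 0; disk < boundaries.length && lb != null; disk++)
{
    final NavigableMap<DecoratedKey, ColumnFamily> slice =
        rows.subMap(lb, true, boundaries[disk], true);
    final int d = disk;
    flushExecutors[disk].submit(new Runnable()
    {
        public void run() { writeSSTableForSlice(d, slice); }  // hypothetical helper
    });
    lb = rows.higherKey(boundaries[disk]); // next slice starts past this boundary
}
{code}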

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A, B and C, with RF=3 and GC grace=10 days. 
 row=sankalp col=sankalp was written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than GC grace, it got compacted away in nodes A 
 and B together with the actual data. So there is no trace of this row column 
 in nodes A and B.
 Now in node C, say the original data is in drive1 and the tombstone is in 
 drive2. Compaction has not yet reclaimed the data and tombstone. 
 Drive2 becomes corrupt and is replaced with a new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945452#comment-13945452
 ] 

Benedict commented on CASSANDRA-6696:
-

Had a quick glance, and have one initial thought: it might be worth forcing 
compaction to always work on one disk (i.e. always select files from one disk 
for compaction). That would simplify things slightly, and it seems likely to be 
the most efficient use of IO; as it stands, you could also have a scenario 
where one file each is selected from different disks, which would result in a 
perpetual compaction loop.



 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A, B and C, with RF=3 and GC grace=10 days. 
 row=sankalp col=sankalp was written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than GC grace, it got compacted away in nodes A 
 and B together with the actual data. So there is no trace of this row column 
 in nodes A and B.
 Now in node C, say the original data is in drive1 and the tombstone is in 
 drive2. Compaction has not yet reclaimed the data and tombstone. 
 Drive2 becomes corrupt and is replaced with a new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945515#comment-13945515
 ] 

Robert Coli commented on CASSANDRA-6541:


{quote}It's a HotSpot regression that we're working around, not a Cassandra 
bug.{quote}
Yes, so the statement "this HotSpot regression is not worked around in all 
extant versions of Cassandra since the dawn of time" is correct. Thanks for the 
clarification!

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's HotSpot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Robert Coli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Coli updated CASSANDRA-6541:
---

Since Version: 0.3

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's HotSpot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945516#comment-13945516
 ] 

Brandon Williams commented on CASSANDRA-6541:
-

Well, not really, since the problematic JVM versions didn't exist back then.

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's HotSpot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945527#comment-13945527
 ] 

Benedict commented on CASSANDRA-6541:
-

bq. Yes, so the statement "this HotSpot regression is not worked around in all 
extant versions of Cassandra since the dawn of time" is correct. Thanks for the 
clarification!

No, it isn't, since the HotSpot bug did not exist since the dawn of time. As 
such the from-version is ill-defined; probably the most sensible definition 
requires determining the from-version for HotSpot, which we have yet to manage, 
and aligning that with C* releases.

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's HotSpot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6875) CQL3: select multiple CQL rows in a single partition using IN

2014-03-24 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6875:


Fix Version/s: (was: 2.1)
   2.1 beta2

 CQL3: select multiple CQL rows in a single partition using IN
 -

 Key: CASSANDRA-6875
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6875
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Nicolas Favre-Felix
Assignee: Tyler Hobbs
Priority: Minor
 Fix For: 2.1 beta2


 In the spirit of CASSANDRA-4851 and to bring CQL to parity with Thrift, it is 
 important to support reading several distinct CQL rows from a given partition 
 using a distinct set of coordinates for these rows within the partition.
 CASSANDRA-4851 introduced a range scan over the multi-dimensional space of 
 clustering keys. We also need to support a multi-get of CQL rows, 
 potentially using the IN keyword to define a set of clustering keys to 
 fetch at once.
 (reusing the same example:)
 Consider the following table:
 {code}
 CREATE TABLE test (
   k int,
   c1 int,
   c2 int,
   PRIMARY KEY (k, c1, c2)
 );
 {code}
 with the following data:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  0 |  1
  0 |  1 |  0
  0 |  1 |  1
 {code}
 We can fetch a single row or a range of rows, but not a set of them:
 {code}
  SELECT * FROM test WHERE k = 0 AND (c1, c2) IN ((0, 0), (1,1)) ;
 Bad Request: line 1:54 missing EOF at ','
 {code}
 Supporting this syntax would return:
 {code}
  k | c1 | c2
 ---+----+----
  0 |  0 |  0
  0 |  1 |  1
 {code}
 Being able to fetch these two CQL rows in a single read is important to 
 maintain partition-level isolation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6918) Compaction Assert: Incorrect Row Data Size

2014-03-24 Thread Alexander Goodrich (JIRA)
Alexander Goodrich created CASSANDRA-6918:
-

 Summary: Compaction Assert: Incorrect Row Data Size
 Key: CASSANDRA-6918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6918
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 11 node Linux Cassandra 1.2.15 cluster, each node 
configured as follows:
2P Intel Xeon CPU X5660 @ 2.8 GHz (12 cores, 24 threads total)
148 GB RAM
CentOS release 6.4 (Final)
2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 
x86_64 GNU/Linux
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)

Node configuration:
Default cassandra.yaml settings for the most part with the following exceptions:
rpc_server_type: hsha


Reporter: Alexander Goodrich
 Fix For: 1.2.16


I have four tables in a schema with Replication Factor: 6 (previously we set 
this to 3, but when we added more nodes we figured adding more replication to 
improve read time would help; this might have aggravated the issue).

create table table_value_one (
id timeuuid PRIMARY KEY,
value_1 counter
);

create table table_value_two (
id timeuuid PRIMARY KEY,
value_2 counter
);

create table table_position_lookup (
value_1 bigint,
value_2 bigint,
id timeuuid,
PRIMARY KEY (id)
) WITH compaction={'class': 'LeveledCompactionStrategy'};

create table sorted_table (
row_key_index text,
range bigint,
sorted_value bigint,
id timeuuid,
extra_data list<bigint>,
PRIMARY KEY ((row_key_index, range), sorted_value, id)
) WITH CLUSTERING ORDER BY (sorted_value DESC) AND
  compaction={'class': 'LeveledCompactionStrategy'};

The application creates an object, and stores it in sorted_table based on a 
value position - for example, an object has a value_1 of 5500, and a value_2 of 
4300.

There are rows which represent indices by which I can sort items based on these 
values in descending order. If I wish to see items with the highest # of 
value_1, I can create an index that stores them like so:

row_key_index = 'highest_value_1s'

Additionally, we shard each row by bucket ranges - the range is simply value_1 
or value_2 rounded down to the nearest 1000. For example, our object above 
would be found in row_key_index = 'highest_value_1s' with range 5000, and also 
in row_key_index = 'highest_value_2s' with range 4000.
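
In other words, the bucket is just the value rounded down to the nearest 
thousand; a hedged one-liner (illustrative helper, not code from the report) 
to make that explicit:

{code}
// Illustrative helper only: compute the shard bucket for a value.
static long bucket(long value)
{
    return (value / 1000L) * 1000L;  // e.g. 5500 -> 5000, 4300 -> 4000
}
{code}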

The true values of this object are stored in two counter tables, 
table_value_one and table_value_two. The current indexed position is stored in 
table_position_lookup.

We allow the application to modify value_one and value_two in the counter table 
indiscriminately. If we know the current values for these are dirty, we wait a 
tuned amount of time before we update the position in the sorted_table index. 
This creates 2 delete operations, and 2 write operations on the same table.

The issue is when we expand the number of write/delete operations on 
sorted_table, we see the following assert in the system log:

ERROR [CompactionExecutor:169] 2014-03-24 08:07:12,871 CassandraDaemon.java 
(line 191) Exception in thread Thread[CompactionExecutor:169,1,main]
java.lang.AssertionError: incorrect row data size 77705872 written to 
/var/lib/cassandra/data/loadtest_1/sorted_table/loadtest_1-sorted_table-tmp-ic-165-Data.db;
 correct is 77800512
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

Each object creates approximately 500 unique row keys in sorted_table, and it 
possesses an extra_data field containing approximately 15 different bigint 
values.

Previously, our application was running Cassandra 1.2.10 and we did not see the 
assert when our sorted_table did not have the extra_data list<bigint>. Also, we 
were writing around 200 unique row keys, only containing the ID column.

We tried both leveled compaction 

[jira] [Commented] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945567#comment-13945567
 ] 

Jeremiah Jordan commented on CASSANDRA-6541:


A C* "from version" really has no meaning here.  But yes, if you run C* 0.3 (if 
that did JMX monitoring) with 1.6_45 you will hit this issue.  Unless of course 
you add the given setting to your cassandra-env.sh, which can be done without 
upgrading your C*.

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's HotSpot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS=$JVM_OPTS -XX:+CMSClassUnloadingEnabled
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6917) enum data type

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945592#comment-13945592
 ] 

Jonathan Ellis edited comment on CASSANDRA-6917 at 3/24/14 7:35 PM:


see also CASSANDRA-4175


was (Author: jbellis):
see also CASSNADRA-4175

 enum data type
 --

 Key: CASSANDRA-6917
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6917
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor

 It seems like it would be useful to support an enum data type that 
 automatically converts string data from the user into a fixed-width data type 
 with guaranteed uniqueness across the cluster. This data would be replicated 
 to all nodes for lookup, but ideally would use only the keyspace RF to 
 determine nodes for coordinating quorum writes/consistency.
 This would not only permit improved local disk and inter-node network IO for 
 symbology information (e.g. stock tickers, ISINs, etc), but also potentially 
 for column identifiers, which are currently stored as their full string 
 representation.
 It should be possible then with later updates to propagate the enum map 
 (lazily) to clients through the native protocol, reducing network IO further.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6917) enum data type

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945592#comment-13945592
 ] 

Jonathan Ellis commented on CASSANDRA-6917:
---

see also CASSANDRA-4175

 enum data type
 --

 Key: CASSANDRA-6917
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6917
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor

 It seems like it would be useful to support an enum data type that 
 automatically converts string data from the user into a fixed-width data type 
 with guaranteed uniqueness across the cluster. This data would be replicated 
 to all nodes for lookup, but ideally would use only the keyspace RF to 
 determine nodes for coordinating quorum writes/consistency.
 This would not only permit improved local disk and inter-node network IO for 
 symbology information (e.g. stock tickers, ISINs, etc), but also potentially 
 for column identifiers, which are currently stored as their full string 
 representation.
 It should be possible then with later updates to propagate the enum map 
 (lazily) to clients through the native protocol, reducing network IO further.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6917) enum data type

2014-03-24 Thread Benedict (JIRA)
Benedict created CASSANDRA-6917:
---

 Summary: enum data type
 Key: CASSANDRA-6917
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6917
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Priority: Minor


It seems like it would be useful to support an enum data type that 
automatically converts string data from the user into a fixed-width data type 
with guaranteed uniqueness across the cluster. This data would be replicated to 
all nodes for lookup, but ideally would use only the keyspace RF to determine 
nodes for coordinating quorum writes/consistency.

This would not only permit improved local disk and inter-node network IO for 
symbology information (e.g. stock tickers, ISINs, etc), but also potentially 
for column identifiers, which are currently stored as their full string 
representation.

It should be possible then with later updates to propagate the enum map 
(lazily) to clients through the native protocol, reducing network IO further.
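
None of this exists yet, so purely as an illustration of the mapping such a type 
implies - a minimal, single-node sketch with invented names; in Cassandra itself 
the id assignment would need cluster-wide coordination:

{noformat}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical single-node stand-in (all names invented): each distinct
// string is assigned a fixed-width id once, and the map is kept around for
// reverse lookup on reads.
public final class EnumRegistry
{
    private final Map<String, Integer> idsByName = new ConcurrentHashMap<>();
    private final Map<Integer, String> namesById = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Write path: convert the user-supplied string to its fixed-width id,
    // registering it on first sight.
    public int toId(String name)
    {
        return idsByName.computeIfAbsent(name, n -> {
            int id = nextId.getAndIncrement();
            namesById.put(id, n);
            return id;
        });
    }

    // Read path: translate a stored id back to the original string.
    public String toName(int id)
    {
        return namesById.get(id);
    }
}
{noformat}

So 'AAPL' might become id 0 and 'GOOG' id 1: the fixed-width ids, rather than 
the full strings, are what would hit disk and the wire.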



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6918) Compaction Assert: Incorrect Row Data Size

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945597#comment-13945597
 ] 

Jonathan Ellis commented on CASSANDRA-6918:
---

[~iamaleksey] is this something that counters++ will fix or do you think it is 
more general than counters?

 Compaction Assert: Incorrect Row Data Size
 --

 Key: CASSANDRA-6918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6918
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 11 node Linux Cassandra 1.2.15 cluster, each node 
 configured as follows:
 2P Intel Xeon CPU X5660 @ 2.8 GHz (12 cores, 24 threads total)
 148 GB RAM
 CentOS release 6.4 (Final)
 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 
 x86_64 GNU/Linux
 Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
 Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
 Node configuration:
 Default cassandra.yaml settings for the most part with the following 
 exceptions:
 rpc_server_type: hsha
Reporter: Alexander Goodrich
 Fix For: 1.2.16


 I have four tables in a schema with Replication Factor: 6 (previously we set 
 this to 3, but when we added more nodes we figured adding more replication to 
 improve read time would help; this might have aggravated the issue).
 create table table_value_one (
 id timeuuid PRIMARY KEY,
 value_1 counter
 );
 
 create table table_value_two (
 id timeuuid PRIMARY KEY,
 value_2 counter
 );
 create table table_position_lookup (
 value_1 bigint,
 value_2 bigint,
 id timeuuid,
 PRIMARY KEY (id)
 ) WITH compaction={'class': 'LeveledCompactionStrategy'};
 create table sorted_table (
 row_key_index text,
 range bigint,
 sorted_value bigint,
 id timeuuid,
 extra_data list<bigint>, 
 PRIMARY KEY ((row_key_index, range), sorted_value, id)
 ) WITH CLUSTERING ORDER BY (sorted_value DESC) AND
   compaction={'class': 'LeveledCompactionStrategy'};
 The application creates an object, and stores it in sorted_table based on a 
 value position - for example, an object has a value_1 of 5500, and a value_2 
 of 4300.
 There are rows which represent indices by which I can sort items based on 
 these values in descending order. If I wish to see items with the highest # 
 of value_1, I can create an index that stores them like so:
 row_key_index = 'highest_value_1s'
 Additionally, we shard each row by bucket ranges - simply the value_1 or 
 value_2 rounded down to the nearest 1000. For example, our object above would 
 be found in row_key_index = 'highest_value_1s' with range 5000, and also in 
 row_key_index = 'highest_value_2s' with range 4000.
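
For concreteness, a small sketch of the bucketing described above - names and 
code hypothetical, based only on this report's description:

{noformat}
// Hypothetical illustration of the sharding above: a value is bucketed to
// the nearest multiple of 1000 below it, and that bucket becomes the
// 'range' component of the sorted_table partition key.
public final class BucketRanges
{
    static long bucket(long value)
    {
        return (value / 1000) * 1000; // 5500 -> 5000, 4300 -> 4000
    }

    public static void main(String[] args)
    {
        // The object above (value_1 = 5500, value_2 = 4300) lands in:
        System.out.println("('highest_value_1s', " + bucket(5500) + ")"); // range 5000
        System.out.println("('highest_value_2s', " + bucket(4300) + ")"); // range 4000
    }
}
{noformat}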
 The true values of this object are stored in two counter tables, 
 table_value_one and table_value_two. The current indexed position is stored 
 in table_position_lookup.
 We allow the application to modify value_one and value_two in the counter 
 table indiscriminately. If we know the current values for these are dirty, we 
 wait a tuned amount of time before we update the position in the sorted_table 
 index. This creates 2 delete operations, and 2 write operations on the same 
 table.
 The issue is when we expand the number of write/delete operations on 
 sorted_table, we see the following assert in the system log:
 ERROR [CompactionExecutor:169] 2014-03-24 08:07:12,871 CassandraDaemon.java 
 (line 191) Exception in thread Thread[CompactionExecutor:169,1,main]
 java.lang.AssertionError: incorrect row data size 77705872 written to 
 /var/lib/cassandra/data/loadtest_1/sorted_table/loadtest_1-sorted_table-tmp-ic-165-Data.db;
  correct is 77800512
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
 Each object creates approximately 500 unique 

[jira] [Created] (CASSANDRA-6919) Use OpOrder to guard sstable references for reads, instead of acquiring/releasing references

2014-03-24 Thread Benedict (JIRA)
Benedict created CASSANDRA-6919:
---

 Summary: Use OpOrder to guard sstable references for reads, 
instead of acquiring/releasing references
 Key: CASSANDRA-6919
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6919
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 2.1 beta2


To slightly improve CASSANDRA-6916, and because it's a bit of a simplification 
anyway, we should move to ensuring sstable resources remain around during reads 
by guarding them with an OpOrder (which is also being introduced for 
CASSANDRA-6694) instead of using markReferenced()/release.

Note this does not eliminate markReferenced, as for long running processes such 
as compaction it makes sense to have an independent mechanism, because these 
long running processes would prevent all resource cleanup for their duration 
rather than just the resources they're using. 

All this does is clean up and slightly optimise the read path, whilst giving 
better guarantees about resource cleanup (e.g. page cache dropping for old 
sstables that may have been replaced multiple times since the reader was 
created, where today we drop pages we don't realise are still in use). In real 
terms it should be very rare for such a reader to outlive multiple 
replacements, and this is only a performance issue, not a matter of 
correctness, but it's nice to be absolutely certain anyway.
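
To make the intended pattern concrete - a sketch only, assuming the OpOrder 
class being introduced by CASSANDRA-6694, whose final API may differ:

{noformat}
import org.apache.cassandra.utils.concurrent.OpOrder;

// Sketch of the proposed read-path guard (API approximate). Instead of
// markReferenced()/release() per sstable, a read runs inside a group, and
// cleanup waits on a barrier.
public class GuardedReads
{
    private final OpOrder readOrder = new OpOrder();

    // Read path: the open group keeps every resource visible at its start
    // alive until the group closes; no per-sstable reference counting.
    public void read()
    {
        try (OpOrder.Group op = readOrder.start())
        {
            // ... read from the sstables in the current view ...
        }
    }

    // Cleanup path, e.g. after compaction replaces sstables: wait for all
    // reads that might still see the old sstables, then release them.
    public void releaseOldSSTables()
    {
        OpOrder.Barrier barrier = readOrder.newBarrier();
        barrier.issue();
        barrier.await(); // all groups started before the barrier have closed
        // ... now safe to release the replaced sstables' resources ...
    }
}
{noformat}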




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.

2014-03-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6541:
--

Since Version:   (was: 0.3)

 New versions of Hotspot create new Class objects on every JMX connection 
 causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
 -

 Key: CASSANDRA-6541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541
 Project: Cassandra
  Issue Type: Bug
  Components: Config
Reporter: jonathan lacefield
Assignee: Brandon Williams
Priority: Minor
 Fix For: 1.2.16, 2.0.6, 2.1 beta2


 Newer versions of Oracle's Hotspot JVM, post 6u43 (maybe earlier) and 7u25 
 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly 
 fills up over time until an OOM or a full GC event occurs, specifically when 
 CMS is leveraged.  Adding:
 {noformat}
 JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
 {noformat}
 to the options in cassandra-env.sh alleviates the problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6918) Compaction Assert: Incorrect Row Data Size

2014-03-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945606#comment-13945606
 ] 

Aleksey Yeschenko commented on CASSANDRA-6918:
--

[~jbellis] I could be reading it wrong, but it seems like their issue is with 
the `sorted_table` table, and that one is counter-less.

 Compaction Assert: Incorrect Row Data Size
 --

 Key: CASSANDRA-6918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6918
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 11 node Linux Cassandra 1.2.15 cluster, each node 
 configured as follows:
 2P Intel Xeon CPU X5660 @ 2.8 GHz (12 cores, 24 threads total)
 148 GB RAM
 CentOS release 6.4 (Final)
 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 
 x86_64 GNU/Linux
 Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
 Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
 Node configuration:
 Default cassandra.yaml settings for the most part with the following 
 exceptions:
 rpc_server_type: hsha
Reporter: Alexander Goodrich
 Fix For: 1.2.16


 I have four tables in a schema with Replication Factor: 6 (previously we set 
 this to 3, but when we added more nodes we figured adding more replication to 
 improve read time would help; this might have aggravated the issue).
 create table table_value_one (
 id timeuuid PRIMARY KEY,
 value_1 counter
 );
 
 create table table_value_two (
 id timeuuid PRIMARY KEY,
 value_2 counter
 );
 create table table_position_lookup (
 value_1 bigint,
 value_2 bigint,
 id timeuuid,
 PRIMARY KEY (id)
 ) WITH compaction={'class': 'LeveledCompactionStrategy'};
 create table sorted_table (
 row_key_index text,
 range bigint,
 sorted_value bigint,
 id timeuuid,
 extra_data list<bigint>, 
 PRIMARY KEY ((row_key_index, range), sorted_value, id)
 ) WITH CLUSTERING ORDER BY (sorted_value DESC) AND
   compaction={'class': 'LeveledCompactionStrategy'};
 The application creates an object, and stores it in sorted_table based on a 
 value position - for example, an object has a value_1 of 5500, and a value_2 
 of 4300.
 There are rows which represent indices by which I can sort items based on 
 these values in descending order. If I wish to see items with the highest # 
 of value_1, I can create an index that stores them like so:
 row_key_index = 'highest_value_1s'
 Additionally, we shard each row by bucket ranges - simply the value_1 or 
 value_2 rounded down to the nearest 1000. For example, our object above would 
 be found in row_key_index = 'highest_value_1s' with range 5000, and also in 
 row_key_index = 'highest_value_2s' with range 4000.
 The true values of this object are stored in two counter tables, 
 table_value_one and table_value_two. The current indexed position is stored 
 in table_position_lookup.
 We allow the application to modify value_one and value_two in the counter 
 table indiscriminately. If we know the current values for these are dirty, we 
 wait a tuned amount of time before we update the position in the sorted_table 
 index. This creates 2 delete operations, and 2 write operations on the same 
 table.
 The issue is when we expand the number of write/delete operations on 
 sorted_table, we see the following assert in the system log:
 ERROR [CompactionExecutor:169] 2014-03-24 08:07:12,871 CassandraDaemon.java 
 (line 191) Exception in thread Thread[CompactionExecutor:169,1,main]
 java.lang.AssertionError: incorrect row data size 77705872 written to 
 /var/lib/cassandra/data/loadtest_1/sorted_table/loadtest_1-sorted_table-tmp-ic-165-Data.db;
  correct is 77800512
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
 Each object 

[jira] [Commented] (CASSANDRA-6746) Reads have a slow ramp up in speed

2014-03-24 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945607#comment-13945607
 ] 

Pavel Yaskevich commented on CASSANDRA-6746:


bq. In practice, moving the WILLNEED into the getSegment() call is dangerous as 
the segment is used past the initial 64Kb, and if we rely on ourselves only for 
read-ahead this could result in very substandard performance for larger rows. 
We also probably want to only WILLNEED the actual size of the buffer we expect 
to read for compressed files.

Yes, this is only a PoC to see if the scheme works for platters. Just a couple 
of things: for optimal performance we need information from the index about the 
size of the row, so we can mark SEQUENTIAL a) the whole row, if the row is 
smaller than the indexing threshold, or b) portions of the row on the index 
boundaries. The original one-page WILLNEED (very conservative) is there to make 
sure a read can quickly grab the first portion of the buffer while the extended 
read-ahead prefetches everything else. This still works for big rows because we 
are forced to read the header of the row first (the key at least); then, when 
we seek() to the position indicated by the column index, we want to hint that 
we are going to read that portion of the row, so large rows suffer more from 
the fact that we have to over-buffer and then WILLNEED. I wish we could have a 
useful mmap'ed buffer implementation, so that madvise, analogous to the fadvise 
we do now, would no longer be required...
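
For reference, the hints under discussion are the posix_fadvise ones; below is a 
minimal JNA sketch - not the actual patch - of issuing WILLNEED and SEQUENTIAL, 
using the Linux advice constants and a libc binding:

{noformat}
import com.sun.jna.Native;

// Minimal JNA sketch (not the actual patch): bind posix_fadvise from libc
// and expose the two hints discussed above.
public final class FadviseHints
{
    static
    {
        Native.register("c"); // binds the native method below to libc
    }

    // Linux advice constants.
    private static final int POSIX_FADV_SEQUENTIAL = 2;
    private static final int POSIX_FADV_WILLNEED   = 3;

    // int posix_fadvise(int fd, off_t offset, off_t len, int advice)
    private static native int posix_fadvise(int fd, long offset, long len, int advice);

    // Hint that this range will be needed immediately (prefetch it).
    public static int willNeed(int fd, long offset, long len)
    {
        return posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
    }

    // Hint sequential access, so kernel read-ahead prefetches the rest of a
    // row (or an index-bounded portion of it).
    public static int sequential(int fd, long offset, long len)
    {
        return posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);
    }
}
{noformat}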

There is a way to solve the cold-cache problem for the parts of the original 
SSTables that had been read before; I did some work with mincore() previously 
and can revisit it if needed. The problem we are trying to solve by dropping 
the cache for memtables and compacted SSTables (on memory-restricted and/or 
slow I/O systems) is that keeping the page cache for the old files creates more 
jitter and slows down warmup of the newly created SSTable. 

 

 Reads have a slow ramp up in speed
 --

 Key: CASSANDRA-6746
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
  Labels: performance
 Fix For: 2.1 beta2

 Attachments: 2.1_vs_2.0_read.png, 6746-buffered-io-tweaks.png, 
 6746-patched.png, 6746.blockdev_setra.full.png, 
 6746.blockdev_setra.zoomed.png, 6746.buffered_io_tweaks.logs.tar.gz, 
 6746.buffered_io_tweaks.write-flush-compact-mixed.png, 
 6746.buffered_io_tweaks.write-read-flush-compact.png, 6746.txt, 
 buffered-io-tweaks.patch, cassandra-2.0-bdplab-trial-fincore.tar.bz2, 
 cassandra-2.1-bdplab-trial-fincore.tar.bz2


 On a physical four-node cluster I am doing a big write and then a big read. 
 The read takes a long time to ramp up to respectable speeds.
 !2.1_vs_2.0_read.png!
 [See data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6918) Compaction Assert: Incorrect Row Data Size

2014-03-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945623#comment-13945623
 ] 

Jonathan Ellis commented on CASSANDRA-6918:
---

[~agoodrich] [~redpriest] does it log "Compacting large row" before the 
exception?

 Compaction Assert: Incorrect Row Data Size
 --

 Key: CASSANDRA-6918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6918
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 11 node Linux Cassandra 1.2.15 cluster, each node 
 configured as follows:
 2P Intel Xeon CPU X5660 @ 2.8 GHz (12 cores, 24 threads total)
 148 GB RAM
 CentOS release 6.4 (Final)
 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 
 x86_64 GNU/Linux
 Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
 Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
 Node configuration:
 Default cassandra.yaml settings for the most part with the following 
 exceptions:
 rpc_server_type: hsha
Reporter: Alexander Goodrich
 Fix For: 1.2.16


 I have four tables in a schema with Replication Factor: 6 (previously we set 
 this to 3, but when we added more nodes we figured adding more replication to 
 improve read time would help; this might have aggravated the issue).
 create table table_value_one (
 id timeuuid PRIMARY KEY,
 value_1 counter
 );
 
 create table table_value_two (
 id timeuuid PRIMARY KEY,
 value_2 counter
 );
 create table table_position_lookup (
 value_1 bigint,
 value_2 bigint,
 id timeuuid,
 PRIMARY KEY (id)
 ) WITH compaction={'class': 'LeveledCompactionStrategy'};
 create table sorted_table (
 row_key_index text,
 range bigint,
 sorted_value bigint,
 id timeuuid,
 extra_data list<bigint>, 
 PRIMARY KEY ((row_key_index, range), sorted_value, id)
 ) WITH CLUSTERING ORDER BY (sorted_value DESC) AND
   compaction={'class': 'LeveledCompactionStrategy'};
 The application creates an object, and stores it in sorted_table based on a 
 value position - for example, an object has a value_1 of 5500, and a value_2 
 of 4300.
 There are rows which represent indices by which I can sort items based on 
 these values in descending order. If I wish to see items with the highest # 
 of value_1, I can create an index that stores them like so:
 row_key_index = 'highest_value_1s'
 Additionally, we shard each row by bucket ranges - simply the value_1 or 
 value_2 rounded down to the nearest 1000. For example, our object above would 
 be found in row_key_index = 'highest_value_1s' with range 5000, and also in 
 row_key_index = 'highest_value_2s' with range 4000.
 The true values of this object are stored in two counter tables, 
 table_value_one and table_value_two. The current indexed position is stored 
 in table_position_lookup.
 We allow the application to modify value_one and value_two in the counter 
 table indiscriminately. If we know the current values for these are dirty, we 
 wait a tuned amount of time before we update the position in the sorted_table 
 index. This creates 2 delete operations, and 2 write operations on the same 
 table.
 The issue is when we expand the number of write/delete operations on 
 sorted_table, we see the following assert in the system log:
 ERROR [CompactionExecutor:169] 2014-03-24 08:07:12,871 CassandraDaemon.java 
 (line 191) Exception in thread Thread[CompactionExecutor:169,1,main]
 java.lang.AssertionError: incorrect row data size 77705872 written to 
 /var/lib/cassandra/data/loadtest_1/sorted_table/loadtest_1-sorted_table-tmp-ic-165-Data.db;
  correct is 77800512
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
 at 
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at 
 org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
 at 
 org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
 at 
 org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
 Each object creates approximately 500 unique row keys in sorted_table, 

[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-03-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945631#comment-13945631
 ] 

Benedict commented on CASSANDRA-6696:
-

Last thoughts for the day: the only major downside to this approach is that we 
are now guaranteeing no better than single-disk performance for all operations 
on a given partition. So if there are particularly large and fragmented 
partitions, they could see read performance decline notably. One possible 
solution would be to split by clustering part (if any) instead of partition 
key, but determine the clustering part range split as a function of the 
partition hash, so that the distribution of data as a whole is still random 
(i.e. each partition has a different clustering distribution across the disks). 
This would make the initial flush more complex, and might require more merging 
on reads, but compaction could still be easily constrained to one disk. This is 
just a poorly formed thought I'm throwing out there for consideration, and 
possibly outside the scope of this ticket.
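
A crude sketch of that thought, with all names invented - the point being only 
that the clustering-range boundaries rotate per partition, so the overall 
distribution of data stays random:

{noformat}
// Hypothetical: each disk owns a contiguous range of the clustering hash
// space, rotated by the partition hash so that, across many partitions,
// every disk still receives a random mix of data.
public final class DiskSplit
{
    static int diskForRow(long partitionHash, long clusteringHash, int nDisks)
    {
        long rotated = clusteringHash + partitionHash; // per-partition rotation; overflow wraps harmlessly
        // Position of the rotated hash within the signed 64-bit range, in [0, 1).
        double position = (rotated / 2.0 - Long.MIN_VALUE / 2.0) / Math.pow(2.0, 63);
        return Math.min(nDisks - 1, (int) (position * nDisks));
    }
}
{noformat}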

Either way, I'm not certain that splitting ranges based on disk size is such a 
great idea. As a follow-on ticket it might be sensible to permit two categories 
of disk: archive disks for slow and cold data, and live disks for faster data. 
Splitting by capacity seems likely to create undesirable performance 
characteristics, as two similarly performant disks with different capacities 
would lead to worse performance for the data residing on the larger disk.

On the whole I'm +1 this change anyway, the more I think about it. I had been 
vaguely considering something along these lines to optimise flush performance, 
but it seems we can get this for free along with improving correctness, which 
is great.

 Drive replacement in JBOD can cause data to reappear. 
 --

 Key: CASSANDRA-6696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
 Fix For: 3.0


 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
 empty one and repair is run. 
 This can cause deleted data to come back in some cases. This is also true for 
 corrupt sstables, where we delete the corrupt sstable and run repair. 
 Here is an example:
 Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
 row=sankalp col=sankalp is written 20 days back and successfully went to all 
 three nodes. 
 Then a delete/tombstone was written successfully for the same row column 15 
 days back. 
 Since this tombstone is older than gc grace, it got compacted away in nodes A 
 and B together with the actual data, so there is no trace of this row/column 
 in nodes A and B.
 Now in node C, say the original data is in drive1 and tombstone is in drive2. 
 Compaction has not yet reclaimed the data and tombstone.  
 Drive2 becomes corrupt and was replaced with new empty drive. 
 Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
 has come back to life. 
 Now after replacing the drive we run repair. This data will be propagated to 
 all nodes. 
 Note: This is still a problem even if we run repair every gc grace. 
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6918) Compaction Assert: Incorrect Row Data Size

2014-03-24 Thread Alexander Goodrich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945640#comment-13945640
 ] 

Alexander Goodrich commented on CASSANDRA-6918:
---

Yes, this is a counter-less table that the exceptions occur on. [~jbellis] It 
depends on the node - here's an exception from node #2 in my cluster. I've seen 
it happen without (seemingly) a corresponding "Compacting large row" message; 
here's an example where there is one directly above it:

INFO [CompactionExecutor:144] 2014-03-24 07:50:33,240 CompactionController.java 
(line 156) Compacting large row 
loadtest_1/sorted_table:category1_globallist_item_4:0 (67157460 bytes) 
incrementally
ERROR [CompactionExecutor:144] 2014-03-24 07:50:42,471 CassandraDaemon.java 
(line 191) Exception in thread Thread[CompactionExecutor:144,1,main]
java.lang.AssertionError: incorrect row data size 67156948 written to 
/var/lib/cassandra/data/loadtest_1/sorted_table/loadtest_1-sorted_table-tmp-ic-77-Data.db;
 correct is 67239030
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)


 Compaction Assert: Incorrect Row Data Size
 --

 Key: CASSANDRA-6918
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6918
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 11 node Linux Cassandra 1.2.15 cluster, each node 
 configured as follows:
 2P Intel Xeon CPU X5660 @ 2.8 GHz (12 cores, 24 threads total)
 148 GB RAM
 CentOS release 6.4 (Final)
 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 
 x86_64 GNU/Linux
 Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
 Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
 Node configuration:
 Default cassandra.yaml settings for the most part with the following 
 exceptions:
 rpc_server_type: hsha
Reporter: Alexander Goodrich
 Fix For: 1.2.16


 I have four tables in a schema with Replication Factor: 6 (previously we set 
 this to 3, but when we added more nodes we figured adding more replication to 
 improve read time would help; this might have aggravated the issue).
 create table table_value_one (
 id timeuuid PRIMARY KEY,
 value_1 counter
 );
 
 create table table_value_two (
 id timeuuid PRIMARY KEY,
 value_2 counter
 );
 create table table_position_lookup (
 value_1 bigint,
 value_2 bigint,
 id timeuuid,
 PRIMARY KEY (id)
 ) WITH compaction={'class': 'LeveledCompactionStrategy'};
 create table sorted_table (
 row_key_index text,
 range bigint,
 sorted_value bigint,
 id timeuuid,
 extra_data list<bigint>, 
 PRIMARY KEY ((row_key_index, range), sorted_value, id)
 ) WITH CLUSTERING ORDER BY (sorted_value DESC) AND
   compaction={'class': 'LeveledCompactionStrategy'};
 The application creates an object, and stores it in sorted_table based on a 
 value position - for example, an object has a value_1 of 5500, and a value_2 
 of 4300.
 There are rows which represent indices by which I can sort items based on 
 these values in descending order. If I wish to see items with the highest # 
 of value_1, I can create an index that stores them like so:
 row_key_index = 'highest_value_1s'
 Additionally, we shard each row by bucket ranges - simply the value_1 or 
 value_2 rounded down to the nearest 1000. For example, our object above would 
 be found in row_key_index = 'highest_value_1s' with range 5000, and also in 
 row_key_index = 'highest_value_2s' with range 4000.
 The true values of this object are stored in two counter tables, 
 table_value_one and table_value_two. The current indexed position is stored 
 in table_position_lookup.
 We allow the application to modify value_one and value_two in the counter 
 table indiscriminately. If we know the current values for these 

[jira] [Created] (CASSANDRA-6920) LatencyMetrics can return infinity

2014-03-24 Thread Nick Bailey (JIRA)
Nick Bailey created CASSANDRA-6920:
--

 Summary: LatencyMetrics can return infinity 
 Key: CASSANDRA-6920
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6920
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey


There is a race condition when updating the recentLatency metrics exposed from 
LatencyMetrics.

Attaching a patch with a test that exposes the issue and a potential fix.
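
The patch itself isn't reproduced in this digest, but to illustrate one way such 
a race can yield Infinity - a hypothetical reconstruction, not the actual 
LatencyMetrics code:

{noformat}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical reconstruction of the racy pattern (NOT the actual code):
// "recent" latency is the latency accumulated since the previous call,
// divided by the operations counted since the previous call, with the
// previous-call snapshot held in plain, unsynchronised fields.
public class RacyRecentLatency
{
    private final AtomicLong totalLatencyMicros = new AtomicLong();
    private final AtomicLong opCount = new AtomicLong();

    private long lastTotalLatency;
    private long lastOpCount;

    public void addMicro(long micros)
    {
        totalLatencyMicros.addAndGet(micros); // a reader here sees the new total...
        opCount.incrementAndGet();            // ...but still the old op count
    }

    public double getRecentLatency()
    {
        long ops = opCount.get();
        long total = totalLatencyMicros.get();
        // If ops == lastOpCount while total > lastTotalLatency (interleaving
        // with addMicro above, or with another reader updating the snapshot),
        // this divides a positive number by zero.
        double recent = (total - lastTotalLatency) / (double) (ops - lastOpCount);
        lastOpCount = ops;
        lastTotalLatency = total;
        return recent;
    }
}
{noformat}

With a positive numerator over a zero denominator, Java double division returns 
Double.POSITIVE_INFINITY rather than throwing, so the bad value propagates 
straight out of the metric.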



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6357) Flush memtables to separate directory

2014-03-24 Thread dan jatnieks (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945705#comment-13945705
 ] 

dan jatnieks commented on CASSANDRA-6357:
-

I can also go back and re-run 2.0 w/patch on the same machine used for 2.1 ... 
but that patch isn't quite the same as the 2.1 code either...

 Flush memtables to separate directory
 -

 Key: CASSANDRA-6357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6357
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Patrick McFadin
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta1

 Attachments: 6357-v2.txt, 6357.txt, 
 c6357-2.1-stress-write-adj-ops-sec.png, 
 c6357-2.1-stress-write-latency-99th.png, 
 c6357-2.1-stress-write-latency-median.png, 
 c6357-stress-write-latency-99th-1.png


 Flush writers are a critical element for keeping a node healthy. When several 
 compactions run on systems with low-performing data directories, IO is at a 
 premium. Once the disk subsystem is saturated, write IO is blocked, which will 
 cause flush writer threads to back up. Since memtables are large blocks of 
 memory in the JVM, too much blocking can cause excessive GC over time, 
 degrading performance and in the worst case causing an OOM.
 Since compaction runs on the data directories, my proposal is to create a 
 separate directory for flushing memtables. Potentially we can use the same 
 methodology as keeping the commit log separate, and minimize disk contention 
 against the critical function of the flush writer. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-6920) LatencyMetrics can return infinity

2014-03-24 Thread Nick Bailey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Bailey updated CASSANDRA-6920:
---

Attachment: 6920-infinity-metrics.patch

 LatencyMetrics can return infinity 
 ---

 Key: CASSANDRA-6920
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6920
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
 Attachments: 6920-infinity-metrics.patch


 There is a race condition when updating the recentLatency metrics exposed 
 from LatencyMetrics.
 Attaching a patch with a test that exposes the issue and a potential fix.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6357) Flush memtables to separate directory

2014-03-24 Thread dan jatnieks (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945705#comment-13945705
 ] 

dan jatnieks edited comment on CASSANDRA-6357 at 3/24/14 9:19 PM:
--

I can also go back and re-run 2.0 w/patch on the same machine used for 2.1 ... 
but that patch isn't quite the same as the 2.1 code either...

I was wondering if changes to flushing and/or compaction in 2.1 already lessen 
the contention that was present in 2.0?


was (Author: djatnieks):
I can also go back and re-run 2.0 w/patch on the same machine used for 2.1 ... 
but that patch isn't quite the same as the 2.1 code either...

 Flush memtables to separate directory
 -

 Key: CASSANDRA-6357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6357
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Patrick McFadin
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Fix For: 2.1 beta1

 Attachments: 6357-v2.txt, 6357.txt, 
 c6357-2.1-stress-write-adj-ops-sec.png, 
 c6357-2.1-stress-write-latency-99th.png, 
 c6357-2.1-stress-write-latency-median.png, 
 c6357-stress-write-latency-99th-1.png


 Flush writers are a critical element for keeping a node healthy. When several 
 compactions run on systems with low-performing data directories, IO is at a 
 premium. Once the disk subsystem is saturated, write IO is blocked, which will 
 cause flush writer threads to back up. Since memtables are large blocks of 
 memory in the JVM, too much blocking can cause excessive GC over time, 
 degrading performance and in the worst case causing an OOM.
 Since compaction runs on the data directories, my proposal is to create a 
 separate directory for flushing memtables. Potentially we can use the same 
 methodology as keeping the commit log separate, and minimize disk contention 
 against the critical function of the flush writer. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

