[jira] [Commented] (CASSANDRA-7053) USING TIMESTAMP for batches does not work
[ https://issues.apache.org/jira/browse/CASSANDRA-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976478#comment-13976478 ]

Sylvain Lebresne commented on CASSANDRA-7053:
---------------------------------------------

bq. It looks like CAS/paxos stuff always uses server timestamps

While it is supposed to be that way, we should probably refuse CAS batches that have a 'USING TIMESTAMP', since that makes no sense.

USING TIMESTAMP for batches does not work
-----------------------------------------

Key: CASSANDRA-7053
URL: https://issues.apache.org/jira/browse/CASSANDRA-7053
Project: Cassandra
Issue Type: Bug
Reporter: Robert Supencheck
Assignee: Mikhail Stepura
Labels: cqlsh
Fix For: 2.0.8, 2.1 beta2
Attachments: cassandra-2.0-7053.patch

When using the USING TIMESTAMP timestamp syntax for a batch statement, the supplied timestamp is ignored.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-5547) Multi-threaded scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-5547:
---------------------------------------
Attachment: 0001-5547.patch

Did a bit more refactoring on top of your commit; now we only have to implement the performXYZ and then the xyzOne methods for these operations. What do you think?

Multi-threaded scrub
--------------------

Key: CASSANDRA-5547
URL: https://issues.apache.org/jira/browse/CASSANDRA-5547
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Benjamin Coverston
Assignee: Russell Alexander Spitzer
Labels: lhf
Fix For: 2.0.8
Attachments: 0001-5547.patch, cassandra-2.0-5547.txt

Scrub (especially offline) could benefit from being multi-threaded, especially in the case where the SSTables are compressed.
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976513#comment-13976513 ]

Marcus Eriksson commented on CASSANDRA-6696:
--------------------------------------------

I don't think it would simplify things much (this is quite simple already), but doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure, or perhaps a mini auto-repair on corrupt sstables?

The drawback I see is that we would end up with very many sstables, making it a real pita to do backups etc.

Drive replacement in JBOD can cause data to reappear.
-----------------------------------------------------

Key: CASSANDRA-6696
URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: sankalp kohli
Assignee: Marcus Eriksson
Fix For: 3.0

In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run. This can cause deleted data to come back in some cases. This is also true for corrupt sstables, where we delete the corrupt sstable and run repair.

Here is an example:
Say we have 3 nodes A, B and C, with RF=3 and GC grace=10 days.
row=sankalp col=sankalp was written 20 days back and successfully went to all three nodes.
Then a delete/tombstone was written successfully for the same row column 15 days back.
Since this tombstone is older than gc grace, it was purged on nodes A and B when it got compacted together with the actual data. So there is no trace of this row column on nodes A and B.
Now on node C, say the original data is in drive1 and the tombstone is in drive2. Compaction has not yet reclaimed the data and tombstone.
Drive2 becomes corrupt and is replaced with a new empty drive.
Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp has come back to life.
Now after replacing the drive we run repair. This data will be propagated to all nodes.

Note: This is still a problem even if we run repair every gc grace.
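The timeline in the example can be sketched as a tiny model. This is only an illustration of the reasoning, assuming a cell is visible when its data survives and no tombstone shadows it; the class and method names are hypothetical and are not Cassandra APIs.

```java
// Hypothetical model of the JBOD scenario described above; not Cassandra code.
final class JbodResurrectionSketch {
    static final int GC_GRACE_DAYS = 10;

    /** A tombstone may be purged by compaction once it is older than gc grace. */
    static boolean purgeable(int tombstoneAgeDays) {
        return tombstoneAgeDays > GC_GRACE_DAYS;
    }

    /** A node serves the cell only if it still holds the data and no tombstone shadows it. */
    static boolean cellVisible(boolean hasData, boolean hasTombstone) {
        return hasData && !hasTombstone;
    }

    public static void main(String[] args) {
        int tombstoneAgeDays = 15; // the delete was written 15 days back

        // Nodes A and B: data and tombstone were compacted together and both dropped.
        boolean purgedOnAB = purgeable(tombstoneAgeDays);
        boolean visibleOnAB = cellVisible(!purgedOnAB, !purgedOnAB);

        // Node C before the failure: data on drive1, tombstone on drive2.
        boolean visibleOnCBefore = cellVisible(true, true);

        // Node C after drive2 is replaced with an empty drive: the tombstone is lost.
        boolean visibleOnCAfter = cellVisible(true, false);

        System.out.println("A/B visible: " + visibleOnAB);
        System.out.println("C visible before replacement: " + visibleOnCBefore);
        System.out.println("C visible after replacement: " + visibleOnCAfter);
    }
}
```

Once node C serves the cell again, repair has no tombstone anywhere to contradict it, which is why the data is propagated back to A and B.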
git commit: Fix CQL version number for CASSANDRA-7055
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 bc6f4d003 -> 48d7e4080

Fix CQL version number for CASSANDRA-7055

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48d7e408
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48d7e408
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48d7e408

Branch: refs/heads/cassandra-2.0
Commit: 48d7e408085ff4f56a718487dedb05c23dfe57c9
Parents: bc6f4d0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:12:19 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:12:19 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/doc/cql3/CQL.textile
----------------------------------------------------------------------
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index 3847701..f6208bf 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -497,13 +497,16 @@ bc(syntax)..
     ( USING option ( AND option )* )?
     SET assignment ( ',' assignment )*
     WHERE where-clause
-    ( IF identifier '=' term ( AND identifier '=' term )* )?
+    ( IF condition ( AND condition )* )?
 
 assignment ::= identifier '=' term
              | identifier '=' identifier ('+' | '-') (int-term | set-literal | list-literal)
              | identifier '=' identifier '+' map-literal
              | identifier '[' term ']' '=' term
 
+condition ::= identifier '=' term
+            | identifier '[' term ']' '=' term
+
 where-clause ::= relation ( AND relation )*
 
 relation ::= identifier '=' term
@@ -552,6 +555,7 @@ bc(syntax)..
     FROM tablename
     ( USING TIMESTAMP integer)?
     WHERE where-clause
+    ( IF ( EXISTS | ( condition ( AND condition )*) ) )?
 
 selection ::= identifier ( '[' term ']' )?
 
@@ -560,6 +564,9 @@ bc(syntax)..
 relation ::= identifier '=' term
            | identifier IN '(' ( term ( ',' term )* )? ')'
            | identifier IN '?'
+
+condition ::= identifier '=' term
+            | identifier '[' term ']' '=' term
 
 p. __Sample:__
 
@@ -574,6 +581,7 @@ The @DELETE@ statement deletes columns and rows. If column names are provided di
 
 In a @DELETE@ statement, all deletions within the same partition key are applied atomically and in isolation.
 
+A @DELETE@ operation application can be conditioned using @IF@ like for @UPDATE@ and @INSERT@. But please not that as for the later, this will incur a non negligible performance cost (internally, Paxos will be used) and so should be used sparingly.
 
 h3(#batchStmt). BATCH
 
@@ -1149,6 +1157,12 @@ h2(#changes). Changes
 
 The following describes the addition/changes brought for each version of CQL.
 
+h3. 3.1.6
+
+* A new @uuid@ method:#uuidFun has been added.
+* Support for @DELETE ... IF EXISTS@ syntax.
+
+
 h3. 3.1.5
 
 * It is now possible to group clustering columns in a relatiion, see SELECT Where clauses:#selectWhere.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/cql3/QueryProcessor.java b/src/java/org/apache/cassandra/cql3/QueryProcessor.java
index 64ea5e5..ab0ea40 100644
--- a/src/java/org/apache/cassandra/cql3/QueryProcessor.java
+++ b/src/java/org/apache/cassandra/cql3/QueryProcessor.java
@@ -43,7 +43,7 @@ import org.apache.cassandra.utils.SemanticVersion;
 
 public class QueryProcessor implements QueryHandler
 {
-    public static final SemanticVersion CQL_VERSION = new SemanticVersion("3.1.5");
+    public static final SemanticVersion CQL_VERSION = new SemanticVersion("3.1.6");
 
     public static final QueryProcessor instance = new QueryProcessor();
 
[1/2] git commit: Fix CQL version number for CASSANDRA-7055
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 30e2bff69 -> 3e6b29925

Fix CQL version number for CASSANDRA-7055

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48d7e408
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48d7e408
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48d7e408

Branch: refs/heads/cassandra-2.1
Commit: 48d7e408085ff4f56a718487dedb05c23dfe57c9
Parents: bc6f4d0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:12:19 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:12:19 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/doc/cql3/CQL.textile
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3e6b2992
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3e6b2992
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3e6b2992

Branch: refs/heads/cassandra-2.1
Commit: 3e6b29925686dac0275c2db64e3a3b69203b1747
Parents: 30e2bff 48d7e40
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:12:47 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:12:47 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3e6b2992/doc/cql3/CQL.textile
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3e6b2992/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
[1/3] git commit: Fix CQL version number for CASSANDRA-7055
Repository: cassandra
Updated Branches:
  refs/heads/trunk 33fa4f648 -> 68aa62bde

Fix CQL version number for CASSANDRA-7055

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48d7e408
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48d7e408
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48d7e408

Branch: refs/heads/trunk
Commit: 48d7e408085ff4f56a718487dedb05c23dfe57c9
Parents: bc6f4d0
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:12:19 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:12:19 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/doc/cql3/CQL.textile
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48d7e408/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/68aa62bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/68aa62bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/68aa62bd

Branch: refs/heads/trunk
Commit: 68aa62bde1596a0f7cae03049a3cdcb491151a1c
Parents: 33fa4f6 3e6b299
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:13:26 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:13:26 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/68aa62bd/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3e6b2992
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3e6b2992
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3e6b2992

Branch: refs/heads/trunk
Commit: 3e6b29925686dac0275c2db64e3a3b69203b1747
Parents: 30e2bff 48d7e40
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Tue Apr 22 10:12:47 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Tue Apr 22 10:12:47 2014 +0200

----------------------------------------------------------------------
 doc/cql3/CQL.textile                              | 16 +++-
 .../org/apache/cassandra/cql3/QueryProcessor.java | 2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3e6b2992/doc/cql3/CQL.textile
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3e6b2992/src/java/org/apache/cassandra/cql3/QueryProcessor.java
----------------------------------------------------------------------
[jira] [Resolved] (CASSANDRA-7055) Boken CQL Version number in 2.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-7055.
-----------------------------------------
Resolution: Fixed

Can't do anything for 2.0.7, but bumped the version for 2.0.8 (and updated the doc accordingly).

Boken CQL Version number in 2.0.7
---------------------------------

Key: CASSANDRA-7055
URL: https://issues.apache.org/jira/browse/CASSANDRA-7055
Project: Cassandra
Issue Type: Bug
Reporter: Michaël Figuière
Assignee: Sylvain Lebresne
Priority: Trivial
Fix For: 2.0.8

Cassandra 2.0.7 has introduced 2 changes in the CQL language:
* Add uuid() function (CASSANDRA-6473)
* Add support for DELETE ... IF EXISTS to CQL3 (CASSANDRA-5708)

Unfortunately the {{cql_version}} reported in the {{system.local}} table hasn't been incremented.

In 2.0.6:
{code}
cqlsh> select cql_version from system.local;

 cql_version
-------------
 3.1.5
{code}

In 2.0.7:
{code}
cqlsh> select cql_version from system.local;

 cql_version
-------------
 3.1.5
{code}
[jira] [Commented] (CASSANDRA-6863) Incorrect read repair of range thombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976532#comment-13976532 ]

Oleg Anastasyev commented on CASSANDRA-6863:
--------------------------------------------

RangeTombstoneList.updateDigest:
{code}
+for (int j = 0; j < ends[i].size(); j++)
+    digest.update(starts[i].get(j).duplicate());
{code}
should call digest.update on ends[i], not starts[i].

The rest LGTM.

Incorrect read repair of range thombstones
------------------------------------------

Key: CASSANDRA-6863
URL: https://issues.apache.org/jira/browse/CASSANDRA-6863
Project: Cassandra
Issue Type: Bug
Environment: 2.0
Reporter: Oleg Anastasyev
Attachments: 6863-v2.txt, 6863-v2.txt, ReadRepairRangeThombstoneDiff.txt, ReadRepairsDebugLogger.txt

Rows with range tombstones are read repaired for every replica if RR is triggered (this is because CF.diff() returns non null if !isEmpty(), which in turn returns false if the range tombstone list is not empty). Also, the full range tombstone list is sent to all nodes, which could be a problem if you have a wide partition.

Fixed this by evaluating the diff on range tombstone lists as well as on the DeletionInfo of endpoint CF versions. Also return null from CF.diff if there is no diff in the RTL.

A second patch (ReadRepairsDebugLogger.txt) adds some debug logging to look at read repairs. You may find it useful as well.
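The point of the review comment can be illustrated with a minimal sketch, assuming each range bound is a single ByteBuffer rather than the composite type used in the actual patch. The class and method names are hypothetical; only MessageDigest and ByteBuffer are real JDK APIs. If only the start bounds were digested, two replicas whose ranges differ in their end bounds would still produce the same digest.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the RangeTombstoneList implementation.
final class RangeBoundsDigestSketch {
    /** Digests both bounds of every range, so a difference in either bound changes the result. */
    static byte[] digestBounds(ByteBuffer[] starts, ByteBuffer[] ends, int size) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            for (int i = 0; i < size; i++) {
                // duplicate() keeps the caller's buffer position untouched.
                digest.update(starts[i].duplicate());
                digest.update(ends[i].duplicate());
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Helper for building test buffers. */
    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteBuffer[] starts = { bytes("a") };
        byte[] d1 = digestBounds(starts, new ByteBuffer[] { bytes("m") }, 1);
        byte[] d2 = digestBounds(starts, new ByteBuffer[] { bytes("z") }, 1);
        System.out.println("digests differ: " + !java.util.Arrays.equals(d1, d2));
    }
}
```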
[jira] [Commented] (CASSANDRA-6993) Windows: remove mmap'ed I/O for index files and force standard file access
[ https://issues.apache.org/jira/browse/CASSANDRA-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976537#comment-13976537 ]

Benedict commented on CASSANDRA-6993:
-------------------------------------

1) isUnix should be final
2) I think your isUnix check is too limited: this will break Mac OSX, FreeBSD and Solaris users, possibly others. Since basically every OS other than Windows probably supports this, I'd suggest making it an isWindows check and looking for contains("windows"). [This link|http://mindprod.com/jgloss/properties.html#OSNAME] may help, although it may not be completely authoritative.

A quick grep of openjdk shows the following line in their own test tools, though:
{code}
static boolean isWindows = System.getProperty("os.name").startsWith("Windows");
{code}
Which suggests it's probably sufficient.

Windows: remove mmap'ed I/O for index files and force standard file access
--------------------------------------------------------------------------

Key: CASSANDRA-6993
URL: https://issues.apache.org/jira/browse/CASSANDRA-6993
Project: Cassandra
Issue Type: Improvement
Reporter: Joshua McKenzie
Assignee: Joshua McKenzie
Priority: Minor
Fix For: 3.0
Attachments: 6993_v1.txt

Memory-mapped I/O on Windows causes issues with hard-links; we're unable to delete hard-links to open files with memory-mapped segments even using nio. We'll need to push for close to performance parity between mmap'ed I/O and buffered going forward as the buffered / compressed path offers other benefits.
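A minimal sketch of the suggested isWindows check, assuming a case-insensitive match on the full word "windows" is acceptable. The class name is hypothetical; System.getProperty("os.name") is the standard JDK property. Matching the whole word matters: a shorter test such as contains("win") would also match "Darwin".

```java
import java.util.Locale;

// Hypothetical helper illustrating the suggestion; not taken from the patch.
final class OsCheckSketch {
    /** True when the given os.name value identifies a Windows platform. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).contains("windows");
    }

    /** Evaluated once at class load; final, as requested in point 1 of the review. */
    static final boolean IS_WINDOWS = isWindows(System.getProperty("os.name"));

    public static void main(String[] args) {
        System.out.println("os.name = " + System.getProperty("os.name"));
        System.out.println("isWindows = " + IS_WINDOWS);
    }
}
```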
[jira] [Commented] (CASSANDRA-7029) Investigate alternative transport protocols for both client and inter-server communications
[ https://issues.apache.org/jira/browse/CASSANDRA-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976543#comment-13976543 ]

Benedict commented on CASSANDRA-7029:
-------------------------------------

This is exactly the reason I created CASSANDRA-7061, which I intend to look at first. Profilers indicate a great deal of overhead in networking, but I'm not sure how honest that is.

Investigate alternative transport protocols for both client and inter-server communications
-------------------------------------------------------------------------------------------

Key: CASSANDRA-7029
URL: https://issues.apache.org/jira/browse/CASSANDRA-7029
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Labels: performance
Fix For: 3.0

There are a number of reasons to think we can do better than TCP for our communications:

1) We can actually tolerate sporadic small message losses, so guaranteed delivery isn't essential (although for larger messages it probably is)

2) As shown in \[1\] and \[2\], Linux can behave quite suboptimally with regard to TCP message delivery when the system is under load. Judging from the theoretical description, this is likely to apply even when the system-load is not high, but the number of processes to schedule is high. Cassandra generally has a lot of threads to schedule, so this is quite pertinent for us. UDP performs substantially better here.

3) Even when the system is not under load, UDP has a lower CPU burden, and that burden is constant regardless of the number of connections it processes.

4) On a simple benchmark on my local PC, using non-blocking IO for UDP and busy spinning on IO I can actually push 20-40% more throughput through loopback (where TCP should be optimal, as no latency), even for very small messages. Since we can see networking taking multiple CPUs' worth of time during a stress test, using a busy-spin for ~100micros after last message receipt is almost certainly acceptable, especially as we can (ultimately) process inter-server and client communications on the same thread/socket in this model.

5) We can optimise the threading model heavily: since we generally process very small messages (200 bytes not at all implausible), the thread signalling costs on the processing thread can actually dramatically impede throughput. In general it costs ~10micros to signal (and passing the message to another thread for processing in the current model requires signalling). For 200-byte messages this caps our throughput at 20MB/s.

I propose to knock up a highly naive UDP-based connection protocol with super-trivial congestion control over the course of a few days, with the only initial goal being maximum possible performance (not fairness, reliability, or anything else), and trial it in Netty (possibly making some changes to Netty to mitigate thread signalling costs). The reason for knocking up our own here is to get a ceiling on what the absolute limit of potential for this approach is. Assuming this pans out with performance gains in C* proper, we then look to contributing to/forking the udt-java project and see how easy it is to bring performance in line with what we can get with our naive approach (I don't suggest starting here, as the project is using blocking old-IO, and modifying it with latency in mind may be challenging, and we won't know for sure what the best case scenario is).

\[1\] http://test-docdb.fnal.gov/0016/001648/002/Potential%20Performance%20Bottleneck%20in%20Linux%20TCP.PDF
\[2\] http://cd-docdb.fnal.gov/cgi-bin/RetrieveFile?docid=1968;filename=Performance%20Analysis%20of%20Linux%20Networking%20-%20Packet%20Receiving%20(Official).pdf;version=2

Further related reading:
http://public.dhe.ibm.com/software/commerce/doc/mft/cdunix/41/UDTWhitepaper.pdf
https://mospace.umsystem.edu/xmlui/bitstream/handle/10355/14482/ChoiUndPerTcp.pdf?sequence=1
https://access.redhat.com/site/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.3762&rep=rep1&type=pdf
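The 20MB/s figure in point 5 follows from simple arithmetic, sketched below under the assumption that every message pays exactly one thread signal and that signalling is the only bottleneck on the processing thread. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope helper; the numbers are the ones quoted in point 5.
final class SignalCostSketch {
    /** Upper bound on bytes/second when each message costs one signal of the given duration. */
    static long maxBytesPerSecond(long signalCostMicros, long messageBytes) {
        long messagesPerSecond = 1_000_000L / signalCostMicros; // 10 micros -> 100,000 msgs/s
        return messagesPerSecond * messageBytes;
    }

    public static void main(String[] args) {
        long cap = maxBytesPerSecond(10, 200);
        System.out.println(cap / 1_000_000L + " MB/s"); // 100,000 * 200 bytes = 20,000,000 bytes/s
    }
}
```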
[jira] [Created] (CASSANDRA-7065) Add some extra metadata in leveled manifest to be able to reduce the amount of sstables searched on read path
Marcus Eriksson created CASSANDRA-7065:
---------------------------------------

Summary: Add some extra metadata in leveled manifest to be able to reduce the amount of sstables searched on read path
Key: CASSANDRA-7065
URL: https://issues.apache.org/jira/browse/CASSANDRA-7065
Project: Cassandra
Issue Type: Improvement
Reporter: Marcus Eriksson

Based on this: http://rocksdb.org/blog/431/indexing-sst-files-for-better-lookup-performance/

By keeping pointers from the sstables in lower to higher levels we could reduce the number of candidates in higher levels, i.e., instead of searching all 1000 L3 sstables, we use the information from the L2 search to include fewer L3 sstables.

First we need to figure out if this can beat our IntervalTree approach (and if the win is worth it).
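One way to picture the idea is the sketch below. It assumes sstables within a level are sorted and non-overlapping, models key ranges as inclusive long intervals, and only covers the case where the key fell inside an L2 sstable; a key that falls into a gap between L2 sstables would need a fallback. All names are hypothetical and nothing here is taken from the leveled manifest code.

```java
// Hypothetical sketch of per-sstable pointers into the next level; not Cassandra code.
final class LevelPointerSketch {
    /** Inclusive key range covered by one sstable (stand-in type for illustration). */
    static final class Range {
        final long first;
        final long last;
        Range(long first, long last) { this.first = first; this.last = last; }
    }

    /** For each upper-level sstable, the [from, to) slice of next-level sstables that overlap it. */
    static int[][] buildPointers(Range[] upper, Range[] lower) {
        int[][] pointers = new int[upper.length][2];
        for (int u = 0; u < upper.length; u++) {
            int from = 0;
            while (from < lower.length && lower[from].last < upper[u].first) from++;
            int to = from;
            while (to < lower.length && lower[to].first <= upper[u].last) to++;
            pointers[u][0] = from;
            pointers[u][1] = to;
        }
        return pointers;
    }

    /** Binary search restricted to the slice [from, to); returns the sstable index or -1. */
    static int findInSlice(Range[] lower, int from, int to, long key) {
        int lo = from, hi = to - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (key < lower[mid].first) hi = mid - 1;
            else if (key > lower[mid].last) lo = mid + 1;
            else return mid;
        }
        return -1;
    }

    public static void main(String[] args) {
        Range[] l2 = { new Range(0, 99), new Range(100, 199) };
        Range[] l3 = { new Range(0, 24), new Range(25, 49), new Range(50, 74),
                       new Range(75, 99), new Range(100, 149), new Range(150, 199) };
        int[][] pointers = buildPointers(l2, l3);
        // The key 160 was found in L2 sstable 1, so only that sstable's L3 slice is searched.
        System.out.println(findInSlice(l3, pointers[1][0], pointers[1][1], 160));
    }
}
```

Whether this beats an interval tree depends on how many next-level sstables each pointer slice covers, which is the open question raised in the ticket.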
[jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size
[ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976547#comment-13976547 ]

Benedict commented on CASSANDRA-7031:
-------------------------------------

The _worst_ latency is substantially reduced, which is down to waiting on the commit log to catch up. It's possible the 99th/99.9th are increased due to sharing the same disk, but notice the 95th percentile is lower also for both, so it's only a slight spike in the 99th+99.9th for a substantial drop in the max and the more common cases. Could simply be random noise from running on my box, though.

[~enigmacurry] perhaps you could kick off a simple test comparing with and without this patch on the real cluster so we can see some pretty graphs (keep populate range low so that commit log is a more visible component though, preferably)?

Increase default commit log total space + segment size
------------------------------------------------------

Key: CASSANDRA-7031
URL: https://issues.apache.org/jira/browse/CASSANDRA-7031
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Trivial
Fix For: 2.1 beta2
Attachments: 7031.txt

I would like to increase the default commit log total space and segment size options for 64-bit JVMs:

The current default of 1Gb and 32Mb is quite constrained and can have some (very minor) negative performance implications, for no major benefit:

# 32Mb files are actually quite small, and if during the 10s interval we have completely filled multiple of them (quite easy) it would be more efficient to write fewer larger files, as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On my box this has a small but noticeable impact. Although I would expect on decent server hardware this would be smaller still, since we immediately drop the pages from cache on writing there isn't a great deal of advantage to keeping the files so small. The only advantage I can see is that during a drop KS/CF or other event that forces log rollover we're wasting less space until log recycling. 128-256Mb are modest increases that seem more appropriate to me.
# 1Gb is too small for the default total log space. We can find that we force memtable flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as a result of increased effective memtable space from recent improvements), especially on machines with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having more log data is that replay on restart may be slightly slower, but since most of the events will be ignored it should be relatively benign, and I would rather take the penalty on startup instead of during running, no matter how small the running penalty.
[jira] [Commented] (CASSANDRA-6916) Preemptive opening of compaction result
[ https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976585#comment-13976585 ] Benedict commented on CASSANDRA-6916: - [~enigmacurry]: 6916v3 should crash if you set the preheat_kernel_page_cache property to true (just tested this locally). Setting the populate_io_cache_on_flush property of the CF would probably work but simply have no effect. Are you sure you were running the correct branch? bq. Furthermore, I hadn't realized when testing CASSANDRA-6746 that we could actually fare well with the existing options like this The problem is that we consider the default to be better at preventing dramatic page cache churn during compaction, which this should continue to deliver but without the downsides. The errors look plausible - but could we confirm we're running the correct (v3) branch given it didn't crash with the preheat setting in the yaml? Preemptive opening of compaction result --- Key: CASSANDRA-6916 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1 Attachments: 6916-stock2_1.mixed.cache_tweaks.tar.gz, 6916-stock2_1.mixed.logs.tar.gz, 6916v3-preempive-open-compact.logs.gz, 6916v3-preempive-open-compact.mixed.2.logs.tar.gz, 6916v3-premptive-open-compact.mixed.cache_tweaks.2.tar.gz Related to CASSANDRA-6812, but a little simpler: when compacting, we mess quite badly with the page cache. One thing we can do to mitigate this problem is to use the sstable we're writing before we've finished writing it, and to drop the regions from the old sstables from the page cache as soon as the new sstables have them (even if they're only written to the page cache). 
This should minimise any page cache churn, as the old sstables must be larger than the new sstable, and since both will be in memory, dropping the old sstables is at least as good as dropping the new.

The approach is quite straightforward. Every X MB written:
# grab flushed length of index file;
# grab second to last index summary record, after excluding those that point to positions after the flushed length;
# open index file, and check that our last record doesn't occur outside of the flushed length of the data file (pretty unlikely)
# Open the sstable with the calculated upper bound

Some complications:
# must keep running copy of compression metadata for reopening with
# we need to be able to replace an sstable with itself but a different lower bound
# we need to drop the old page cache only when readers have finished

-- This message was sent by Atlassian JIRA (v6.2#6252)
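The "every X MB written" steps above can be sketched roughly as follows. This is a minimal Python illustration, not Cassandra code: `SummaryEntry` and `early_open_bound` are hypothetical names, and the sketch assumes each index summary entry already carries both its offset in the index file and the data-file offset it points to (the real code would read the latter from the index file).

```python
from typing import NamedTuple, Optional, Sequence


class SummaryEntry(NamedTuple):
    """Hypothetical index summary record (illustration only)."""
    key: str
    index_pos: int  # offset of the entry in the index file
    data_pos: int   # offset of the corresponding row in the data file


def early_open_bound(summary: Sequence[SummaryEntry],
                     index_flushed: int,
                     data_flushed: int) -> Optional[SummaryEntry]:
    """Pick an upper bound at which a partially written sstable could be opened.

    Returns None when no safe bound exists yet.
    """
    # Steps 1-2: ignore summary records pointing past the flushed index length,
    # then take the second to last of what remains.
    usable = [e for e in summary if e.index_pos < index_flushed]
    if len(usable) < 2:
        return None
    candidate = usable[-2]
    # Step 3: the record must not point outside the flushed part of the data file.
    if candidate.data_pos >= data_flushed:
        return None
    # Step 4: the caller would open the sstable with this upper bound.
    return candidate
```

Returning `None` when the data-file check fails is a simplification chosen for the sketch; the description only notes that this case is "pretty unlikely".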
[jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size
[ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976624#comment-13976624 ] Benedict commented on CASSANDRA-7031: - From what POV is 128Mb a long gap between archived segments? Do we mean that there may be a 128Mb gap after the most recent archive during which no PIT restore is possible? Seems like this would be a minimal problem, as the most recent CLS is still present in the CL directory, and we could always offer the ability to create a PITR point through force recycling the current CL segment at the requested time to make sure there is a separate backup. If you care about rolling PITR backups with minimal intervals then you're probably a very specific use case, I'd reckon. As far as replay is concerned, I don't see a major difference: we need to read ahead potentially more than even one 128Mb file to check if there are delayed commits, and either way 128Mb is a very small amount of data - a few seconds at most of extra restore time. Increase default commit log total space + segment size -- Key: CASSANDRA-7031 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1 beta2 Attachments: 7031.txt I would like to increase the default commit log total space and segment size options for 64-bit JVMs: The current default of 1Gb and 32Mb is quite constrained and can have some (very minor) negative performance implications, for no major benefit: # 32Mb files are actually quite small, and if during the 10s interval we have completely filled multiple of them (quite easy) it would be more efficient to write fewer larger files, as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On my box this has a small but noticeable impact. 
Although I would expect on decent server hardware this would be smaller still, since we immediately drop the pages from cache on writing there isn't a great deal of advantage to keeping the files so small. The only advantage I can see is that during a drop KS/CF or other event that forces log rollover we're wasting less space until log recycling. 128-256Mb are modest increases that seem more appropriate to me. # 1Gb is too small for the default total log space. We can find that we force memtable flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as a result of increased effective memtable space from recent improvements), especially on machines with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having more log data is that replay on restart may be slightly slower, but since most of the events will be ignored it should be relatively benign, and I would rather take the penalty on startup instead of during running, no matter how small the running penalty. -- This message was sent by Atlassian JIRA (v6.2#6252)
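The segment-size argument in the description above comes down to simple arithmetic, sketched here in Python. This is an illustration only, not Cassandra code: the function names are made up, and the 300 MB written per sync interval used in the note below is a hypothetical figure, not a measurement from the ticket.

```python
import math


def segments_touched(written_mb: int, segment_mb: int) -> int:
    """Number of commit log segment files needed to hold `written_mb` of writes."""
    return math.ceil(written_mb / segment_mb)


def segments_in_total_space(total_mb: int, segment_mb: int) -> int:
    """Number of whole segments that fit in the total commit log space."""
    return total_mb // segment_mb
```

Under the hypothetical 300 MB per interval, 32 MB segments spread the writes over 10 files, while 128 MB segments need 3. The current defaults (1 GB total, 32 MB segments) give 32 segments; 8 GB total with 128 MB segments gives 64.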
[jira] [Commented] (CASSANDRA-4450) CQL3: Allow preparing the consistency level, timestamp and ttl
[ https://issues.apache.org/jira/browse/CASSANDRA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976639#comment-13976639 ]

Pavel Eremeev commented on CASSANDRA-4450:
--
Why does [timestamp] use LongType.instance instead of TimestampType.instance? I think it's better for clients to know the true type of that field, so that they can encode the value simply and correctly.

CQL3: Allow preparing the consistency level, timestamp and ttl
--
Key: CASSANDRA-4450
URL: https://issues.apache.org/jira/browse/CASSANDRA-4450
Project: Cassandra
Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
Labels: cql3
Fix For: 2.0 beta 1

It could be useful to allow the preparation of the consistency level, the timestamp and the ttl, i.e. to allow:
{noformat}
UPDATE foo SET .. USING CONSISTENCY ? AND TIMESTAMP ? AND TTL ?
{noformat}
A slight concern is that when preparing a statement we return the names of the prepared variables, but none of timestamp, ttl and consistency are reserved names currently, so returning those as names could conflict with a column name. We can either:
* make these reserved identifiers (I have to add that I'm not a fan because, at least for timestamp, I think that's a potentially useful and common column name).
* use some specific special character to indicate those are not column names, like returning [timestamp], [ttl], [consistency].

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7046) Update nodetool commands to output the date and time they were run on
[ https://issues.apache.org/jira/browse/CASSANDRA-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976662#comment-13976662 ]

Clément Lardeur commented on CASSANDRA-7046:

For me, it's not the responsibility of nodetool to print to stdout the time at which a command was executed; rather, the script or client that called nodetool should do it. If you use a shell script to call nodetool and redirect its output into a file, run the {{date}} command before calling nodetool.

Update nodetool commands to output the date and time they were run on
-
Key: CASSANDRA-7046
URL: https://issues.apache.org/jira/browse/CASSANDRA-7046
Project: Cassandra
Issue Type: Improvement
Reporter: Johnny Miller
Priority: Trivial
Labels: lhf

It would help if the various nodetool commands also output the system date and time they were run. Often these commands are executed and then we look at the cassandra log files to try and find out what was happening at that time. This is certainly just a convenience feature, but it would be nice to have the information in there to aid with diagnostics.

-- This message was sent by Atlassian JIRA (v6.2#6252)
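The comment above suggests having the caller record the time rather than nodetool itself. The comment describes a shell script using `date`; the same idea can be sketched as a small Python wrapper. `run_with_timestamp` is a hypothetical helper written for this illustration and is not part of nodetool or Cassandra.

```python
import subprocess
from datetime import datetime, timezone
from typing import List


def run_with_timestamp(cmd: List[str]) -> str:
    """Run `cmd` and return its stdout prefixed with a line giving the start time."""
    started = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    # stderr and the exit code are ignored here to keep the sketch short.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return f"# {' '.join(cmd)} started at {started}\n{result.stdout}"


# Example call, assuming nodetool is on the PATH:
#   print(run_with_timestamp(["nodetool", "status"]))
```

The timestamp is taken before the command starts, which matches the suggestion of calling `date` first and makes it easier to line the output up with the Cassandra logs.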
[jira] [Updated] (CASSANDRA-7062) Extension of static columns for compound cluster keys
[ https://issues.apache.org/jira/browse/CASSANDRA-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clément Lardeur updated CASSANDRA-7062:
---
Description:
CASSANDRA-6561 implemented static columns for a given partition key. What this proposes, for a compound cluster key, is a column that is static at intermediate parts of the compound cluster key.

This example shows a table modelling a moderately complex EAV pattern:

{code}
CREATE TABLE t (
    entityID text,
    propertyName text,
    valueIndex text,
    entityName text static (entityID),
    propertyType text static (entityID, propertyName),
    propertyRelations List<text> static (entityID, propertyName),
    data text,
    PRIMARY KEY (entityID, (propertyName,valueIndex))
)
{code}

This example has the following static columns:
- the entityName column behaves exactly as CASSANDRA-6561 details, so all cluster rows have the same value
- the propertyType and propertyRelations columns are static with respect to the remaining parts of the cluster key (that is, across all valueIndex values for a given propertyName), so an update to those values for an entityID and a propertyName will be shared/constant by all the value rows...

Is this a relatively simple extension of the same mechanism in -6561, or is this a "whoa, you have no idea what you are proposing"?

Sample data:

Mary and Jane aren't married...

{code}
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex, data) VALUES ('0001','MARY MATALIN','married','SingleValue','0','false');
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex, data) VALUES ('0002','JANE JOHNSON','married','SingleValue','0','false');
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex) VALUES ('0001','MARY MATALIN','kids','NOVALUE','');
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex) VALUES ('0002','JANE JOHNSON','kids','NOVALUE','');
{code}

{code}
SELECT * FROM t:

0001  MARY MATALIN  married  SingleValue  0  false
0001  MARY MATALIN  kids     NOVALUE         null
0002  JANE JOHNSON  married  SingleValue  0  false
0002  JANE JOHNSON  kids     NOVALUE         null
{code}

Then Mary and Jane get married (so the entityName column that is static on the partition key is updated just like CASSANDRA-6561)

{code}
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex, data) VALUES ('0001','MARY SMITH','married','SingleValue','0','TRUE');
INSERT INTO t (entityID, entityName, propertyName, propertyType, valueIndex, data) VALUES ('0002','JANE JONES','married','SingleValue','0','TRUE');
{code}

{code}
SELECT * FROM t:

0001  MARY SMITH  married  SingleValue  0  TRUE
0001  MARY SMITH  kids     NOVALUE         null
0002  JANE JONES  married  SingleValue  0  TRUE
0002  JANE JONES  kids     NOVALUE         null
{code}

Then Mary and Jane have a kid, so we add another value to the kids attribute:

{code}
INSERT INTO t (entityID, propertyName, propertyType, valueIndex,data) VALUES ('0001','kids','SingleValue','0','JIM-BOB');
INSERT INTO t (entityID, propertyName, propertyType, valueIndex,data) VALUES ('0002','kids','SingleValue','0','JENNY');
{code}

{code}
SELECT * FROM t:

0001  MARY SMITH  married  SingleValue  0  TRUE
0001  MARY SMITH  kids     SingleValue     null
0001  MARY SMITH  kids     SingleValue  0  JIM-BOB
0002  JANE JONES  married  SingleValue  0  TRUE
0002  JANE JONES  kids     SingleValue     null
0002  JANE JONES  kids     SingleValue  0  JENNY
{code}

Then Mary has ANOTHER kid, which demonstrates the partially static column relative to the cluster key, as ALL value rows for the property 'kids' get updated to the new value:

{code}
INSERT INTO t (entityID, propertyName, propertyType, valueIndex,data) VALUES ('0001','kids','MultiValue','1','HARRY');
{code}

{code}
SELECT * FROM t:

0001  MARY SMITH  married  SingleValue  0  TRUE
0001  MARY SMITH  kids     MultiValue      null
0001  MARY SMITH  kids     MultiValue   0  JIM-BOB
0001  MARY SMITH  kids     MultiValue   1  HARRY
0002  JANE JONES  married  SingleValue  0  TRUE
0002  JANE JONES  kids     SingleValue     null
0002  JANE JONES  kids     SingleValue  0  JENNY
{code}

... ok, hopefully that example isn't TOO complicated. Yes, there's a stupid hack bug in there with the null/empty row for the kids attribute, but please bear with me on that.

Generally speaking, this will aid in flattening / denormalization of relational constructs into cassandra-friendly schemas. In the above example we are flattening a relational schema of three tables (entity, property, and value) into a single sparse, flattened, denormalized compound table.

was: CASSANDRA-6561 implemented static columns for a given partition key. What this is proposing for a compound cluster key is a static
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976672#comment-13976672 ] Tupshin Harper commented on CASSANDRA-6696: --- +1. To the extent that we can do sstables per vnode without introducing other performance costs, I am hugely in favor of it. With good OS tuning, I'm not scared of too many sstables. If it is a pain for backup, or other things, you could have an offline sstable consolidator script that would take a batch of sstables and stream them out as a single sstable to a remote location. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run. This can cause deleted data to come back in some cases. Also this is true for corrupt stables in which we delete the corrupt stable and run repair. Here is an example: Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. row=sankalp col=sankalp is written 20 days back and successfully went to all three nodes. Then a delete/tombstone was written successfully for the same row column 15 days back. Since this tombstone is more than gc grace, it got compacted in Nodes A and B since it got compacted with the actual data. So there is no trace of this row column in node A and B. Now in node C, say the original data is in drive1 and tombstone is in drive2. Compaction has not yet reclaimed the data and tombstone. Drive2 becomes corrupt and was replaced with new empty drive. Due to the replacement, the tombstone in now gone and row=sankalp col=sankalp has come back to life. Now after replacing the drive we run repair. This data will be propagated to all nodes. Note: This is still a problem even if we run repair every gc grace. 
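The resurrection scenario in the CASSANDRA-6696 description above can be replayed with a toy model. This is a minimal Python sketch under loud assumptions: a node is just a list of per-drive dictionaries, a version is a `(day_written, value)` pair, and `read`, `repair` and `simulate` are hypothetical helpers that ignore everything else real repair and compaction do.

```python
TOMBSTONE = None  # a deleted cell is modelled as a value of None


def read(node, key):
    """Merge a key across a node's drives: the newest (day, value) version wins."""
    versions = [drive[key] for drive in node if key in drive]
    return max(versions, key=lambda v: v[0]) if versions else None


def repair(nodes, key):
    """Copy the newest version seen on any node to every node."""
    versions = [v for v in (read(n, key) for n in nodes) if v is not None]
    if not versions:
        return
    newest = max(versions, key=lambda v: v[0])
    for node in nodes:
        node[0][key] = newest


def simulate():
    key = "sankalp"
    # Written on day 0, deleted on day 5; "now" is day 20 and gc grace is 10 days.
    # On A and B the tombstone and the data were already compacted away together.
    a, b = [{}, {}], [{}, {}]
    # On C the data sits on drive1 and the tombstone on drive2, not yet compacted.
    c = [{key: (0, "sankalp")}, {key: (5, TOMBSTONE)}]
    before = read(c, key)   # the tombstone still shadows the data
    c[1] = {}               # drive2 fails and is replaced with an empty drive
    repair([a, b, c], key)  # repair now spreads the old data everywhere
    return before, read(a, key), read(b, key), read(c, key)
```

In this model the deleted value is live again on all three nodes after the repair, because nothing that remembers the delete survives anywhere in the cluster.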
[jira] [Updated] (CASSANDRA-6863) Incorrect read repair of range thombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6863:
--
Reviewer: Jonathan Ellis
Fix Version/s: 2.1
Assignee: Oleg Anastasyev

Incorrect read repair of range thombstones
--
Key: CASSANDRA-6863
URL: https://issues.apache.org/jira/browse/CASSANDRA-6863
Project: Cassandra
Issue Type: Bug
Environment: 2.0
Reporter: Oleg Anastasyev
Assignee: Oleg Anastasyev
Fix For: 2.1
Attachments: 6863-v2.txt, 6863-v2.txt, ReadRepairRangeThombstoneDiff.txt, ReadRepairsDebugLogger.txt

Rows with range tombstones are read repaired for every replica if RR is triggered (this is because CF.diff() returns non-null if !isEmpty(), and isEmpty() in turn returns false if the range tombstone list is not empty). Also, the full range tombstone list is sent to all nodes, which could be a problem if you have a wide partition.

Fixed this by evaluating the diff on range tombstone lists as well as on the deletionInfo of endpoint CF versions. Also return null from CF.diff if there is no diff in the RTL.

A second patch (ReadRepairsDebugLogger.txt) adds some debug logging to look at read repairs. You may find it useful as well.

-- This message was sent by Atlassian JIRA (v6.2#6252)
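The idea behind the fix described above (diff the range tombstone lists, and return nothing when a replica is already up to date) can be illustrated with a deliberately simplified Python sketch. It is not the patch: `diff_range_tombstones` is a hypothetical function, a range tombstone is modelled as a plain `(start, end, marked_for_delete_at)` tuple, and overlapping or partially covering ranges are not handled.

```python
from typing import List, Optional, Tuple

# Hypothetical model of a range tombstone: (start, end, marked_for_delete_at)
RangeTombstone = Tuple[str, str, int]


def diff_range_tombstones(merged: List[RangeTombstone],
                          replica: List[RangeTombstone]
                          ) -> Optional[List[RangeTombstone]]:
    """Return the tombstones the replica is missing, or None if it has them all."""
    have = set(replica)
    missing = [rt for rt in merged if rt not in have]
    # None means "no repair message needed for this replica".
    return missing or None
```

With a diff like this, a replica that already holds every range tombstone is not read repaired at all, and one that is behind receives only what it lacks rather than the full list.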
[jira] [Commented] (CASSANDRA-6987) sstablesplit fails in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976684#comment-13976684 ] Jonathan Ellis commented on CASSANDRA-6987: --- committed sstablesplit fails in 2.1 - Key: CASSANDRA-6987 URL: https://issues.apache.org/jira/browse/CASSANDRA-6987 Project: Cassandra Issue Type: Bug Components: Core Environment: Debian Testing/Jessie Oracle JDK 1.7.0_51 c*-2.1 branch, commit 5ebadc11e36749e6479f9aba19406db3aacdaf41 Reporter: Michael Shuler Assignee: Benedict Fix For: 2.1 beta2 Attachments: 6987.txt sstablesplit dtest began failing in 2.1 at http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/95/ triggered by http://cassci.datastax.com/job/cassandra-2.1/186/ repro: {noformat} (cassandra-2.1)mshuler@hana:~/git/cassandra$ ./bin/cassandra /dev/null (cassandra-2.1)mshuler@hana:~/git/cassandra$ ./tools/bin/cassandra-stress write n=100 Created keyspaces. Sleeping 1s for propagation. Warming up WRITE with 5 iterations... Connected to cluster: Test Cluster Datatacenter: datacenter1; Host: localhost/127.0.0.1; Rack: rack1 Sleeping 2s... 
Running WRITE with 50 threads for 100 iterations ops ,op/s, key/s,mean, med, .95, .99,.999, max, time, stderr 26836 , 26830, 26830, 2.0, 1.1, 4.0,20.8, 131.4, 207.4,1.0, 0.0 64002 , 36236, 36236, 1.4, 0.8, 4.2,13.8,41.3, 234.8,2.0, 0.0 105604, 38188, 38188, 1.3, 0.8, 3.2,10.6,78.4, 93.7,3.1, 0.10546 156179, 36750, 36750, 1.4, 0.9, 2.9, 8.8, 117.0, 139.8,4.5, 0.08482 202092, 40487, 40487, 1.2, 0.9, 2.9, 7.3,45.6, 122.5,5.6, 0.07231 246947, 40583, 40583, 1.2, 0.8, 3.0, 7.6,98.2, 152.1,6.7, 0.07056 290186, 39867, 39867, 1.3, 0.8, 2.6, 8.9, 113.3, 126.4,7.8, 0.06391 331609, 40155, 40155, 1.2, 0.8, 3.1, 8.7,99.1, 124.9,8.8, 0.05731 371813, 38742, 38742, 1.3, 0.8, 3.1, 9.2, 117.2, 123.9,9.9, 0.05153 416853, 40024, 40024, 1.2, 0.8, 3.2, 8.1,70.4, 119.8, 11.0, 0.04634 458389, 39045, 39045, 1.3, 0.8, 3.2, 9.1, 106.4, 135.9, 12.1, 0.04236 511323, 36513, 36513, 1.4, 0.8, 3.3, 9.2, 120.2, 161.0, 13.5, 0.03883 549872, 34296, 34296, 1.5, 0.9, 3.4,11.5, 106.7, 132.7, 14.6, 0.03678 589405, 34535, 34535, 1.4, 0.9, 2.9,10.6, 106.2, 147.9, 15.8, 0.03607 633225, 39472, 39472, 1.3, 0.8, 3.0, 7.6, 106.3, 125.1, 16.9, 0.03374 672751, 38251, 38251, 1.3, 0.8, 3.0, 8.0,94.7, 157.5, 17.9, 0.03193 714762, 38047, 38047, 1.3, 0.8, 3.0, 9.3, 102.6, 167.8, 19.0, 0.03001 756629, 38080, 38080, 1.3, 0.8, 3.2, 8.8, 101.7, 117.4, 20.1, 0.02847 802981, 38955, 38955, 1.3, 0.8, 3.0, 9.1, 105.2, 164.6, 21.3, 0.02708 847262, 38817, 38817, 1.3, 0.7, 3.2, 9.8, 112.1, 137.4, 22.5, 0.02581 887639, 38403, 38403, 1.3, 0.8, 2.9,10.0,99.1, 147.8, 23.5, 0.02470 929362, 35056, 35056, 1.4, 0.8, 3.3,11.5, 111.8, 149.3, 24.7, 0.02360 980996, 38247, 38247, 1.3, 0.8, 3.5, 8.3,78.8, 129.0, 26.1, 0.02338 100 , 39379, 39379, 1.2, 0.9, 3.1, 9.0,29.4, 83.8, 26.5, 0.02238 Results: real op rate : 37673 adjusted op rate stderr : 0 key rate : 37673 latency mean : 1.3 latency median: 0.8 latency 95th percentile : 3.2 latency 99th percentile : 10.4 latency 99.9th percentile : 92.1 latency max : 234.8 Total operation time : 00:00:26 
END (cassandra-2.1)mshuler@hana:~/git/cassandra$ ./bin/nodetool compact Keyspace1 (cassandra-2.1)mshuler@hana:~/git/cassandra$ ./bin/sstablesplit /var/lib/cassandra/data/Keyspace1/Standard1-*/Keyspace1-Standard1-ka-2-Data.db Exception in thread main java.lang.AssertionError at org.apache.cassandra.db.Keyspace.openWithoutSSTables(Keyspace.java:104) at org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:108) {noformat} There are no errors in system.log. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6863) Incorrect read repair of range thombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976686#comment-13976686 ]

Jonathan Ellis commented on CASSANDRA-6863:
---
committed

Incorrect read repair of range thombstones
--
Key: CASSANDRA-6863
URL: https://issues.apache.org/jira/browse/CASSANDRA-6863
Project: Cassandra
Issue Type: Bug
Environment: 2.0
Reporter: Oleg Anastasyev
Assignee: Oleg Anastasyev
Fix For: 2.1
Attachments: 6863-v2.txt, 6863-v2.txt, ReadRepairRangeThombstoneDiff.txt, ReadRepairsDebugLogger.txt

Rows with range tombstones are read repaired for every replica if RR is triggered (this is because CF.diff() returns non-null if !isEmpty(), and isEmpty() in turn returns false if the range tombstone list is not empty). Also, the full range tombstone list is sent to all nodes, which could be a problem if you have a wide partition.

Fixed this by evaluating the diff on range tombstone lists as well as on the deletionInfo of endpoint CF versions. Also return null from CF.diff if there is no diff in the RTL.

A second patch (ReadRepairsDebugLogger.txt) adds some debug logging to look at read repairs. You may find it useful as well.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976688#comment-13976688 ] Benedict commented on CASSANDRA-6696: - The problem here is packing vnodes fairly across the disks: either we need to ensure that all vnodes are of roughly equal size (very difficult), or we probably need to have a dynamic allocation strategy, and the problem with _that_ is that when the token range gets redistributed by node additions/removals, the whole cluster suddenly needs to start kicking off rebalancing of their local disks. We could support splitting the token range into M distinct chunks, where M is preferably some multiple of the number of disks, and split the total token range into M chunks, then allocate each chunk to a disk in round-robin fashion. This then remains deterministic, and it is I think easier to guarantee an even distribution within a given token range than it is to guarantee all vnodes are of equal size, whilst still supporting a dynamic cluster size. Even here, though, realistically I think we need the number of chunks to be quite a bit smaller than the number of vnodes to guarantee anything approaching balance of these chunks. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run. This can cause deleted data to come back in some cases. Also this is true for corrupt stables in which we delete the corrupt stable and run repair. Here is an example: Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. row=sankalp col=sankalp is written 20 days back and successfully went to all three nodes. 
Then a delete/tombstone was written successfully for the same row column 15 days back. Since this tombstone is more than gc grace, it got compacted in Nodes A and B since it got compacted with the actual data. So there is no trace of this row column in node A and B. Now in node C, say the original data is in drive1 and tombstone is in drive2. Compaction has not yet reclaimed the data and tombstone. Drive2 becomes corrupt and was replaced with new empty drive. Due to the replacement, the tombstone in now gone and row=sankalp col=sankalp has come back to life. Now after replacing the drive we run repair. This data will be propagated to all nodes. Note: This is still a problem even if we run repair every gc grace. -- This message was sent by Atlassian JIRA (v6.2#6252)
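Benedict's proposal in the CASSANDRA-6696 comment above (split the token range into M chunks, with M a multiple of the number of disks, and hand the chunks to disks round-robin) can be sketched as below. This is a minimal Python illustration with hypothetical names; the half-open token range `[lo, hi)` and the equal-width chunks are assumptions of the sketch, not details from the ticket.

```python
def chunk_for_token(token: int, num_chunks: int, lo: int, hi: int) -> int:
    """Index of the equal-width chunk of [lo, hi) that contains `token`."""
    if not lo <= token < hi:
        raise ValueError("token outside the token range")
    return (token - lo) * num_chunks // (hi - lo)


def disk_for_token(token: int, num_disks: int, chunks_per_disk: int,
                   lo: int, hi: int) -> int:
    """Disk that owns `token` when chunks are dealt to disks round-robin."""
    num_chunks = num_disks * chunks_per_disk  # M, a multiple of the disk count
    return chunk_for_token(token, num_chunks, lo, hi) % num_disks
```

The mapping depends only on the token, the range and the disk count, so it stays deterministic when nodes join or leave, which is the property the comment is after. How evenly data lands on each disk still depends on how evenly tokens are populated within each chunk.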
[jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size
[ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976690#comment-13976690 ] Jonathan Ellis commented on CASSANDRA-7031: --- bq. Do we mean that there may be a 128Mb gap after the most recent archive during which no PIT restore is possible? Seems like this would be a minimal problem, as the most recent CLS is still present in the CL directory Since one point of restore is, I don't have the CL directory anymore this is kind of a non-solution. we could always offer the ability to create a PITR point through force recycling the current CL segment at the requested time to make sure there is a separate backup. So now we're forcing users to add a cron job for PITR to work? I don't like that idea either. Increase default commit log total space + segment size -- Key: CASSANDRA-7031 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1 beta2 Attachments: 7031.txt I would like to increase the default commit log total space and segment size options for 64-bit JVMs: The current default of 1Gb and 32Mb is quite constrained and can have some (very minor) negative performance implications, for no major benefit: # 32Mb files are actually quite small, and if during the 10s interval we have completely filled multiple of them (quite easy) it would be more efficient to write fewer larger files, as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On my box this has a small but noticeable impact. Although I would expect on decent server hardware this would be smaller still, since we immediately drop the pages from cache on writing there isn't a great deal of advantage to keeping the files so small. 
The only advantage I can see is that during a drop KS/CF or other event that forces log rollover we're wasting less space until log recycling. 128-256Mb are modest increases that seem more appropriate to me. # 1Gb is too small for the default total log space. We can find that we force memtable flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as a result of increased effective memtable space from recent improvements), especially on machines with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having more log data is that replay on restart may be slightly slower, but since most of the events will be ignored it should be relatively benign, and I would rather take the penalty on startup instead of during running, no matter how small the running penalty. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-7031) Increase default commit log total space + segment size
[ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976690#comment-13976690 ] Jonathan Ellis edited comment on CASSANDRA-7031 at 4/22/14 12:31 PM: - bq. Do we mean that there may be a 128Mb gap after the most recent archive during which no PIT restore is possible? Seems like this would be a minimal problem, as the most recent CLS is still present in the CL directory Since one point of restore is, I don't have the CL directory anymore this is kind of a non-solution. bq. we could always offer the ability to create a PITR point through force recycling the current CL segment at the requested time to make sure there is a separate backup. So now we're forcing users to add a cron job for PITR to work? I don't like that idea either. was (Author: jbellis): bq. Do we mean that there may be a 128Mb gap after the most recent archive during which no PIT restore is possible? Seems like this would be a minimal problem, as the most recent CLS is still present in the CL directory Since one point of restore is, I don't have the CL directory anymore this is kind of a non-solution. we could always offer the ability to create a PITR point through force recycling the current CL segment at the requested time to make sure there is a separate backup. So now we're forcing users to add a cron job for PITR to work? I don't like that idea either. 
Increase default commit log total space + segment size -- Key: CASSANDRA-7031 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1 beta2 Attachments: 7031.txt I would like to increase the default commit log total space and segment size options for 64-bit JVMs: The current default of 1Gb and 32Mb is quite constrained and can have some (very minor) negative performance implications, for no major benefit: # 32Mb files are actually quite small, and if during the 10s interval we have completely filled multiple of them (quite easy) it would be more efficient to write fewer larger files, as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On my box this has a small but noticeable impact. Although I would expect on decent server hardware this would be smaller still, since we immediately drop the pages from cache on writing there isn't a great deal of advantage to keeping the files so small. The only advantage I can see is that during a drop KS/CF or other event that forces log rollover we're wasting less space until log recycling. 128-256Mb are modest increases that seem more appropriate to me. # 1Gb is too small for the default total log space. We can find that we force memtable flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as a result of increased effective memtable space from recent improvements), especially on machines with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having more log data is that replay on restart may be slightly slower, but since most of the events will be ignored it should be relatively benign, and I would rather take the penalty on startup instead of during running, no matter how small the running penalty. -- This message was sent by Atlassian JIRA (v6.2#6252)
[03/19] git commit: CHANGES
CHANGES

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75b87ce8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75b87ce8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75b87ce8

Branch: refs/heads/cassandra-2.1
Commit: 75b87ce86bb729c38832acfb0edb6ab84a0deefd
Parents: 655ae7a
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Apr 4 15:33:47 2014 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Wed Apr 9 09:07:04 2014 -0500

--
 CHANGES.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75b87ce8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 38a6c3c..b642908 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,6 @@
 2.0.7
- * Allow compaction of system tables during startup (CASSANDRA-6913)
+ * Avoid early loading of non-system keyspaces before compaction-leftovers
+   cleanup at startup (CASSANDRA-6913)
  * Restrict Windows to parallel repairs (CASSANDRA-6907)
  * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436)
  * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[02/19] git commit: cleanup
cleanup

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/655ae7a8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/655ae7a8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/655ae7a8

Branch: refs/heads/cassandra-2.0
Commit: 655ae7a8f9c4582eb6c303e005df6edbe76bd732
Parents: 2dd0907
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri Apr 4 12:34:12 2014 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Wed Apr 9 09:07:04 2014 -0500

--
 src/java/org/apache/cassandra/db/DeletionTime.java | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/655ae7a8/src/java/org/apache/cassandra/db/DeletionTime.java
--
diff --git a/src/java/org/apache/cassandra/db/DeletionTime.java b/src/java/org/apache/cassandra/db/DeletionTime.java
index b80422c..a1b9f17 100644
--- a/src/java/org/apache/cassandra/db/DeletionTime.java
+++ b/src/java/org/apache/cassandra/db/DeletionTime.java
@@ -122,10 +122,9 @@ public class DeletionTime implements Comparable<DeletionTime>
         {
             int ldt = in.readInt();
             long mfda = in.readLong();
-            if (mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE)
-                return LIVE;
-            else
-                return new DeletionTime(mfda, ldt);
+            return mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE
+                 ? LIVE
+                 : new DeletionTime(mfda, ldt);
         }
 
         public long serializedSize(DeletionTime delTime, TypeSizes typeSizes)
[11/19] git commit: Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0
Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/364282a7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/364282a7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/364282a7 Branch: refs/heads/cassandra-2.1 Commit: 364282a7d43c475b82fc169cec751fba899e3101 Parents: cdfe4e0 48d7e40 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:20:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:20:32 2014 -0500 -- CHANGES.txt | 14 ++- doc/cql3/CQL.textile| 16 ++- pylib/cqlshlib/cql3handling.py | 24 +++-- .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 3 +- .../cql3/statements/SelectStatement.java| 34 --- .../apache/cassandra/db/BatchlogManager.java| 102 +++ .../apache/cassandra/db/ColumnFamilyStore.java | 6 -- .../cassandra/db/HintedHandOffManager.java | 21 +--- .../org/apache/cassandra/db/SystemKeyspace.java | 60 +++ .../db/commitlog/CommitLogReplayer.java | 12 +-- .../apache/cassandra/service/StorageProxy.java | 9 +- .../cassandra/service/pager/QueryPager.java | 2 +- .../org/apache/cassandra/tools/NodeCmd.java | 5 +- .../apache/cassandra/tools/NodeToolHelp.yaml| 2 +- .../cassandra/db/BatchlogManagerTest.java | 78 -- .../apache/cassandra/db/HintedHandOffTest.java | 19 ++-- 17 files changed, 272 insertions(+), 137 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/364282a7/CHANGES.txt -- diff --cc CHANGES.txt index 01830ef,5d47cfa..791586c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,7 -1,15 +1,16 @@@ + 2.0.8 + * Queries on compact tables can return more rows that requested (CASSANDRA-7052) + * USING TIMESTAMP for batches does not work (CASSANDRA-7053) + Merged from 1.2: + * Fix batchlog to account for CF truncation records (CASSANDRA-6999) + * Fix CQLSH parsing of functions and BLOB literals 
(CASSANDRA-7018) + * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) + + 2.0.7 * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[17/19] git commit: Add range tombstones to read repair digests patch by Oleg Anastasyev; reviewed by jbellis for CASSANDRA-6863
Add range tombstones to read repair digests patch by Oleg Anastasyev; reviewed by jbellis for CASSANDRA-6863 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2e74354 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2e74354 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2e74354 Branch: refs/heads/cassandra-2.1 Commit: a2e74354ca51809a11b62dd7995c026807683b0a Parents: be2686d Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:27:20 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:27:20 2014 -0500 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnFamily.java | 5 ++ .../org/apache/cassandra/db/DeletionInfo.java | 34 +- .../apache/cassandra/db/RangeTombstoneList.java | 68 ++-- .../apache/cassandra/net/MessagingService.java | 32 - test/unit/org/apache/cassandra/Util.java| 7 ++ .../apache/cassandra/db/ColumnFamilyTest.java | 46 - test/unit/org/apache/cassandra/db/RowTest.java | 40 ++-- 8 files changed, 222 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ae7410e..495dab2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.0-beta2 + * Add range tombstones to read repair digests (CASSANDRA-6863) * Fix BTree.clear for large updates (CASSANDRA-6943) * Fail write instead of logging a warning when unable to append to CL (CASSANDRA-6764) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/src/java/org/apache/cassandra/db/ColumnFamily.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamily.java b/src/java/org/apache/cassandra/db/ColumnFamily.java index da404b0..4f85610 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamily.java +++ b/src/java/org/apache/cassandra/db/ColumnFamily.java @@ -313,8 +313,11 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry } } +
cfDiff.setDeletionInfo(deletionInfo().diff(cfComposite.deletionInfo())); + if (!cfDiff.isEmpty()) return cfDiff; + return null; } @@ -385,6 +388,8 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry { for (Cell cell : this) cell.updateDigest(digest); +if (MessagingService.instance().areAllNodesAtLeast21()) +deletionInfo().updateDigest(digest); } public static ColumnFamily diff(ColumnFamily cf1, ColumnFamily cf2) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/src/java/org/apache/cassandra/db/DeletionInfo.java -- diff --git a/src/java/org/apache/cassandra/db/DeletionInfo.java b/src/java/org/apache/cassandra/db/DeletionInfo.java index 8601bce..a167b85 100644 --- a/src/java/org/apache/cassandra/db/DeletionInfo.java +++ b/src/java/org/apache/cassandra/db/DeletionInfo.java @@ -19,7 +19,9 @@ package org.apache.cassandra.db; import java.io.DataInput; import java.io.IOException; -import java.util.*; +import java.security.MessageDigest; +import java.util.Comparator; +import java.util.Iterator; import com.google.common.base.Objects; import com.google.common.collect.Iterators; @@ -29,6 +31,7 @@ import org.apache.cassandra.db.composites.CType; import org.apache.cassandra.db.composites.Composite; import org.apache.cassandra.io.IVersionedSerializer; import org.apache.cassandra.io.util.DataOutputPlus; +import org.apache.cassandra.utils.ByteBufferUtil; import org.apache.cassandra.utils.ObjectSizes; /** @@ -168,6 +171,35 @@ public class DeletionInfo implements IMeasurableMemory } /** + * Evaluates difference between this deletion info and superset for read repair + * + * @return the difference between the two, or LIVE if no difference + */ +public DeletionInfo diff(DeletionInfo superset) +{ +RangeTombstoneList rangeDiff = superset.ranges == null || superset.ranges.isEmpty() + ? null + : ranges == null ?
superset.ranges : ranges.diff(superset.ranges); + +return topLevel.markedForDeleteAt != superset.topLevel.markedForDeleteAt || rangeDiff != null + ? new DeletionInfo(superset.topLevel, rangeDiff) + : DeletionInfo.live(); +} + + +/** + * Digests deletion info. Used to trigger read repair on mismatch. + */ +public void
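The decision structure of the diff(DeletionInfo superset) method in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: DeletionInfoSketch and missingFrom are hypothetical names, range tombstones are modelled as a plain set of strings, and missingFrom is assumed to return null when nothing is missing (which the patch's "rangeDiff != null" check suggests, but RangeTombstoneList.diff itself is not shown in this message).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for DeletionInfo; not the Cassandra class.
final class DeletionInfoSketch
{
    // Sentinel for "no deletion information at all".
    static final DeletionInfoSketch LIVE = new DeletionInfoSketch(Long.MIN_VALUE, null);

    final long markedForDeleteAt;   // top-level deletion timestamp
    final Set<String> ranges;       // stand-in for a RangeTombstoneList; may be null

    DeletionInfoSketch(long markedForDeleteAt, Set<String> ranges)
    {
        this.markedForDeleteAt = markedForDeleteAt;
        this.ranges = ranges;
    }

    // Assumption: entries present in superset but absent from ours, or null if there are none.
    static Set<String> missingFrom(Set<String> ours, Set<String> superset)
    {
        Set<String> missing = new LinkedHashSet<>(superset);
        missing.removeAll(ours);
        return missing.isEmpty() ? null : missing;
    }

    // Mirrors the shape of the patched method: LIVE when there is nothing to repair.
    DeletionInfoSketch diff(DeletionInfoSketch superset)
    {
        Set<String> rangeDiff = superset.ranges == null || superset.ranges.isEmpty()
                              ? null
                              : ranges == null ? superset.ranges : missingFrom(ranges, superset.ranges);

        return markedForDeleteAt != superset.markedForDeleteAt || rangeDiff != null
             ? new DeletionInfoSketch(superset.markedForDeleteAt, rangeDiff)
             : LIVE;
    }
}
```

In this model a replica that already holds everything in the superset gets LIVE back, so no repair mutation would need to carry deletion information for it.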
[jira] [Commented] (CASSANDRA-6996) Setting severity via JMX broken
[ https://issues.apache.org/jira/browse/CASSANDRA-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976696#comment-13976696 ] Jonathan Ellis commented on CASSANDRA-6996: --- Rick, are you planning to have a look at the patch? Setting severity via JMX broken --- Key: CASSANDRA-6996 URL: https://issues.apache.org/jira/browse/CASSANDRA-6996 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Rick Branson Assignee: Vijay Priority: Minor Fix For: 2.0.8 Attachments: 0001-CASSANDRA-6996.patch Looks like setting the Severity attribute in the DynamicEndpointSnitch via JMX is a no-op.
[18/19] git commit: Add range tombstones to read repair digests patch by Oleg Anastasyev; reviewed by jbellis for CASSANDRA-6863
Add range tombstones to read repair digests patch by Oleg Anastasyev; reviewed by jbellis for CASSANDRA-6863 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a2e74354 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a2e74354 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a2e74354 Branch: refs/heads/trunk Commit: a2e74354ca51809a11b62dd7995c026807683b0a Parents: be2686d Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:27:20 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:27:20 2014 -0500 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnFamily.java | 5 ++ .../org/apache/cassandra/db/DeletionInfo.java | 34 +- .../apache/cassandra/db/RangeTombstoneList.java | 68 ++-- .../apache/cassandra/net/MessagingService.java | 32 - test/unit/org/apache/cassandra/Util.java| 7 ++ .../apache/cassandra/db/ColumnFamilyTest.java | 46 - test/unit/org/apache/cassandra/db/RowTest.java | 40 ++-- 8 files changed, 222 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ae7410e..495dab2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.0-beta2 + * Add range tombstones to read repair digests (CASSANDRA-6863) * Fix BTree.clear for large updates (CASSANDRA-6943) * Fail write instead of logging a warning when unable to append to CL (CASSANDRA-6764) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/src/java/org/apache/cassandra/db/ColumnFamily.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamily.java b/src/java/org/apache/cassandra/db/ColumnFamily.java index da404b0..4f85610 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamily.java +++ b/src/java/org/apache/cassandra/db/ColumnFamily.java @@ -313,8 +313,11 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry } } +
cfDiff.setDeletionInfo(deletionInfo().diff(cfComposite.deletionInfo())); + if (!cfDiff.isEmpty()) return cfDiff; + return null; } @@ -385,6 +388,8 @@ public abstract class ColumnFamily implements Iterable<Cell>, IRowCacheEntry { for (Cell cell : this) cell.updateDigest(digest); +if (MessagingService.instance().areAllNodesAtLeast21()) +deletionInfo().updateDigest(digest); } public static ColumnFamily diff(ColumnFamily cf1, ColumnFamily cf2) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a2e74354/src/java/org/apache/cassandra/db/DeletionInfo.java -- diff --git a/src/java/org/apache/cassandra/db/DeletionInfo.java b/src/java/org/apache/cassandra/db/DeletionInfo.java index 8601bce..a167b85 100644 --- a/src/java/org/apache/cassandra/db/DeletionInfo.java +++ b/src/java/org/apache/cassandra/db/DeletionInfo.java @@ -19,7 +19,9 @@ package org.apache.cassandra.db; import java.io.DataInput; import java.io.IOException; -import java.util.*; +import java.security.MessageDigest; +import java.util.Comparator; +import java.util.Iterator; import com.google.common.base.Objects; import com.google.common.collect.Iterators; @@ -29,6 +31,7 @@ import org.apache.cassandra.db.composites.CType; import org.apache.cassandra.db.composites.Composite; import org.apache.cassandra.io.IVersionedSerializer; import org.apache.cassandra.io.util.DataOutputPlus; +import org.apache.cassandra.utils.ByteBufferUtil; import org.apache.cassandra.utils.ObjectSizes; /** @@ -168,6 +171,35 @@ public class DeletionInfo implements IMeasurableMemory } /** + * Evaluates difference between this deletion info and superset for read repair + * + * @return the difference between the two, or LIVE if no difference + */ +public DeletionInfo diff(DeletionInfo superset) +{ +RangeTombstoneList rangeDiff = superset.ranges == null || superset.ranges.isEmpty() + ? null + : ranges == null ?
superset.ranges : ranges.diff(superset.ranges); + +return topLevel.markedForDeleteAt != superset.topLevel.markedForDeleteAt || rangeDiff != null + ? new DeletionInfo(superset.topLevel, rangeDiff) + : DeletionInfo.live(); +} + + +/** + * Digests deletion info. Used to trigger read repair on mismatch. + */ +public void updateDigest(MessageDigest
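The ColumnFamily.updateDigest hunk above folds deletion info into the read-repair digest only once MessagingService reports that all nodes are at least 2.1. The reason for the gate can be shown with a small stand-alone sketch. Everything here is hypothetical: DigestSketch is not a Cassandra class, cells are modelled as strings, the deletion info is reduced to a single timestamp, and MD5 is used only because it is readily available in the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical illustration of a version-gated digest; not the Cassandra implementation.
final class DigestSketch
{
    static byte[] digest(List<String> cells, long markedForDeleteAt, boolean allNodesAtLeast21)
    {
        MessageDigest digest;
        try
        {
            digest = MessageDigest.getInstance("MD5");
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException(e);
        }

        // Cells are always part of the digest.
        for (String cell : cells)
            digest.update(cell.getBytes(StandardCharsets.UTF_8));

        // Deletion info is added only when every node is known to compute it the same way;
        // otherwise old and new nodes would disagree on identical data.
        if (allNodesAtLeast21)
            digest.update(ByteBuffer.allocate(Long.BYTES).putLong(markedForDeleteAt).array());

        return digest.digest();
    }
}
```

With the flag set, two replicas holding the same cells but different deletion timestamps produce different digests, which is what would trigger read repair. With the flag clear they still match, as they did before the patch.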
[13/19] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4305bd40 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4305bd40 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4305bd40 Branch: refs/heads/cassandra-2.1 Commit: 4305bd40f619e1d4ebdfc5873c082deb82aac5f0 Parents: 3e6b299 364282a Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:20:45 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:20:45 2014 -0500 -- CHANGES.txt| 3 ++- src/java/org/apache/cassandra/db/DeletionTime.java | 7 +++ 2 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4305bd40/CHANGES.txt -- diff --cc CHANGES.txt index d94f13b,791586c..ae7410e --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,55 -1,16 +1,56 @@@ -2.0.8 - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) -Merged from 1.2: - * Fix batchlog to account for CF truncation records (CASSANDRA-6999) - * Fix CQLSH parsing of functions and BLOB literals (CASSANDRA-7018) +2.1.0-beta2 + * Fix BTree.clear for large updates (CASSANDRA-6943) + * Fail write instead of logging a warning when unable to append to CL + (CASSANDRA-6764) + * Eliminate possibility of CL segment appearing twice in active list + (CASSANDRA-6557) + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + 
* Allow nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) + * Fix help message for stress counter_write (CASSANDRA-6824) + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848) + * Add logging levels (minimal, normal or verbose) to stress tool (CASSANDRA-6849) + * Fix race condition in Batch CLE (CASSANDRA-6860) + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774) + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781) + * Proper compare function for CollectionType (CASSANDRA-6783) + * Update native server to Netty 4 (CASSANDRA-6236) + * Fix off-by-one error in stress (CASSANDRA-6883) + * Make OpOrder AutoCloseable (CASSANDRA-6901) + * Remove sync repair JMX interface (CASSANDRA-6900) + * Add multiple memory allocation options for memtables (CASSANDRA-6689) + * Remove adjusted op rate from stress output (CASSANDRA-6921) + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941) + * Serialize batchlog mutations with the version of the target node + (CASSANDRA-6931) + * Optimize CounterColumn#reconcile() (CASSANDRA-6953) + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) + * Lock counter cells, not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912) + * Add failure handler to async callback (CASSANDRA-6747) + * Fix AE when closing 
SSTable without releasing reference (CASSANDRA-7000) + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924) + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024) * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) - - -2.0.7 +Merged from 2.0: * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers +cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in
[15/19] git commit: remove invalid assert patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6987
remove invalid assert patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6987 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/be2686dd Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/be2686dd Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/be2686dd Branch: refs/heads/trunk Commit: be2686dd94582abbdd20e0b1b3088d52eae4fbf2 Parents: 4305bd4 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:23:30 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:23:30 2014 -0500 -- src/java/org/apache/cassandra/db/Keyspace.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2686dd/src/java/org/apache/cassandra/db/Keyspace.java -- diff --git a/src/java/org/apache/cassandra/db/Keyspace.java b/src/java/org/apache/cassandra/db/Keyspace.java index 1c3df77..31e68c1 100644 --- a/src/java/org/apache/cassandra/db/Keyspace.java +++ b/src/java/org/apache/cassandra/db/Keyspace.java @@ -99,9 +99,9 @@ public class Keyspace return open(keyspaceName, Schema.instance, true); } +// to only be used by org.apache.cassandra.tools.Standalone* classes public static Keyspace openWithoutSSTables(String keyspaceName) { -assert initialized || keyspaceName.equals(SYSTEM_KS); return open(keyspaceName, Schema.instance, false); }
[08/19] git commit: merge
merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cdfe4e03 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cdfe4e03 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cdfe4e03 Branch: refs/heads/cassandra-2.1 Commit: cdfe4e0338bcb0f7f8d04af4b328a1e86c481ce2 Parents: 75b87ce 7dbbe92 Author: Jonathan Ellis jbel...@apache.org Authored: Thu Apr 17 23:14:25 2014 +0200 Committer: Jonathan Ellis jbel...@apache.org Committed: Thu Apr 17 23:14:25 2014 +0200 -- CHANGES.txt | 9 +- NEWS.txt| 11 +- build.xml | 4 +- debian/changelog| 6 + .../apache/cassandra/db/BatchlogManager.java| 2 +- .../cassandra/net/OutboundTcpConnection.java| 12 +- .../cassandra/service/MigrationManager.java | 12 +- .../cassandra/service/StorageService.java | 138 +++ .../cassandra/streaming/ConnectionHandler.java | 2 +- .../cassandra/streaming/StreamSession.java | 6 +- .../cassandra/streaming/StreamTransferTask.java | 4 +- 11 files changed, 136 insertions(+), 70 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/cdfe4e03/CHANGES.txt -- diff --cc CHANGES.txt index b642908,451d046..01830ef --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,6 -1,6 +1,7 @@@ 2.0.7 + * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[07/19] git commit: merge
merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cdfe4e03 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cdfe4e03 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cdfe4e03 Branch: refs/heads/cassandra-2.0 Commit: cdfe4e0338bcb0f7f8d04af4b328a1e86c481ce2 Parents: 75b87ce 7dbbe92 Author: Jonathan Ellis jbel...@apache.org Authored: Thu Apr 17 23:14:25 2014 +0200 Committer: Jonathan Ellis jbel...@apache.org Committed: Thu Apr 17 23:14:25 2014 +0200 -- CHANGES.txt | 9 +- NEWS.txt| 11 +- build.xml | 4 +- debian/changelog| 6 + .../apache/cassandra/db/BatchlogManager.java| 2 +- .../cassandra/net/OutboundTcpConnection.java| 12 +- .../cassandra/service/MigrationManager.java | 12 +- .../cassandra/service/StorageService.java | 138 +++ .../cassandra/streaming/ConnectionHandler.java | 2 +- .../cassandra/streaming/StreamSession.java | 6 +- .../cassandra/streaming/StreamTransferTask.java | 4 +- 11 files changed, 136 insertions(+), 70 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/cdfe4e03/CHANGES.txt -- diff --cc CHANGES.txt index b642908,451d046..01830ef --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,6 -1,6 +1,7 @@@ 2.0.7 + * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[06/19] git commit: cleanup
cleanup Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/655ae7a8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/655ae7a8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/655ae7a8 Branch: refs/heads/trunk Commit: 655ae7a8f9c4582eb6c303e005df6edbe76bd732 Parents: 2dd0907 Author: Jonathan Ellis jbel...@apache.org Authored: Fri Apr 4 12:34:12 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Apr 9 09:07:04 2014 -0500 -- src/java/org/apache/cassandra/db/DeletionTime.java | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/655ae7a8/src/java/org/apache/cassandra/db/DeletionTime.java -- diff --git a/src/java/org/apache/cassandra/db/DeletionTime.java b/src/java/org/apache/cassandra/db/DeletionTime.java index b80422c..a1b9f17 100644 --- a/src/java/org/apache/cassandra/db/DeletionTime.java +++ b/src/java/org/apache/cassandra/db/DeletionTime.java @@ -122,10 +122,9 @@ public class DeletionTime implements Comparable<DeletionTime> { int ldt = in.readInt(); long mfda = in.readLong(); -if (mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE) -return LIVE; -else -return new DeletionTime(mfda, ldt); +return mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE + ? LIVE + : new DeletionTime(mfda, ldt); } public long serializedSize(DeletionTime delTime, TypeSizes typeSizes)
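The deserializer touched by this cleanup maps the pair (Long.MIN_VALUE, Integer.MAX_VALUE) back to the shared LIVE instance instead of allocating a new object. A minimal round-trip sketch of that pattern follows, assuming the write order matches the read order shown in the diff (int first, then long). DeletionTimeSketch and its serialize helper are hypothetical and stand in for the real class and its serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for DeletionTime; only the two fields named in the diff are modelled.
final class DeletionTimeSketch
{
    // Shared sentinel meaning "nothing is deleted".
    static final DeletionTimeSketch LIVE = new DeletionTimeSketch(Long.MIN_VALUE, Integer.MAX_VALUE);

    final long markedForDeleteAt;
    final int localDeletionTime;

    DeletionTimeSketch(long markedForDeleteAt, int localDeletionTime)
    {
        this.markedForDeleteAt = markedForDeleteAt;
        this.localDeletionTime = localDeletionTime;
    }

    // Assumed wire order: local deletion time (int), then deletion timestamp (long).
    static byte[] serialize(DeletionTimeSketch dt)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            out.writeInt(dt.localDeletionTime);
            out.writeLong(dt.markedForDeleteAt);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Same conditional expression as the patched method: reuse LIVE when both sentinel values match.
    static DeletionTimeSketch deserialize(byte[] data)
    {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data)))
        {
            int ldt = in.readInt();
            long mfda = in.readLong();
            return mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE
                 ? LIVE
                 : new DeletionTimeSketch(mfda, ldt);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

Returning the shared instance keeps reference comparisons against LIVE valid after a round trip and avoids one allocation per live row, which is presumably why the original code special-cased it.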
[05/19] git commit: CHANGES
CHANGES Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75b87ce8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75b87ce8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75b87ce8 Branch: refs/heads/trunk Commit: 75b87ce86bb729c38832acfb0edb6ab84a0deefd Parents: 655ae7a Author: Jonathan Ellis jbel...@apache.org Authored: Fri Apr 4 15:33:47 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Apr 9 09:07:04 2014 -0500 -- CHANGES.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/75b87ce8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 38a6c3c..b642908 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,5 +1,6 @@ 2.0.7 - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[01/19] git commit: CHANGES
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 48d7e4080 -> 364282a7d refs/heads/cassandra-2.1 3e6b29925 -> a2e74354c refs/heads/trunk 68aa62bde -> 498eb2ab7 CHANGES Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75b87ce8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75b87ce8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75b87ce8 Branch: refs/heads/cassandra-2.0 Commit: 75b87ce86bb729c38832acfb0edb6ab84a0deefd Parents: 655ae7a Author: Jonathan Ellis jbel...@apache.org Authored: Fri Apr 4 15:33:47 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Apr 9 09:07:04 2014 -0500 -- CHANGES.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/75b87ce8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 38a6c3c..b642908 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,5 +1,6 @@ 2.0.7 - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[04/19] git commit: cleanup
cleanup Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/655ae7a8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/655ae7a8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/655ae7a8 Branch: refs/heads/cassandra-2.1 Commit: 655ae7a8f9c4582eb6c303e005df6edbe76bd732 Parents: 2dd0907 Author: Jonathan Ellis jbel...@apache.org Authored: Fri Apr 4 12:34:12 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Wed Apr 9 09:07:04 2014 -0500 -- src/java/org/apache/cassandra/db/DeletionTime.java | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/655ae7a8/src/java/org/apache/cassandra/db/DeletionTime.java -- diff --git a/src/java/org/apache/cassandra/db/DeletionTime.java b/src/java/org/apache/cassandra/db/DeletionTime.java index b80422c..a1b9f17 100644 --- a/src/java/org/apache/cassandra/db/DeletionTime.java +++ b/src/java/org/apache/cassandra/db/DeletionTime.java @@ -122,10 +122,9 @@ public class DeletionTime implements Comparable<DeletionTime> { int ldt = in.readInt(); long mfda = in.readLong(); -if (mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE) -return LIVE; -else -return new DeletionTime(mfda, ldt); +return mfda == Long.MIN_VALUE && ldt == Integer.MAX_VALUE + ? LIVE + : new DeletionTime(mfda, ldt); } public long serializedSize(DeletionTime delTime, TypeSizes typeSizes)
[09/19] git commit: merge
merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cdfe4e03 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cdfe4e03 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cdfe4e03 Branch: refs/heads/trunk Commit: cdfe4e0338bcb0f7f8d04af4b328a1e86c481ce2 Parents: 75b87ce 7dbbe92 Author: Jonathan Ellis jbel...@apache.org Authored: Thu Apr 17 23:14:25 2014 +0200 Committer: Jonathan Ellis jbel...@apache.org Committed: Thu Apr 17 23:14:25 2014 +0200 -- CHANGES.txt | 9 +- NEWS.txt| 11 +- build.xml | 4 +- debian/changelog| 6 + .../apache/cassandra/db/BatchlogManager.java| 2 +- .../cassandra/net/OutboundTcpConnection.java| 12 +- .../cassandra/service/MigrationManager.java | 12 +- .../cassandra/service/StorageService.java | 138 +++ .../cassandra/streaming/ConnectionHandler.java | 2 +- .../cassandra/streaming/StreamSession.java | 6 +- .../cassandra/streaming/StreamTransferTask.java | 4 +- 11 files changed, 136 insertions(+), 70 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/cdfe4e03/CHANGES.txt -- diff --cc CHANGES.txt index b642908,451d046..01830ef --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,6 -1,6 +1,7 @@@ 2.0.7 + * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[12/19] git commit: Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0
Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/364282a7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/364282a7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/364282a7 Branch: refs/heads/trunk Commit: 364282a7d43c475b82fc169cec751fba899e3101 Parents: cdfe4e0 48d7e40 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:20:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:20:32 2014 -0500 -- CHANGES.txt | 14 ++- doc/cql3/CQL.textile| 16 ++- pylib/cqlshlib/cql3handling.py | 24 +++-- .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 3 +- .../cql3/statements/SelectStatement.java| 34 --- .../apache/cassandra/db/BatchlogManager.java| 102 +++ .../apache/cassandra/db/ColumnFamilyStore.java | 6 -- .../cassandra/db/HintedHandOffManager.java | 21 +--- .../org/apache/cassandra/db/SystemKeyspace.java | 60 +++ .../db/commitlog/CommitLogReplayer.java | 12 +-- .../apache/cassandra/service/StorageProxy.java | 9 +- .../cassandra/service/pager/QueryPager.java | 2 +- .../org/apache/cassandra/tools/NodeCmd.java | 5 +- .../apache/cassandra/tools/NodeToolHelp.yaml| 2 +- .../cassandra/db/BatchlogManagerTest.java | 78 -- .../apache/cassandra/db/HintedHandOffTest.java | 19 ++-- 17 files changed, 272 insertions(+), 137 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/364282a7/CHANGES.txt -- diff --cc CHANGES.txt index 01830ef,5d47cfa..791586c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,7 -1,15 +1,16 @@@ + 2.0.8 + * Queries on compact tables can return more rows that requested (CASSANDRA-7052) + * USING TIMESTAMP for batches does not work (CASSANDRA-7053) + Merged from 1.2: + * Fix batchlog to account for CF truncation records (CASSANDRA-6999) + * Fix CQLSH parsing of functions and BLOB literals 
(CASSANDRA-7018) + * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) + + 2.0.7 * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[19/19] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/498eb2ab Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/498eb2ab Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/498eb2ab Branch: refs/heads/trunk Commit: 498eb2ab77d748b5b638b82450edd968fe5ca73a Parents: 68aa62b a2e7435 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:27:42 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:27:42 2014 -0500 -- CHANGES.txt | 4 +- .../org/apache/cassandra/db/ColumnFamily.java | 5 ++ .../org/apache/cassandra/db/DeletionInfo.java | 34 +- .../org/apache/cassandra/db/DeletionTime.java | 7 +- src/java/org/apache/cassandra/db/Keyspace.java | 2 +- .../apache/cassandra/db/RangeTombstoneList.java | 68 ++-- .../apache/cassandra/net/MessagingService.java | 32 - test/unit/org/apache/cassandra/Util.java| 7 ++ .../apache/cassandra/db/ColumnFamilyTest.java | 46 - test/unit/org/apache/cassandra/db/RowTest.java | 40 ++-- 10 files changed, 228 insertions(+), 17 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/498eb2ab/CHANGES.txt -- diff --cc CHANGES.txt index d889278,495dab2..742f7de --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,13 -1,5 +1,14 @@@ +3.0 + * Move sstable RandomAccessReader to nio2, which allows using the + FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050) + * Remove CQL2 (CASSANDRA-5918) + * Add Thrift get_multi_slice call (CASSANDRA-6757) + * Optimize fetching multiple cells by name (CASSANDRA-6933) + * Allow compilation in java 8 (CASSANDRA-7208) + + 2.1.0-beta2 + * Add range tombstones to read repair digests (CASSANDRA-6863) * Fix BTree.clear for large updates (CASSANDRA-6943) * Fail write instead of logging a warning when unable to append to CL (CASSANDRA-6764) http://git-wip-us.apache.org/repos/asf/cassandra/blob/498eb2ab/src/java/org/apache/cassandra/db/ColumnFamily.java -- 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/498eb2ab/src/java/org/apache/cassandra/net/MessagingService.java --
[10/19] git commit: Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0
Merge remote-tracking branch 'origin/cassandra-2.0' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/364282a7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/364282a7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/364282a7 Branch: refs/heads/cassandra-2.0 Commit: 364282a7d43c475b82fc169cec751fba899e3101 Parents: cdfe4e0 48d7e40 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:20:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:20:32 2014 -0500 -- CHANGES.txt | 14 ++- doc/cql3/CQL.textile| 16 ++- pylib/cqlshlib/cql3handling.py | 24 +++-- .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 3 +- .../cql3/statements/SelectStatement.java| 34 --- .../apache/cassandra/db/BatchlogManager.java| 102 +++ .../apache/cassandra/db/ColumnFamilyStore.java | 6 -- .../cassandra/db/HintedHandOffManager.java | 21 +--- .../org/apache/cassandra/db/SystemKeyspace.java | 60 +++ .../db/commitlog/CommitLogReplayer.java | 12 +-- .../apache/cassandra/service/StorageProxy.java | 9 +- .../cassandra/service/pager/QueryPager.java | 2 +- .../org/apache/cassandra/tools/NodeCmd.java | 5 +- .../apache/cassandra/tools/NodeToolHelp.yaml| 2 +- .../cassandra/db/BatchlogManagerTest.java | 78 -- .../apache/cassandra/db/HintedHandOffTest.java | 19 ++-- 17 files changed, 272 insertions(+), 137 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/364282a7/CHANGES.txt -- diff --cc CHANGES.txt index 01830ef,5d47cfa..791586c --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,7 -1,15 +1,16 @@@ + 2.0.8 + * Queries on compact tables can return more rows that requested (CASSANDRA-7052) + * USING TIMESTAMP for batches does not work (CASSANDRA-7053) + Merged from 1.2: + * Fix batchlog to account for CF truncation records (CASSANDRA-6999) + * Fix CQLSH parsing of functions and BLOB literals 
(CASSANDRA-7018) + * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) + + 2.0.7 * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers + cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF (CASSANDRA-6436) * Fix NPE in MeteredFlusher (CASSANDRA-6820)
[16/19] git commit: remove invalid assert patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6987
remove invalid assert patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6987 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/be2686dd Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/be2686dd Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/be2686dd Branch: refs/heads/cassandra-2.1 Commit: be2686dd94582abbdd20e0b1b3088d52eae4fbf2 Parents: 4305bd4 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:23:30 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:23:30 2014 -0500 -- src/java/org/apache/cassandra/db/Keyspace.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/be2686dd/src/java/org/apache/cassandra/db/Keyspace.java -- diff --git a/src/java/org/apache/cassandra/db/Keyspace.java b/src/java/org/apache/cassandra/db/Keyspace.java index 1c3df77..31e68c1 100644 --- a/src/java/org/apache/cassandra/db/Keyspace.java +++ b/src/java/org/apache/cassandra/db/Keyspace.java @@ -99,9 +99,9 @@ public class Keyspace return open(keyspaceName, Schema.instance, true); } +// to only be used by org.apache.cassandra.tools.Standalone* classes public static Keyspace openWithoutSSTables(String keyspaceName) { -assert initialized || keyspaceName.equals(SYSTEM_KS); return open(keyspaceName, Schema.instance, false); }
[14/19] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4305bd40 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4305bd40 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4305bd40 Branch: refs/heads/trunk Commit: 4305bd40f619e1d4ebdfc5873c082deb82aac5f0 Parents: 3e6b299 364282a Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:20:45 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:20:45 2014 -0500 -- CHANGES.txt| 3 ++- src/java/org/apache/cassandra/db/DeletionTime.java | 7 +++ 2 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4305bd40/CHANGES.txt -- diff --cc CHANGES.txt index d94f13b,791586c..ae7410e --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,55 -1,16 +1,56 @@@ -2.0.8 - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) -Merged from 1.2: - * Fix batchlog to account for CF truncation records (CASSANDRA-6999) - * Fix CQLSH parsing of functions and BLOB literals (CASSANDRA-7018) +2.1.0-beta2 + * Fix BTree.clear for large updates (CASSANDRA-6943) + * Fail write instead of logging a warning when unable to append to CL + (CASSANDRA-6764) + * Eliminate possibility of CL segment appearing twice in active list + (CASSANDRA-6557) + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + * Allow 
nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) + * Fix help message for stress counter_write (CASSANDRA-6824) + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848) + * Add logging levels (minimal, normal or verbose) to stress tool (CASSANDRA-6849) + * Fix race condition in Batch CLE (CASSANDRA-6860) + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774) + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781) + * Proper compare function for CollectionType (CASSANDRA-6783) + * Update native server to Netty 4 (CASSANDRA-6236) + * Fix off-by-one error in stress (CASSANDRA-6883) + * Make OpOrder AutoCloseable (CASSANDRA-6901) + * Remove sync repair JMX interface (CASSANDRA-6900) + * Add multiple memory allocation options for memtables (CASSANDRA-6689) + * Remove adjusted op rate from stress output (CASSANDRA-6921) + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941) + * Serialize batchlog mutations with the version of the target node + (CASSANDRA-6931) + * Optimize CounterColumn#reconcile() (CASSANDRA-6953) + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) + * Lock counter cells, not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912) + * Add failure handler to async callback (CASSANDRA-6747) + * Fix AE when closing SSTable 
without releasing reference (CASSANDRA-7000) + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924) + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024) * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) - - -2.0.7 +Merged from 2.0: * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Allow compaction of system tables during startup (CASSANDRA-6913) + * Avoid early loading of non-system keyspaces before compaction-leftovers +cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907) * (Hadoop) Allow manually specifying start/end tokens in CFIF
[3/6] git commit: Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487
Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/de720b4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/de720b4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/de720b4a Branch: refs/heads/trunk Commit: de720b4aa31198076abbd76a53644df341577126 Parents: 364282a Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:40:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:40:32 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 791586c..fffb2a5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.8 + * Log a warning for large batches (CASSANDRA-6487) * Queries on compact tables can return more rows that requested (CASSANDRA-7052) * USING TIMESTAMP for batches does not work (CASSANDRA-7053) Merged from 1.2: http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/conf/cassandra.yaml -- diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml index 2edd498..2de6753 100644 --- a/conf/cassandra.yaml +++ b/conf/cassandra.yaml @@ -429,6 +429,11 @@ tombstone_failure_threshold: 10 # that wastefully either. column_index_size_in_kb: 64 + +# Log WARN on any batch size exceeding this value. 5kb per batch by default. +# Caution should be taken on increasing the size of this threshold as it can lead to node instability. +batch_size_warn_threshold_in_kb: 5 + # Size limit for rows being compacted in memory. 
Larger rows will spill # over to disk and use a slower two-pass compaction process. A message # will be logged specifying the row key. http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 5317fb8..7a3185a 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -120,6 +120,7 @@ public class Config /* if the size of columns or super-columns are more than this, indexing will kick in */ public Integer column_index_size_in_kb = 64; +public Integer batch_size_warn_threshold_in_kb = 5; public Integer in_memory_compaction_limit_in_mb = 64; public Integer concurrent_compactors = FBUtilities.getAvailableProcessors(); public volatile Integer compaction_throughput_mb_per_sec = 16; http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index 9e06601..6417524 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -658,6 +658,11 @@ public class DatabaseDescriptor return conf.column_index_size_in_kb * 1024; } +public static int getBatchSizeWarnThreshold() +{ +return conf.batch_size_warn_threshold_in_kb * 1024; +} + public static Collection<String> getInitialTokens() { return tokensFromString(System.getProperty("cassandra.initial_token", conf.initial_token)); http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/cql3/QueryProcessor.java -- diff --git a/src/java/org/apache/cassandra/cql3/QueryProcessor.java b/src/java/org/apache/cassandra/cql3/QueryProcessor.java index ab0ea40..15ee59f 100644 ---
a/src/java/org/apache/cassandra/cql3/QueryProcessor.java +++ b/src/java/org/apache/cassandra/cql3/QueryProcessor.java @@ -30,13 +30,13 @@ import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.apache.cassandra.cql3.statements.*; -import org.apache.cassandra.transport.messages.ResultMessage; import
[6/6] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8e943b3d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8e943b3d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8e943b3d Branch: refs/heads/trunk Commit: 8e943b3dac644e6ca6a8848284e126164054634c Parents: 498eb2a 510c82e Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:41:38 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:41:38 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e943b3d/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8e943b3d/src/java/org/apache/cassandra/cql3/QueryProcessor.java --
[5/6] git commit: merge from 2.0
merge from 2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/510c82e4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/510c82e4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/510c82e4 Branch: refs/heads/cassandra-2.1 Commit: 510c82e4e73be07b44b4902b865a9eaeda5113e9 Parents: a2e7435 de720b4 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:41:27 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:41:27 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/510c82e4/CHANGES.txt -- diff --cc CHANGES.txt index 495dab2,fffb2a5..ab3278e --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,54 -1,14 +1,55 @@@ -2.0.8 - * Log a warning for large batches (CASSANDRA-6487) - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) -Merged from 1.2: - * Fix batchlog to account for CF truncation records (CASSANDRA-6999) - * Fix CQLSH parsing of functions and BLOB literals (CASSANDRA-7018) +2.1.0-beta2 + * Add range tombstones to read repair digests (CASSANDRA-6863) + * Fix BTree.clear for large updates (CASSANDRA-6943) + * Fail write instead of logging a warning when unable to append to CL + (CASSANDRA-6764) + * Eliminate possibility of CL segment appearing twice in active list + (CASSANDRA-6557) + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add 
broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) + * Fix help message for stress counter_write (CASSANDRA-6824) + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848) + * Add logging levels (minimal, normal or verbose) to stress tool (CASSANDRA-6849) + * Fix race condition in Batch CLE (CASSANDRA-6860) + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774) + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781) + * Proper compare function for CollectionType (CASSANDRA-6783) + * Update native server to Netty 4 (CASSANDRA-6236) + * Fix off-by-one error in stress (CASSANDRA-6883) + * Make OpOrder AutoCloseable (CASSANDRA-6901) + * Remove sync repair JMX interface (CASSANDRA-6900) + * Add multiple memory allocation options for memtables (CASSANDRA-6689) + * Remove adjusted op rate from stress output (CASSANDRA-6921) + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941) + * Serialize batchlog mutations with the version of the target node + (CASSANDRA-6931) + * Optimize CounterColumn#reconcile() (CASSANDRA-6953) + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) + * Lock counter cells, 
not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912) + * Add failure handler to async callback (CASSANDRA-6747) + * Fix AE when closing SSTable without releasing reference (CASSANDRA-7000) + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924) + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024) * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) - - -2.0.7 +Merged from 2.0: ++ * Log a warning for large batches (CASSANDRA-6487)
[2/6] git commit: Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487
Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/de720b4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/de720b4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/de720b4a Branch: refs/heads/cassandra-2.1 Commit: de720b4aa31198076abbd76a53644df341577126 Parents: 364282a Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:40:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:40:32 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 791586c..fffb2a5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.8 + * Log a warning for large batches (CASSANDRA-6487) * Queries on compact tables can return more rows that requested (CASSANDRA-7052) * USING TIMESTAMP for batches does not work (CASSANDRA-7053) Merged from 1.2: http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/conf/cassandra.yaml -- diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml index 2edd498..2de6753 100644 --- a/conf/cassandra.yaml +++ b/conf/cassandra.yaml @@ -429,6 +429,11 @@ tombstone_failure_threshold: 10 # that wastefully either. column_index_size_in_kb: 64 + +# Log WARN on any batch size exceeding this value. 5kb per batch by default. +# Caution should be taken on increasing the size of this threshold as it can lead to node instability. +batch_size_warn_threshold_in_kb: 5 + # Size limit for rows being compacted in memory. 
Larger rows will spill # over to disk and use a slower two-pass compaction process. A message # will be logged specifying the row key. http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 5317fb8..7a3185a 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -120,6 +120,7 @@ public class Config /* if the size of columns or super-columns are more than this, indexing will kick in */ public Integer column_index_size_in_kb = 64; +public Integer batch_size_warn_threshold_in_kb = 5; public Integer in_memory_compaction_limit_in_mb = 64; public Integer concurrent_compactors = FBUtilities.getAvailableProcessors(); public volatile Integer compaction_throughput_mb_per_sec = 16; http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index 9e06601..6417524 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -658,6 +658,11 @@ public class DatabaseDescriptor return conf.column_index_size_in_kb * 1024; } +public static int getBatchSizeWarnThreshold() +{ +return conf.batch_size_warn_threshold_in_kb * 1024; +} + public static Collection<String> getInitialTokens() { return tokensFromString(System.getProperty("cassandra.initial_token", conf.initial_token)); http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/cql3/QueryProcessor.java -- diff --git a/src/java/org/apache/cassandra/cql3/QueryProcessor.java b/src/java/org/apache/cassandra/cql3/QueryProcessor.java index ab0ea40..15ee59f 100644 ---
a/src/java/org/apache/cassandra/cql3/QueryProcessor.java +++ b/src/java/org/apache/cassandra/cql3/QueryProcessor.java @@ -30,13 +30,13 @@ import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.apache.cassandra.cql3.statements.*; -import org.apache.cassandra.transport.messages.ResultMessage; import
[4/6] git commit: merge from 2.0
merge from 2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/510c82e4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/510c82e4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/510c82e4 Branch: refs/heads/trunk Commit: 510c82e4e73be07b44b4902b865a9eaeda5113e9 Parents: a2e7435 de720b4 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:41:27 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:41:27 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/510c82e4/CHANGES.txt -- diff --cc CHANGES.txt index 495dab2,fffb2a5..ab3278e --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,54 -1,14 +1,55 @@@ -2.0.8 - * Log a warning for large batches (CASSANDRA-6487) - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) -Merged from 1.2: - * Fix batchlog to account for CF truncation records (CASSANDRA-6999) - * Fix CQLSH parsing of functions and BLOB literals (CASSANDRA-7018) +2.1.0-beta2 + * Add range tombstones to read repair digests (CASSANDRA-6863) + * Fix BTree.clear for large updates (CASSANDRA-6943) + * Fail write instead of logging a warning when unable to append to CL + (CASSANDRA-6764) + * Eliminate possibility of CL segment appearing twice in active list + (CASSANDRA-6557) + * Apply DONTNEED fadvise to commitlog segments (CASSANDRA-6759) + * Switch CRC component to Adler and include it for compressed sstables + (CASSANDRA-4165) + * Allow cassandra-stress to set compaction strategy options (CASSANDRA-6451) + * Add 
broadcast_rpc_address option to cassandra.yaml (CASSANDRA-5899) + * Auto reload GossipingPropertyFileSnitch config (CASSANDRA-5897) + * Fix overflow of memtable_total_space_in_mb (CASSANDRA-6573) + * Fix ABTC NPE and apply update function correctly (CASSANDRA-6692) + * Allow nodetool to use a file or prompt for password (CASSANDRA-6660) + * Fix AIOOBE when concurrently accessing ABSC (CASSANDRA-6742) + * Fix assertion error in ALTER TYPE RENAME (CASSANDRA-6705) + * Scrub should not always clear out repaired status (CASSANDRA-5351) + * Improve handling of range tombstone for wide partitions (CASSANDRA-6446) + * Fix ClassCastException for compact table with composites (CASSANDRA-6738) + * Fix potentially repairing with wrong nodes (CASSANDRA-6808) + * Change caching option syntax (CASSANDRA-6745) + * Fix stress to do proper counter reads (CASSANDRA-6835) + * Fix help message for stress counter_write (CASSANDRA-6824) + * Fix stress smart Thrift client to pick servers correctly (CASSANDRA-6848) + * Add logging levels (minimal, normal or verbose) to stress tool (CASSANDRA-6849) + * Fix race condition in Batch CLE (CASSANDRA-6860) + * Improve cleanup/scrub/upgradesstables failure handling (CASSANDRA-6774) + * ByteBuffer write() methods for serializing sstables (CASSANDRA-6781) + * Proper compare function for CollectionType (CASSANDRA-6783) + * Update native server to Netty 4 (CASSANDRA-6236) + * Fix off-by-one error in stress (CASSANDRA-6883) + * Make OpOrder AutoCloseable (CASSANDRA-6901) + * Remove sync repair JMX interface (CASSANDRA-6900) + * Add multiple memory allocation options for memtables (CASSANDRA-6689) + * Remove adjusted op rate from stress output (CASSANDRA-6921) + * Add optimized CF.hasColumns() implementations (CASSANDRA-6941) + * Serialize batchlog mutations with the version of the target node + (CASSANDRA-6931) + * Optimize CounterColumn#reconcile() (CASSANDRA-6953) + * Properly remove 1.2 sstable support in 2.1 (CASSANDRA-6869) + * Lock counter cells, 
not partitions (CASSANDRA-6880) + * Track presence of legacy counter shards in sstables (CASSANDRA-6888) + * Ensure safe resource cleanup when replacing sstables (CASSANDRA-6912) + * Add failure handler to async callback (CASSANDRA-6747) + * Fix AE when closing SSTable without releasing reference (CASSANDRA-7000) + * Clean up IndexInfo on keyspace/table drops (CASSANDRA-6924) + * Only snapshot relative SSTables when sequential repair (CASSANDRA-7024) * Require nodetool rebuild_index to specify index names (CASSANDRA-7038) - - -2.0.7 +Merged from 2.0: ++ * Log a warning for large batches (CASSANDRA-6487) * Put
[1/6] git commit: Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 364282a7d - de720b4aa refs/heads/cassandra-2.1 a2e74354c - 510c82e4e refs/heads/trunk 498eb2ab7 - 8e943b3da Log a warning for large batches patch by Lyuben Todorov; reviewed by Benedict Elliott Smith for CASSANDRA-6487 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/de720b4a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/de720b4a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/de720b4a Branch: refs/heads/cassandra-2.0 Commit: de720b4aa31198076abbd76a53644df341577126 Parents: 364282a Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 07:40:32 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 07:40:32 2014 -0500 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 +++ .../org/apache/cassandra/config/Config.java | 1 + .../cassandra/config/DatabaseDescriptor.java| 5 +++ .../apache/cassandra/cql3/QueryProcessor.java | 2 +- .../cql3/statements/BatchStatement.java | 42 +++- 6 files changed, 54 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 791586c..fffb2a5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.8 + * Log a warning for large batches (CASSANDRA-6487) * Queries on compact tables can return more rows that requested (CASSANDRA-7052) * USING TIMESTAMP for batches does not work (CASSANDRA-7053) Merged from 1.2: http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/conf/cassandra.yaml -- diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml index 2edd498..2de6753 100644 --- a/conf/cassandra.yaml +++ b/conf/cassandra.yaml @@ -429,6 +429,11 @@ tombstone_failure_threshold: 10 # that wastefully either. column_index_size_in_kb: 64 + +# Log WARN on any batch size exceeding this value. 5kb per batch by default. 
+# Caution should be taken on increasing the size of this threshold as it can lead to node instability. +batch_size_warn_threshold_in_kb: 5 + # Size limit for rows being compacted in memory. Larger rows will spill # over to disk and use a slower two-pass compaction process. A message # will be logged specifying the row key. http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 5317fb8..7a3185a 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -120,6 +120,7 @@ public class Config /* if the size of columns or super-columns are more than this, indexing will kick in */ public Integer column_index_size_in_kb = 64; +public Integer batch_size_warn_threshold_in_kb = 5; public Integer in_memory_compaction_limit_in_mb = 64; public Integer concurrent_compactors = FBUtilities.getAvailableProcessors(); public volatile Integer compaction_throughput_mb_per_sec = 16; http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index 9e06601..6417524 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -658,6 +658,11 @@ public class DatabaseDescriptor return conf.column_index_size_in_kb * 1024; } +public static int getBatchSizeWarnThreshold() +{ +return conf.batch_size_warn_threshold_in_kb * 1024; +} + public static Collection<String> getInitialTokens() { return tokensFromString(System.getProperty("cassandra.initial_token", conf.initial_token));
http://git-wip-us.apache.org/repos/asf/cassandra/blob/de720b4a/src/java/org/apache/cassandra/cql3/QueryProcessor.java -- diff --git a/src/java/org/apache/cassandra/cql3/QueryProcessor.java b/src/java/org/apache/cassandra/cql3/QueryProcessor.java index ab0ea40..15ee59f 100644 --- a/src/java/org/apache/cassandra/cql3/QueryProcessor.java +++ b/src/java/org/apache/cassandra/cql3/QueryProcessor.java @@ -30,13 +30,13 @@ import
[jira] [Commented] (CASSANDRA-6487) Log WARN on large batch sizes
[ https://issues.apache.org/jira/browse/CASSANDRA-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976704#comment-13976704 ] Jonathan Ellis commented on CASSANDRA-6487: --- committed Log WARN on large batch sizes - Key: CASSANDRA-6487 URL: https://issues.apache.org/jira/browse/CASSANDRA-6487 Project: Cassandra Issue Type: Improvement Reporter: Patrick McFadin Assignee: Lyuben Todorov Priority: Minor Fix For: 2.0.8 Attachments: 6487-cassandra-2.0.patch, 6487-cassandra-2.0_v2.patch Large batches on a coordinator can cause a lot of node stress. I propose adding a WARN log entry if batch sizes go beyond a configurable size. This will give more visibility to operators on something that can happen on the developer side. New yaml setting with 5k default. {{# Log WARN on any batch size exceeding this value. 5k by default.}} {{# Caution should be taken on increasing the size of this threshold as it can lead to node instability.}} {{batch_size_warn_threshold: 5k}} -- This message was sent by Atlassian JIRA (v6.2#6252)
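For readers following the thread, the check described in this ticket can be sketched roughly as below. This is only an illustration, not the code from the committed patch: the class name BatchSizeWarnSketch, its methods, and the idea of approximating batch size as the sum of payload lengths are all assumptions made for the example. The only details taken from the commit mails are the setting name batch_size_warn_threshold_in_kb, its default of 5, and the kb * 1024 conversion shown in the DatabaseDescriptor diff.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; NOT the BatchStatement code from the CASSANDRA-6487 patch.
final class BatchSizeWarnSketch
{
    // Mirrors the default quoted in the commit: batch_size_warn_threshold_in_kb: 5
    static final int DEFAULT_WARN_THRESHOLD_IN_KB = 5;

    // Same arithmetic as getBatchSizeWarnThreshold() in the diff: kb * 1024.
    static long thresholdInBytes(int thresholdInKb)
    {
        return thresholdInKb * 1024L;
    }

    // Assumption: the batch size is approximated as the sum of payload lengths.
    static long batchSizeInBytes(List<byte[]> payloads)
    {
        long total = 0;
        for (byte[] payload : payloads)
            total += payload.length;
        return total;
    }

    // Warn only when the batch strictly exceeds the threshold.
    static boolean shouldWarn(List<byte[]> payloads, int thresholdInKb)
    {
        return batchSizeInBytes(payloads) > thresholdInBytes(thresholdInKb);
    }

    public static void main(String[] args)
    {
        List<byte[]> small = Arrays.asList(new byte[1024]);
        List<byte[]> large = Arrays.asList(new byte[4096], new byte[2048]);
        System.out.println("small batch warns: " + shouldWarn(small, DEFAULT_WARN_THRESHOLD_IN_KB));
        System.out.println("large batch warns: " + shouldWarn(large, DEFAULT_WARN_THRESHOLD_IN_KB));
    }
}
```

Note that the ticket text proposes batch_size_warn_threshold: 5k, while the commit mails show the setting as batch_size_warn_threshold_in_kb: 5.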
[jira] [Commented] (CASSANDRA-4450) CQL3: Allow preparing the consistency level, timestamp and ttl
[ https://issues.apache.org/jira/browse/CASSANDRA-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976705#comment-13976705 ] Sylvain Lebresne commented on CASSANDRA-4450: - bq. Why [timestamp] is using LongType.instance instead of TimestampType.instance? Because TimestampType is a time in milliseconds, while [timestamp] should be in microseconds. Besides, it's actually possible (though ill-advised unless maybe for very specific use cases) to use a [timestamp] that is not even a date, so it's not like TimestampType is the absolute true type for that field. CQL3: Allow preparing the consistency level, timestamp and ttl -- Key: CASSANDRA-4450 URL: https://issues.apache.org/jira/browse/CASSANDRA-4450 Project: Cassandra Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: cql3 Fix For: 2.0 beta 1 It could be useful to allow the preparation of the consistency level, the timestamp and the ttl. I.e. to allow: {noformat} UPDATE foo SET .. USING CONSISTENCY ? AND TIMESTAMP ? AND TTL ? {noformat} A slight concern is that when preparing a statement we return the names of the prepared variables, but none of timestamp, ttl and consistency are reserved names currently, so returning those as names could conflict with a column name. We can either: * make these reserved identifiers (I have to add that I'm not a fan because at least for timestamp, I think that's a potentially useful and common column name). * use some specific special character to indicate those are not column names, like returning [timestamp], [ttl], [consistency]. -- This message was sent by Atlassian JIRA (v6.2#6252)
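The unit mismatch mentioned in the comment above (milliseconds for TimestampType, microseconds for [timestamp]) can be illustrated with a small conversion sketch. The class and method names here are hypothetical and exist only for this example; the conversion itself uses the standard java.util.concurrent.TimeUnit API.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Cassandra or any driver.
final class WriteTimestampSketch
{
    // A millisecond wall-clock value must be scaled by 1000 before it is
    // bound to a microsecond [timestamp] variable (USING TIMESTAMP ?).
    static long millisToMicros(long epochMillis)
    {
        return TimeUnit.MILLISECONDS.toMicros(epochMillis);
    }

    // The reverse direction truncates any sub-millisecond part.
    static long microsToMillis(long epochMicros)
    {
        return TimeUnit.MICROSECONDS.toMillis(epochMicros);
    }

    public static void main(String[] args)
    {
        long nowMillis = System.currentTimeMillis();
        System.out.println("millis: " + nowMillis);
        System.out.println("micros: " + millisToMicros(nowMillis));
    }
}
```

Binding a millisecond value directly as a [timestamp] would produce a write timestamp a factor of 1000 smaller than writes stamped in microseconds, which is one practical reason to keep the two types distinct.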
[jira] [Commented] (CASSANDRA-7031) Increase default commit log total space + segment size
[ https://issues.apache.org/jira/browse/CASSANDRA-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976706#comment-13976706 ] Benedict commented on CASSANDRA-7031: - bq. Since one point of restore is, I don't have the CL directory anymore this is kind of a non-solution. But this is always a risk for any users - to eliminate that risk is impossible, so they have to be able to cope with some degree of loss in this event, and that's hardly unreasonable since we're talking about whole cluster failure of commit and data disks for this to be a real pain point. We provide guarantees against this problem through RF and more nodes, generally. Restore is more useful for people who want to roll back their own mistakes, or bring a cluster back from death, and if you've had that level of issue, losing out on the past 100Mb of writes is no worse than missing out on the past 25Mb. bq. So now we're forcing users to add a cron job for PITR to work? I don't like that idea either. We're saying: if you need PITR for something within the most recent CL then you need to run this. It's probably a very tiny number of users - I'm not sure it's any at all. Most users want to be able to PITR to some safe point, not to some point within the past 2 seconds or so. If they wanted that they'd be screwed by delayed mutations anyway, so it's kind of moot. 
Increase default commit log total space + segment size -- Key: CASSANDRA-7031 URL: https://issues.apache.org/jira/browse/CASSANDRA-7031 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 2.1 beta2 Attachments: 7031.txt I would like to increase the default commit log total space and segment size options for 64-bit JVMs: The current default of 1Gb and 32Mb is quite constrained and can have some (very minor) negative performance implications, for no major benefit: # 32Mb files are actually quite small, and if during the 10s interval we have completely filled multiple of them (quite easy) it would be more efficient to write fewer larger files, as we can issue fewer fsyncs and permit the OS to schedule the writes more efficiently. On my box this has a small but noticeable impact. Although I would expect on decent server hardware this would be smaller still, since we immediately drop the pages from cache on writing there isn't a great deal of advantage to keeping the files so small. The only advantage I can see is that during a drop KS/CF or other event that forces log rollover we're wasting less space until log recycling. 128-256Mb are modest increases that seem more appropriate to me. # 1Gb is too small for the default total log space. We can find that we force memtable flushes as a result of log utilisation instead of memtable occupancy quite often (esp. as a result of increased effective memtable space from recent improvements), especially on machines with more addressable memory. I suggest 8Gb as a minimum. The only disadvantage of having more log data is that replay on restart may be slightly slower, but since most of the events will be ignored it should be relatively benign, and I would rather take the penalty on startup instead of during running, no matter how small the running penalty. -- This message was sent by Atlassian JIRA (v6.2#6252)
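To make the sizes in the description concrete, the sketch below computes how many segments fit in the total commit log space for the current and the proposed defaults. It assumes binary units (1Mb = 1024 * 1024 bytes) and uses hypothetical names; it does not read any real configuration.

```java
// Arithmetic sketch only; names are hypothetical and units are assumed to be binary.
final class CommitLogSizingSketch
{
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Number of whole segments that fit in the total commit log space. */
    static long segmentCount(long totalSpaceBytes, long segmentSizeBytes)
    {
        if (segmentSizeBytes <= 0)
            throw new IllegalArgumentException("segment size must be positive");
        return totalSpaceBytes / segmentSizeBytes;
    }

    public static void main(String[] args)
    {
        System.out.println("1Gb total, 32Mb segments:  " + segmentCount(1 * GB, 32 * MB));
        System.out.println("8Gb total, 128Mb segments: " + segmentCount(8 * GB, 128 * MB));
        System.out.println("8Gb total, 256Mb segments: " + segmentCount(8 * GB, 256 * MB));
    }
}
```

Under these assumptions the current defaults give 32 segments, and 8Gb with 256Mb segments also gives 32, so the larger proposal keeps the segment count unchanged while each file (and each fsync batch) grows eightfold.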
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976712#comment-13976712 ] Jonathan Ellis commented on CASSANDRA-6696: --- bq. doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure or a mini auto-repair on corrupt sstables perhaps? CASSANDRA-4784 lists some other benefits, the strongest of which I think are # on disk failure, we can invalidate the affected vnodes and repair them, rather than continuing to serve incomplete data or halting the entire node [similar to what you are saying here] # we can deduplicate ranges for bulk load into another cluster (CASSANDRA-4756) Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976712#comment-13976712 ] Jonathan Ellis edited comment on CASSANDRA-6696 at 4/22/14 12:53 PM: - bq. doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure or a mini auto-repair on corrupt sstables perhaps? CASSANDRA-4784 lists some other benefits, the strongest of which I think are # on disk failure, we can invalidate the affected vnodes and repair them, rather than continuing to serve incomplete data or halting the entire node [similar to what you are saying here] # we can deduplicate ranges for bulk load into another cluster (CASSANDRA-4756) /cc [~kohlisankalp] was (Author: jbellis): bq. doing per-vnode sstables could enable some nice benefits, like turning off the exact vnodes that are affected by a disk failure or a mini auto-repair on corrupt sstables perhaps? CASSANDRA-4784 lists some other benefits, the strongest of which I think are # on disk failure, we can invalidate the affected vnodes and repair them, rather than continuing to serve incomplete data or halting the entire node [similar to what you are saying here] # we can deduplicate ranges for bulk load into another cluster (CASSANDRA-4756) Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976714#comment-13976714 ] Jonathan Ellis commented on CASSANDRA-6696: --- bq. either we need to ensure that all vnodes are of roughly equal size (very difficult), or we probably need to have a dynamic allocation strategy Why is the first option very difficult? BOP aside (and the consensus was, we can continue supporting that because its users are willing to live with its limitations), assuming that every vnode is of roughly equal size is a core part of consistent hashing. M distinct chunks gives you the worst of both worlds. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976718#comment-13976718 ] Jonathan Ellis commented on CASSANDRA-6696: --- bq. With good OS tuning, I'm not scared of too many sstables We can add subdirectory-per-vnode if necessary, but aren't modern FS capable of dealing with hundreds of thousands of files per directory? Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976724#comment-13976724 ] Benedict commented on CASSANDRA-6696: - bq. assuming that every vnode is of roughly equal size is a core part of consistent hashing. Well the assumption is broken then. I can assure you vnodes are not of equal size, especially not with our current allocation strategy, and getting them to be of equal size is kind of tough. We may be able to improve that, though. I'm not sure how what I'm suggesting can't also provide most of these other benefits, however we can bring the two approaches closer by simply saying all vnodes starting within the first 1/DISK portion of the token range are allocated to the first disk, and so on - and then they're pretty similar. But the unequal size of vnodes means any compaction tuning will have limited impact, and probably induce more random IO Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
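Benedict's suggestion of assigning every vnode that starts within the first 1/DISK portion of the token range to the first disk (and so on) can be sketched as follows. The sketch assumes tokens are signed 64-bit values covering the whole long range, as with Murmur3Partitioner; the class and method names are hypothetical and nothing here is taken from an actual patch.

```java
// Hypothetical sketch: maps a vnode's starting token to a disk index by slicing
// the token space into diskCount equal portions. Assumes signed 64-bit tokens.
final class TokenRangeDiskSketch
{
    static int diskFor(long startToken, int diskCount)
    {
        if (diskCount <= 0)
            throw new IllegalArgumentException("diskCount must be positive");
        if (diskCount == 1)
            return 0;
        // Shift the signed token so that Long.MIN_VALUE becomes unsigned 0.
        long offset = startToken - Long.MIN_VALUE;
        // Width of one disk's slice of the 2^64 token space, rounded up.
        long sliceWidth = Long.divideUnsigned(-1L, diskCount) + 1;
        return (int) Long.divideUnsigned(offset, sliceWidth);
    }

    public static void main(String[] args)
    {
        long[] tokens = { Long.MIN_VALUE, -1L, 0L, Long.MAX_VALUE };
        for (long token : tokens)
            System.out.println("token " + token + " -> disk " + diskFor(token, 4));
    }
}
```

Because the mapping depends only on the starting token, it stays stable across restarts, but as the comment points out it does nothing to equalise the amount of data per disk when vnodes differ in size.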
[jira] [Commented] (CASSANDRA-7043) CommitLogArchiver thread pool name inconsistent with others
[ https://issues.apache.org/jira/browse/CASSANDRA-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976726#comment-13976726 ] Nick Bailey commented on CASSANDRA-7043: Yeah but it's an easy update and 2.1 is a major release anyway. So it's fine with me. CommitLogArchiver thread pool name inconsistent with others --- Key: CASSANDRA-7043 URL: https://issues.apache.org/jira/browse/CASSANDRA-7043 Project: Cassandra Issue Type: Bug Components: Core Reporter: Chris Lohfink Priority: Trivial Attachments: namechange.diff Pretty trivial... The names of all ThreadPoolExecutors are in CamelCase except the CommitLogArchiver, which is named commitlog_archiver. This shows up a little more obviously in tpstats output:
{code}
nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked
ReadStage                         0         0         113702         0
RequestResponseStage              0         0              0         0
...
PendingRangeCalculator            0         0              1         0
commitlog_archiver                0         0              0         0
InternalResponseStage             0         0              0         0
HintedHandoff                     0         0              0         0
{code}
Seems minor enough to update this to be CommitLogArchiver but it may mean changes in any monitoring applications (although I don't think this particular pool has had much runtime or monitoring needs). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
Benedict created CASSANDRA-7066: --- Summary: Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Priority: Minor Fix For: 3.0 Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so they can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.2#6252)
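The startup rule proposed above (delete any sstable that occurs in the union of all ancestor sets) can be sketched in a few lines. The sketch assumes sstables are identified by an integer generation and that each one's direct ancestors have already been read from its metadata; the class, method and parameter names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed cleanup rule; not Cassandra's actual code.
final class CompactionLeftoverSketch
{
    /**
     * @param ancestorsByGeneration maps each sstable generation found on disk to the
     *                              generations of its direct ancestors
     * @return generations on disk that appear in some sstable's ancestor set
     */
    static Set<Integer> leftovers(Map<Integer, Set<Integer>> ancestorsByGeneration)
    {
        Set<Integer> allAncestors = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values())
            allAncestors.addAll(ancestors);
        // Only sstables that still exist on disk are candidates for deletion.
        allAncestors.retainAll(ancestorsByGeneration.keySet());
        return allAncestors;
    }

    public static void main(String[] args)
    {
        // sstables 1 and 2 were compacted into 3, but the node stopped before deleting them.
        Map<Integer, Set<Integer>> onDisk = new HashMap<>();
        onDisk.put(1, new HashSet<Integer>());
        onDisk.put(2, new HashSet<Integer>());
        onDisk.put(3, new HashSet<>(Arrays.asList(1, 2)));
        System.out.println("leftovers to delete: " + leftovers(onDisk));
    }
}
```

The rule needs no separate bookkeeping table: once the replacement sstable and its metadata are fully written, the information required to find its superseded inputs is already on disk.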
[jira] [Updated] (CASSANDRA-6106) Provide timestamp with true microsecond resolution
[ https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6106: -- Summary: Provide timestamp with true microsecond resolution (was: QueryState.getTimestamp() FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000) Provide timestamp with true microsecond resolution -- Key: CASSANDRA-6106 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106 Project: Cassandra Issue Type: Improvement Components: Core Environment: DSE Cassandra 3.1, but also HEAD Reporter: Christopher Smith Assignee: Benedict Priority: Minor Labels: timestamps Fix For: 2.1 beta2 Attachments: microtimstamp.patch, microtimstamp_random.patch, microtimstamp_random_rev2.patch I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra mentioned issues with millisecond rounding in timestamps and was able to reproduce the issue. If I specify a timestamp in a mutating query, I get microsecond precision, but if I don't, I get timestamps rounded to the nearest millisecond, at least for my first query on a given connection, which substantially increases the possibilities of collision. I believe I found the offending code, though I am by no means sure this is comprehensive. I think we probably need a fairly comprehensive replacement of all uses of System.currentTimeMillis() with System.nanoTime(). -- This message was sent by Atlassian JIRA (v6.2#6252)
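One caveat with the replacement suggested in the description is that System.nanoTime() has an arbitrary origin, so it cannot stand in for wall-clock time on its own. A common workaround, sketched here under that assumption, is to anchor it once against System.currentTimeMillis() and use it only for the sub-millisecond offset. This is not the approach of the attached patches, and all names are hypothetical.

```java
// Hypothetical sketch: microsecond-resolution timestamps from a wall-clock anchor
// plus a monotonic nanoTime offset. The anchor is never re-synchronised, so the
// result drifts from the wall clock if the system clock is adjusted later.
final class MicrosClockSketch
{
    // Wall-clock anchor taken once, in microseconds since the epoch.
    private static final long ANCHOR_MICROS = System.currentTimeMillis() * 1000L;
    // Monotonic reading taken at (almost) the same moment as the anchor.
    private static final long ANCHOR_NANOS = System.nanoTime();
    // Last value handed out, used to keep results strictly increasing.
    private static long last = Long.MIN_VALUE;

    static synchronized long nextMicros()
    {
        long candidate = ANCHOR_MICROS + (System.nanoTime() - ANCHOR_NANOS) / 1000L;
        // Never return the same value twice, even for calls within one microsecond.
        last = Math.max(candidate, last + 1);
        return last;
    }

    public static void main(String[] args)
    {
        System.out.println(nextMicros());
        System.out.println(nextMicros());
    }
}
```

Forcing strictly increasing values avoids collisions between timestamps generated by the same process, which is the failure mode the description is concerned with; it does not help with collisions between different coordinators.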
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976744#comment-13976744 ] Tupshin Harper commented on CASSANDRA-6696: --- bq. We can add subdirectory-per-vnode if necessary, but aren't modern FS capable of dealing with hundreds of thousands of files per directory? Exactly my thinking. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
[ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976751#comment-13976751 ] Tupshin Harper commented on CASSANDRA-6696: --- bq. Well the assumption is broken then. Yes, very true, and I've been thinking for a while now that, while we don't need a strategy to keep all vnodes the exact same size, we would benefit from a background process that gradually splits and combines the largest and smallest outliers to have vnodes *tend* to converge on the same size. Drive replacement in JBOD can cause data to reappear. -- Key: CASSANDRA-6696 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696 Project: Cassandra Issue Type: Improvement Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0
-- This message was sent by Atlassian JIRA (v6.2#6252)
[2/3] git commit: mark dropped CFs clean in commitlog patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6959
mark dropped CFs clean in commitlog patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6959 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75c18519 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75c18519 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75c18519 Branch: refs/heads/trunk Commit: 75c185199d403dbd6d1220a4f4dcd1553c98c15f Parents: 510c82e Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 08:33:12 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 08:33:12 2014 -0500 -- src/java/org/apache/cassandra/db/DefsTables.java | 2 ++ src/java/org/apache/cassandra/db/commitlog/CommitLog.java | 7 +++ 2 files changed, 9 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/75c18519/src/java/org/apache/cassandra/db/DefsTables.java -- diff --git a/src/java/org/apache/cassandra/db/DefsTables.java b/src/java/org/apache/cassandra/db/DefsTables.java index 6f1cc69..351ee4b 100644 --- a/src/java/org/apache/cassandra/db/DefsTables.java +++ b/src/java/org/apache/cassandra/db/DefsTables.java @@ -464,6 +464,7 @@ public class DefsTables cfs.snapshot(snapshotName); Keyspace.open(ksm.name).dropCf(cfm.cfId); } +CommitLog.instance.discardColumnFamily(cfm.cfId); } // remove the keyspace from the static instances. 
@@ -494,6 +495,7 @@ public class DefsTables CompactionManager.instance.interruptCompactionFor(Arrays.asList(cfm), true); +CommitLog.instance.discardColumnFamily(cfm.cfId); CommitLog.instance.forceRecycleAllSegments(); if (!StorageService.instance.isClientMode()) http://git-wip-us.apache.org/repos/asf/cassandra/blob/75c18519/src/java/org/apache/cassandra/db/commitlog/CommitLog.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java index 48ddb5a..a230e35 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java @@ -237,6 +237,13 @@ public class CommitLog implements CommitLogMBean return alloc; } +public void discardColumnFamily(final UUID cfId) +{ +ReplayPosition context = getContext(); +for (CommitLogSegment cls : allocator.getActiveSegments()) +cls.markClean(cfId, context); +} + /** * Modifies the per-CF dirty cursors of any commit log segments for the column family according to the position * given. Discards any commit log segments that are no longer used.
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b1ea0aa9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b1ea0aa9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b1ea0aa9 Branch: refs/heads/trunk Commit: b1ea0aa9e9eb32d32bcfb2b17fd12b9df1fd2aee Parents: 8e943b3 75c1851 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 08:33:19 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 08:33:19 2014 -0500 -- src/java/org/apache/cassandra/db/DefsTables.java | 2 ++ src/java/org/apache/cassandra/db/commitlog/CommitLog.java | 7 +++ 2 files changed, 9 insertions(+) --
[1/3] git commit: mark dropped CFs clean in commitlog patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6959
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 510c82e4e - 75c185199 refs/heads/trunk 8e943b3da - b1ea0aa9e mark dropped CFs clean in commitlog patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6959 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75c18519 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75c18519 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75c18519 Branch: refs/heads/cassandra-2.1 Commit: 75c185199d403dbd6d1220a4f4dcd1553c98c15f Parents: 510c82e Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 08:33:12 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 08:33:12 2014 -0500 -- src/java/org/apache/cassandra/db/DefsTables.java | 2 ++ src/java/org/apache/cassandra/db/commitlog/CommitLog.java | 7 +++ 2 files changed, 9 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/75c18519/src/java/org/apache/cassandra/db/DefsTables.java -- diff --git a/src/java/org/apache/cassandra/db/DefsTables.java b/src/java/org/apache/cassandra/db/DefsTables.java index 6f1cc69..351ee4b 100644 --- a/src/java/org/apache/cassandra/db/DefsTables.java +++ b/src/java/org/apache/cassandra/db/DefsTables.java @@ -464,6 +464,7 @@ public class DefsTables cfs.snapshot(snapshotName); Keyspace.open(ksm.name).dropCf(cfm.cfId); } +CommitLog.instance.discardColumnFamily(cfm.cfId); } // remove the keyspace from the static instances. 
@@ -494,6 +495,7 @@ public class DefsTables CompactionManager.instance.interruptCompactionFor(Arrays.asList(cfm), true); +CommitLog.instance.discardColumnFamily(cfm.cfId); CommitLog.instance.forceRecycleAllSegments(); if (!StorageService.instance.isClientMode()) http://git-wip-us.apache.org/repos/asf/cassandra/blob/75c18519/src/java/org/apache/cassandra/db/commitlog/CommitLog.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java index 48ddb5a..a230e35 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java @@ -237,6 +237,13 @@ public class CommitLog implements CommitLogMBean return alloc; } +public void discardColumnFamily(final UUID cfId) +{ +ReplayPosition context = getContext(); +for (CommitLogSegment cls : allocator.getActiveSegments()) +cls.markClean(cfId, context); +} + /** * Modifies the per-CF dirty cursors of any commit log segments for the column family according to the position * given. Discards any commit log segments that are no longer used.
[jira] [Updated] (CASSANDRA-7055) Broken CQL Version number in 2.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-7055: Summary: Broken CQL Version number in 2.0.7 (was: Boken CQL Version number in 2.0.7) Broken CQL Version number in 2.0.7 -- Key: CASSANDRA-7055 URL: https://issues.apache.org/jira/browse/CASSANDRA-7055 Project: Cassandra Issue Type: Bug Reporter: Michaël Figuière Assignee: Sylvain Lebresne Priority: Trivial Fix For: 2.0.8 Cassandra 2.0.7 has introduced 2 changes in the CQL language: *Add uuid() function (CASSANDRA-6473) *Add support for DELETE ... IF EXISTS to CQL3 (CASSANDRA-5708) Unfortunately the {{cql_version}} hasn't been incremented as reported in the {{system.local}} table. In 2.0.6: {code} cqlsh select cql_version from system.local; cql_version - 3.1.5 {code} In 2.0.7: {code} cqlsh select cql_version from system.local; cql_version - 3.1.5 {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7046) Update nodetool commands to output the date and time they were run on
[ https://issues.apache.org/jira/browse/CASSANDRA-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976771#comment-13976771 ] Johnny Miller commented on CASSANDRA-7046: -- Clement - yes, it is possible to do this but I don't agree that this information would not be valuable to have in the output of nodetool commands, particularly if we are trying to make them more user friendly. My experience of trying to help resolve issues is that this information is rarely to hand, which makes trawling through logs to find out what was up somewhat challenging. Update nodetool commands to output the date and time they were run on - Key: CASSANDRA-7046 URL: https://issues.apache.org/jira/browse/CASSANDRA-7046 Project: Cassandra Issue Type: Improvement Reporter: Johnny Miller Priority: Trivial Labels: lhf It would help if the various nodetool commands also output the system date and time at which they were run. Often these commands are executed and then we look at the cassandra log files to try and find out what was happening at that time. This is certainly just a convenience feature, but it would be nice to have the information in there to aid with diagnostics. -- This message was sent by Atlassian JIRA (v6.2#6252)
[2/3] git commit: fix CLTest post-#6764 patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6764
fix CLTest post-#6764 patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6764

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/44f4e790
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/44f4e790
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/44f4e790
Branch: refs/heads/trunk
Commit: 44f4e790196ff6425255cd12cfd100ddf9415524
Parents: 75c1851
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Apr 22 08:47:11 2014 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Apr 22 08:47:23 2014 -0500

--
 .../org/apache/cassandra/db/CommitLogTest.java | 38 ++--
 1 file changed, 36 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/44f4e790/test/unit/org/apache/cassandra/db/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/CommitLogTest.java b/test/unit/org/apache/cassandra/db/CommitLogTest.java
index 577692d..ddab9ea 100644
--- a/test/unit/org/apache/cassandra/db/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/CommitLogTest.java
@@ -30,9 +30,11 @@ import org.junit.Test;
 
 import org.apache.cassandra.SchemaLoader;
 import org.apache.cassandra.Util;
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.commitlog.CommitLog;
 import org.apache.cassandra.db.commitlog.CommitLogDescriptor;
+import org.apache.cassandra.db.composites.CellName;
 import org.apache.cassandra.net.MessagingService;
 
 import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
@@ -166,17 +168,49 @@ public class CommitLogTest extends SchemaLoader
         assert CommitLog.instance.activeSegments() == 1 : "Expecting 1 segment, got " + CommitLog.instance.activeSegments();
     }
 
+    private static int getMaxRecordDataSize(String keyspace, ByteBuffer key, String table, CellName column)
+    {
+        Mutation rm = new Mutation("Keyspace1", bytes("k"));
+        rm.add("Standard1", Util.cellname("c1"), ByteBuffer.allocate(0), 0);
+
+        int max = (DatabaseDescriptor.getCommitLogSegmentSize() / 2);
+        max -= (4 + 8 + 8); // log entry overhead
+        return max - (int) Mutation.serializer.serializedSize(rm, MessagingService.current_version);
+    }
+
+    private static int getMaxRecordDataSize()
+    {
+        return getMaxRecordDataSize("Keyspace1", bytes("k"), "Standard1", Util.cellname("c1"));
+    }
+
     // CASSANDRA-3615
     @Test
-    public void testExceedSegmentSizeWithOverhead() throws Exception
+    public void testEqualRecordLimit() throws Exception
     {
         CommitLog.instance.resetUnsafe();
 
         Mutation rm = new Mutation("Keyspace1", bytes("k"));
-        rm.add("Standard1", Util.cellname("c1"), ByteBuffer.allocate((DatabaseDescriptor.getCommitLogSegmentSize()) - 83), 0);
+        rm.add("Standard1", Util.cellname("c1"), ByteBuffer.allocate(getMaxRecordDataSize()), 0);
         CommitLog.instance.add(rm);
     }
 
+    @Test
+    public void testExceedRecordLimit() throws Exception
+    {
+        CommitLog.instance.resetUnsafe();
+        try
+        {
+            Mutation rm = new Mutation("Keyspace1", bytes("k"));
+            rm.add("Standard1", Util.cellname("c1"), ByteBuffer.allocate(1 + getMaxRecordDataSize()), 0);
+            CommitLog.instance.add(rm);
+            throw new AssertionError("mutation larger than limit was accepted");
+        }
+        catch (IllegalArgumentException e)
+        {
+            // IAE is thrown on too-large mutations
+        }
+    }
+
     protected void testRecoveryWithBadSizeArgument(int size, int dataSize) throws Exception
     {
         Checksum checksum = new CRC32();
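For readers following the size arithmetic in the test helper of this commit, here is a minimal standalone sketch of the same calculation. It is not part of the patch: the class name, the 32 MiB segment size and the 63-byte serialized size of the empty mutation are hypothetical values chosen for illustration. Only the shape of the formula (half the segment size, minus the 4 + 8 + 8 entry overhead, minus the size of the mutation with an empty value) is taken from the diff.

```java
// Sketch only: the segment size and empty-mutation size used in main() are
// illustrative assumptions, not values read from a Cassandra node.
final class RecordLimitSketch
{
    // Per-entry commit log overhead, as written in the diff: 4 + 8 + 8 bytes.
    static final int ENTRY_OVERHEAD_SIZE = 4 + 8 + 8;

    // Largest cell value that still fits, following the helper in the diff:
    // start from half the segment size, subtract the entry overhead, then
    // subtract the serialized size of the same mutation with an empty value.
    static int maxRecordDataSize(int segmentSizeBytes, int emptyMutationSizeBytes)
    {
        int max = segmentSizeBytes / 2;
        max -= ENTRY_OVERHEAD_SIZE;
        return max - emptyMutationSizeBytes;
    }

    public static void main(String[] args)
    {
        int segmentSize = 32 * 1024 * 1024; // assumed segment size
        int emptyMutationSize = 63;         // hypothetical serialized size
        int limit = maxRecordDataSize(segmentSize, emptyMutationSize);
        System.out.println("largest accepted value: " + limit + " bytes");
        System.out.println("first rejected value:   " + (limit + 1) + " bytes");
    }
}
```

The two tests in the diff probe exactly these two values: a value of the limit size is expected to be accepted, and one byte more is expected to raise IllegalArgumentException.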
[1/3] git commit: fix CLTest post-#6764 patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6764
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 75c185199 - 44f4e7901 refs/heads/trunk b1ea0aa9e - 47e81bf37 fix CLTest post-#6764 patch by Benedict Elliott Smith; reviewed by jbellis for CASSANDRA-6764 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/44f4e790 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/44f4e790 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/44f4e790 Branch: refs/heads/cassandra-2.1 Commit: 44f4e790196ff6425255cd12cfd100ddf9415524 Parents: 75c1851 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 08:47:11 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 08:47:23 2014 -0500 -- .../org/apache/cassandra/db/CommitLogTest.java | 38 ++-- 1 file changed, 36 insertions(+), 2 deletions(-) --
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/47e81bf3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/47e81bf3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/47e81bf3 Branch: refs/heads/trunk Commit: 47e81bf37540df69dde7a2c32cc75dbba476765f Parents: b1ea0aa 44f4e79 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 08:47:31 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 08:47:31 2014 -0500 -- .../org/apache/cassandra/db/CommitLogTest.java | 38 ++-- 1 file changed, 36 insertions(+), 2 deletions(-) --
[jira] [Commented] (CASSANDRA-6764) Using Batch commitlog_sync is slow and doesn't actually batch writes
[ https://issues.apache.org/jira/browse/CASSANDRA-6764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976782#comment-13976782 ] Jonathan Ellis commented on CASSANDRA-6764: --- committed Using Batch commitlog_sync is slow and doesn't actually batch writes Key: CASSANDRA-6764 URL: https://issues.apache.org/jira/browse/CASSANDRA-6764 Project: Cassandra Issue Type: Improvement Components: Core Reporter: John Carrino Assignee: John Carrino Fix For: 2.1 beta2 Attachments: 6764.fix.txt, 6764.fix2.txt, cassandra_6764_v2.patch, cassandra_6764_v3.patch The assumption behind batch commit mode is that the client does its own batching and wants to wait until the write is durable before returning. The problem is that the queue that cassandra uses under the covers only allows for a single ROW (RowMutation) per thread (concurrent_writes). This means that commitlog_sync_batch_window_in_ms should really be called sleep_between each_concurrent_writes_rows_in_ms. I assume the reason this slipped by for so long is that no one uses batch mode, probably because people say it's slow. We need durability so this isn't an option. However it doesn't need to be this slow. Also, if you write a row that is larger than the commit log size it silently (warn) fails to put it in the commit log. This is not ideal for batch mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6916) Preemptive opening of compaction result
[ https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976788#comment-13976788 ] Marcus Eriksson commented on CASSANDRA-6916: New patch comments: * Cleanup is broken: first, the log message causes an NPE since newSstable is null; second, an immutable set is passed as rewriting to SSTableRewriter, causing moveStarts to fail since it calls removeAll on that set. Both fixed here: https://github.com/krummas/cassandra/commits/bes/6916-2 * A bit of javadoc around moveStarts could be helpful; just looking at the API now makes it feel like we are moving the start of newReader. * When doing anticompaction, should we not clean up old readers? (repairedSSTableWriter.finish(false, repairedAt);) * Do we need to move starts back in SSTableRewriter.resetAndTruncate()? If we resetAndTruncate right after doing early opening, I think we could create a gap between the start in the compacting file and the end in the written one. Preemptive opening of compaction result --- Key: CASSANDRA-6916 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1 Attachments: 6916-stock2_1.mixed.cache_tweaks.tar.gz, 6916-stock2_1.mixed.logs.tar.gz, 6916v3-preempive-open-compact.logs.gz, 6916v3-preempive-open-compact.mixed.2.logs.tar.gz, 6916v3-premptive-open-compact.mixed.cache_tweaks.2.tar.gz Related to CASSANDRA-6812, but a little simpler: when compacting, we mess quite badly with the page cache. One thing we can do to mitigate this problem is to use the sstable we're writing before we've finished writing it, and to drop the regions from the old sstables from the page cache as soon as the new sstables have them (even if they're only written to the page cache). 
This should minimise any page cache churn, as the old sstables must be larger than the new sstable, and since both will be in memory, dropping the old sstables is at least as good as dropping the new. The approach is quite straight-forward. Every X MB written: # grab flushed length of index file; # grab second to last index summary record, after excluding those that point to positions after the flushed length; # open index file, and check that our last record doesn't occur outside of the flushed length of the data file (pretty unlikely) # Open the sstable with the calculated upper bound Some complications: # must keep running copy of compression metadata for reopening with # we need to be able to replace an sstable with itself but a different lower bound # we need to drop the old page cache only when readers have finished -- This message was sent by Atlassian JIRA (v6.2#6252)
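As a rough illustration of the second step in the description (taking the second-to-last index summary record after excluding those that point past the flushed length), here is a small sketch. It is not code from the patch: the class and method names are hypothetical, summary records are reduced to a sorted array of index-file positions, and treating a record as usable only when it starts strictly before the flushed length is an assumption of this sketch.

```java
// Sketch only: a summary record is modelled as nothing more than the
// index-file position it points to; names here are hypothetical.
final class EarlyOpenSketch
{
    // Returns the index (into summaryPositions) of the second-to-last summary
    // record that starts before flushedLength, or -1 if fewer than two records
    // qualify. summaryPositions must be sorted in ascending order.
    static int earlyOpenBoundary(long[] summaryPositions, long flushedLength)
    {
        int usable = 0;
        // Count the leading records that point inside the flushed region.
        while (usable < summaryPositions.length && summaryPositions[usable] < flushedLength)
            usable++;
        return usable >= 2 ? usable - 2 : -1;
    }

    public static void main(String[] args)
    {
        long[] positions = { 0, 100, 200, 300, 400 };
        // With 250 bytes flushed, records at 0, 100 and 200 qualify,
        // so the second-to-last of those is at array index 1.
        System.out.println(earlyOpenBoundary(positions, 250));
    }
}
```

Stepping back one record from the last usable one leaves a margin, so the chosen upper bound refers to data that is very likely already flushed in the data file as well; the description still calls for an explicit check of that in the third step.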
[jira] [Updated] (CASSANDRA-3668) Parallel streaming for sstableloader
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3668: -- Fix Version/s: (was: 2.1 beta2) 2.1 rc1 Parallel streaming for sstableloader Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Improvement Components: API Reporter: Manish Zope Assignee: Joshua McKenzie Priority: Minor Labels: streaming Fix For: 2.1 rc1 Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3668_v2.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt Original Estimate: 48h Remaining Estimate: 48h One of my colleagues reported a bug regarding the degraded performance of the sstable generator and sstable loader. ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589 As stated in that issue, generator performance has been rectified, but performance of the sstableloader is still an issue. 3589 is marked as a duplicate of 3618. Both issues show resolved status, but the problem with sstableloader still exists. So I am opening another issue so that the sstableloader problem does not go unnoticed. FYI: We have tested the generator part with the patch given in 3589. It's working fine. Please let us know if you require further input from our side. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-4718: Reviewer: Benedict (was: Pavel Yaskevich) More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Jason Brown Priority: Minor Labels: performance Fix For: 2.1 Attachments: 4718-v1.patch, PerThreadQueue.java, baq vs trunk.png, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
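To make the drainTo idea in the description concrete, here is a minimal sketch of a consumer that dequeues in bulk. It is not the proposed ExecutorService: the class name, method name and batch size are hypothetical, and only BlockingQueue.take() and BlockingQueue.drainTo(Collection, int) from the JDK are relied on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: one consumer step that takes work in bulk rather than one
// task per dequeue. Names are hypothetical.
final class DrainingConsumerSketch
{
    // Blocks for the first task, then drains up to maxBatch - 1 more tasks
    // without blocking, and runs them all. Returns the number of tasks run.
    // maxBatch must be at least 1.
    static int runOneBatch(BlockingQueue<Runnable> queue, int maxBatch) throws InterruptedException
    {
        List<Runnable> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // wait until there is some work
        queue.drainTo(batch, maxBatch - 1); // grab whatever else is already queued
        for (Runnable task : batch)
            task.run();
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++)
        {
            final int n = i;
            queue.add(() -> System.out.println("task " + n));
        }
        System.out.println("ran " + runOneBatch(queue, 3) + " tasks");
        System.out.println("ran " + runOneBatch(queue, 3) + " tasks");
    }
}
```

The blocking take() keeps an idle consumer parked exactly as before, while the follow-up drainTo() means a busy consumer touches the queue once per batch instead of once per task, which is where the reduction in producer/consumer contention would be expected to come from.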
[jira] [Commented] (CASSANDRA-4784) Create separate sstables for each token range handled by a node
[ https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976792#comment-13976792 ] T Jake Luciani commented on CASSANDRA-4784: --- Another issue would be splitting the sstables when new vnode ranges are added to the cluster. Replicas can change on any node, causing the old ranges to split. The same applies if anyone decommissions a node. It may be simpler to start with taking a contiguous range of vnodes and putting them in a directory partition. You could have a fixed number of these (one per disk?). It would limit the damage of a dead drive in JBOD mode to a section of vnode ranges. Create separate sstables for each token range handled by a node --- Key: CASSANDRA-4784 URL: https://issues.apache.org/jira/browse/CASSANDRA-4784 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: sankalp kohli Priority: Minor Fix For: 2.1 beta2 Attachments: 4784.patch Currently, each sstable has data for all the ranges that node is handling. If we change that and rather have separate sstables for each range that node is handling, it can lead to some improvements. Improvements 1) Node rebuild will be very fast as sstables can be directly copied over to the bootstrapping node. It will minimize any application level logic. We can directly use Linux native methods to transfer sstables without using CPU and putting less pressure on the serving node. I think in theory it will be the fastest way to transfer data. 2) Backup can transfer only the sstables of a node which belong to its primary key range. 3) An ETL process can copy only one replica of the data and will be much faster. Changes: We can split the writes into multiple memtables for each range it is handling. The sstables being flushed from these can have details of which range of data it is handling. There will be no change I think for any reads as they work with interleaved data anyway. 
But maybe we can improve there as well? Complexities: The change does not look very complicated. I am not taking into account how it will work when ranges are being changed for nodes. Vnodes might make this work more complicated. We can also have a bit on each sstable which says whether it is primary data or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
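The "split the writes into multiple memtables for each range" idea from the description can be pictured with a small routing sketch. This is not taken from the attached patch: the class is hypothetical, tokens are plain long values, each range is identified by its inclusive upper-bound token, and a bucket id stands in for a per-range memtable.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: routes a token to the bucket (standing in for a per-range
// memtable) of the range that owns it. Names and types are hypothetical.
final class RangeRouterSketch
{
    // Inclusive upper-bound token of each range, mapped to its bucket id.
    private final TreeMap<Long, Integer> rangeEnds = new TreeMap<>();

    void addRange(long endTokenInclusive, int bucket)
    {
        rangeEnds.put(endTokenInclusive, bucket);
    }

    // A token belongs to the first range whose end is >= the token; tokens
    // above the highest end wrap around to the first range on the ring.
    int bucketFor(long token)
    {
        Map.Entry<Long, Integer> owner = rangeEnds.ceilingEntry(token);
        if (owner == null)
            owner = rangeEnds.firstEntry();
        if (owner == null)
            throw new IllegalStateException("no ranges registered");
        return owner.getValue();
    }

    public static void main(String[] args)
    {
        RangeRouterSketch router = new RangeRouterSketch();
        router.addRange(100, 0);
        router.addRange(200, 1);
        router.addRange(300, 2);
        System.out.println(router.bucketFor(150)); // falls in the range ending at 200
        System.out.println(router.bucketFor(301)); // wraps to the range ending at 100
    }
}
```

The complication raised in the comment shows up directly in this model: adding a vnode or decommissioning a node changes the set of range ends, so data already flushed under the old boundaries would need to be split or reassigned.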
[3/3] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3dad8ca6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3dad8ca6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3dad8ca6 Branch: refs/heads/cassandra-2.0 Commit: 3dad8ca60c14a6c57a1a2830310b14d50d36b0c0 Parents: de720b4 0547d16 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 09:17:31 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 09:17:31 2014 -0500 -- --
[1/3] git commit: Fix schema concurrency exceptions (backport of #6841)
Repository: cassandra Updated Branches: refs/heads/cassandra-1.2 8d1acd93f - 0547d16d5 refs/heads/cassandra-2.0 de720b4aa - 3dad8ca60 Fix schema concurrency exceptions (backport of #6841) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0547d16d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0547d16d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0547d16d Branch: refs/heads/cassandra-1.2 Commit: 0547d16d5f5475e66c339ed779cf561c52869445 Parents: 8d1acd9 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 09:15:29 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 09:17:00 2014 -0500 -- CHANGES.txt | 1 + .../cassandra/db/commitlog/CommitLogAllocator.java | 2 +- .../cassandra/db/commitlog/CommitLogSegment.java | 15 +-- 3 files changed, 11 insertions(+), 7 deletions(-) --
[2/3] git commit: Fix schema concurrency exceptions (backport of #6841)
Fix schema concurrency exceptions (backport of #6841)

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0547d16d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0547d16d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0547d16d
Branch: refs/heads/cassandra-2.0
Commit: 0547d16d5f5475e66c339ed779cf561c52869445
Parents: 8d1acd9
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Apr 22 09:15:29 2014 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Apr 22 09:17:00 2014 -0500

--
 CHANGES.txt | 1 +
 .../cassandra/db/commitlog/CommitLogAllocator.java | 2 +-
 .../cassandra/db/commitlog/CommitLogSegment.java | 15 +--
 3 files changed, 11 insertions(+), 7 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8cfffad..dc48131 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2.17
+ * Fix schema concurrency exceptions (CASSANDRA-6841)
  * Fix BatchlogManager#deleteBatch() use of millisecond timsestamps (CASSANDRA-6822)
  * Continue assassinating even if the endpoint vanishes (CASSANDRA-6787)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
index d62d7ca..c668377 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
@@ -293,7 +293,7 @@ public class CommitLogAllocator
         {
             CommitLogSegment oldestSegment = activeSegments.peek();
 
-            if (oldestSegment != null)
+            if (oldestSegment != null && oldestSegment != CommitLog.instance.activeSegment)
             {
                 for (UUID dirtyCFId : oldestSegment.getDirtyCFIDs())
                 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index c0c7918..bd50b60 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -22,10 +22,13 @@ import java.io.IOException;
 import java.io.RandomAccessFile;
 import java.nio.MappedByteBuffer;
 import java.nio.channels.FileChannel;
+import java.util.ArrayList;
 import java.util.Collection;
 import java.util.Comparator;
 import java.util.HashMap;
+import java.util.Map;
 import java.util.UUID;
+import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.zip.Checksum;
 
@@ -59,7 +62,7 @@ public class CommitLogSegment
     static final int ENTRY_OVERHEAD_SIZE = 4 + 8 + 8;
 
     // cache which cf is dirty in this segment to avoid having to lookup all ReplayPositions to decide if we can delete this segment
-    private final HashMap<UUID, Integer> cfLastWrite = new HashMap<UUID, Integer>();
+    private final Map<UUID, Integer> cfLastWrite = new HashMap<UUID, Integer>();
 
     public final long id;
 
@@ -316,7 +319,7 @@ public class CommitLogSegment
      * @param cfId    the column family ID that is now clean
      * @param context the optional clean offset
      */
-    public void markClean(UUID cfId, ReplayPosition context)
+    public synchronized void markClean(UUID cfId, ReplayPosition context)
     {
         Integer lastWritten = cfLastWrite.get(cfId);
 
@@ -329,15 +332,15 @@ public class CommitLogSegment
     /**
      * @return a collection of dirty CFIDs for this segment file.
      */
-    public Collection<UUID> getDirtyCFIDs()
+    public synchronized Collection<UUID> getDirtyCFIDs()
     {
-        return cfLastWrite.keySet();
+        return new ArrayList<UUID>(cfLastWrite.keySet());
     }
 
     /**
      * @return true if this segment is unused and safe to recycle or delete
      */
-    public boolean isUnused()
+    public synchronized boolean isUnused()
     {
         return cfLastWrite.isEmpty();
     }
@@ -357,7 +360,7 @@ public class CommitLogSegment
     public String dirtyString()
     {
         StringBuilder sb = new StringBuilder();
-        for (UUID cfId : cfLastWrite.keySet())
+        for (UUID cfId : getDirtyCFIDs())
         {
             CFMetaData m =
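The pattern this commit applies to CommitLogSegment (synchronized accessors plus returning a copy of the key set instead of a live view) can be shown in isolation. The class below is a simplified, hypothetical stand-in, not the real CommitLogSegment: markDirty and the one-argument markClean are invented for the sketch and do not match the real signatures.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Sketch only: a cut-down stand-in showing why getDirtyCFIDs() returns a
// snapshot. Method signatures are simplified and hypothetical.
final class DirtyTrackerSketch
{
    private final Map<UUID, Integer> cfLastWrite = new HashMap<>();

    synchronized void markDirty(UUID cfId, int position)
    {
        cfLastWrite.put(cfId, position);
    }

    synchronized void markClean(UUID cfId)
    {
        cfLastWrite.remove(cfId);
    }

    // Returns a copy taken under the lock, so callers can iterate it while
    // other calls (or the loop body itself) modify the underlying map.
    synchronized Collection<UUID> getDirtyCFIDs()
    {
        return new ArrayList<>(cfLastWrite.keySet());
    }

    synchronized boolean isUnused()
    {
        return cfLastWrite.isEmpty();
    }

    public static void main(String[] args)
    {
        DirtyTrackerSketch tracker = new DirtyTrackerSketch();
        tracker.markDirty(UUID.randomUUID(), 10);
        tracker.markDirty(UUID.randomUUID(), 20);
        // Iterating the snapshot while removing from the map is safe; doing
        // the same over a live keySet() view of a HashMap would fail.
        for (UUID cfId : tracker.getDirtyCFIDs())
            tracker.markClean(cfId);
        System.out.println("unused: " + tracker.isUnused());
    }
}
```

Returning the live keySet() of a plain HashMap, as the pre-patch code did, lets an iteration in one thread race with a put or remove in another, which is a typical source of ConcurrentModificationException.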
[jira] [Commented] (CASSANDRA-6949) Performance regression in tombstone heavy workloads
[ https://issues.apache.org/jira/browse/CASSANDRA-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976814#comment-13976814 ] Benedict commented on CASSANDRA-6949: - Ah, right... but that only works on values that actually make it as far as the sstables - anything that is overwritten in memory will never be removed by that code path if I'm reading it right? Performance regression in tombstone heavy workloads --- Key: CASSANDRA-6949 URL: https://issues.apache.org/jira/browse/CASSANDRA-6949 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Assignee: Sam Tunnicliffe Attachments: 0001-Remove-expansion-of-RangeTombstones-to-delete-from-2.patch, 6949.txt CASSANDRA-5614 causes a huge performance regression in tombstone heavy workloads. The isDeleted checks here cause a huge CPU overhead: https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/AtomicSortedColumns.java#L189-L196 An insert workload which does perfectly fine on 1.2, pegs CPU use at 100% on 2.0, with all of the mutation threads sitting in that loop. 
For example:
{noformat}
"MutationStage:20" daemon prio=10 tid=0x7fb1c4c72800 nid=0x2249 runnable [0x7fb1b033]
   java.lang.Thread.State: RUNNABLE
    at org.apache.cassandra.db.marshal.BytesType.bytesCompare(BytesType.java:45)
    at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:34)
    at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:26)
    at org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:267)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
    at org.apache.cassandra.db.RangeTombstoneList.searchInternal(RangeTombstoneList.java:253)
    at org.apache.cassandra.db.RangeTombstoneList.isDeleted(RangeTombstoneList.java:210)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:136)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:123)
    at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:193)
    at org.apache.cassandra.db.Memtable.resolve(Memtable.java:194)
    at org.apache.cassandra.db.Memtable.put(Memtable.java:158)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:890)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:201)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6974) Replaying archived commitlogs isn't working
[ https://issues.apache.org/jira/browse/CASSANDRA-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976817#comment-13976817 ] Jonathan Ellis commented on CASSANDRA-6974: --- bq. replay of mutations that are associated with column families that have been assigned a different UUID Hang on, that sounds like a non-goal to me. If I had table users, dropped it, recreated w/ new schema, then pitr of rows against old table should not be attempted for new one. Isn't that how it works? Replaying archived commitlogs isn't working --- Key: CASSANDRA-6974 URL: https://issues.apache.org/jira/browse/CASSANDRA-6974 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Benedict Fix For: 2.1 beta2 Attachments: 2.0.system.log, 2.1.system.log I have a test for restoring archived commitlogs, which is not working in 2.1 HEAD. My commitlogs consist of 30,000 inserts, but system.log indicates there were only 2 mutations replayed: {code} INFO [main] 2014-04-02 11:49:54,173 CommitLog.java:115 - Log replay complete, 2 replayed mutations {code} There are several warnings in the logs about bad headers and invalid CRCs: {code} WARN [main] 2014-04-02 11:49:54,156 CommitLogReplayer.java:138 - Encountered bad header at position 0 of commit log /tmp/dtest-mZIlPE/test/node1/commitlogs/CommitLog-4-1396453793570.log, with invalid CRC. The end of segment marker should be zero. {code} Compare that to the same test run on 2.0, where it replayed many more mutations: {code} INFO [main] 2014-04-02 11:49:04,673 CommitLog.java (line 132) Log replay complete, 35960 replayed mutations {code} I'll attach the system logs for reference. [Here is the dtest to reproduce this|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L75] - (This currently relies on the fix for snapshots available in CASSANDRA-6965.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7046) Update nodetool commands to output the date and time they were run on
[ https://issues.apache.org/jira/browse/CASSANDRA-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976818#comment-13976818 ] Brandon Williams commented on CASSANDRA-7046: - Perhaps you're asking for the wrong logs, because ~/.cassandra/nodetool.history has exactly what you want, full timestamps. Update nodetool commands to output the date and time they were run on - Key: CASSANDRA-7046 URL: https://issues.apache.org/jira/browse/CASSANDRA-7046 Project: Cassandra Issue Type: Improvement Reporter: Johnny Miller Priority: Trivial Labels: lhf It would help if the various nodetool commands also outputted the system date time they were run. Often these commands are executed and then we look at the cassandra log files to try and find out what was happening at that time. This is certainly just a convenience feature, but it would be nice to have the information in there to aid with diagnostics. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6974) Replaying archived commitlogs isn't working
[ https://issues.apache.org/jira/browse/CASSANDRA-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976819#comment-13976819 ] Benedict commented on CASSANDRA-6974: - I think restore involves recreating the CF? I'm not familiar with it though, so it's possible it is indeed a non-issue, and only the dtest needs to be fixed after this test Replaying archived commitlogs isn't working --- Key: CASSANDRA-6974 URL: https://issues.apache.org/jira/browse/CASSANDRA-6974 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Benedict Fix For: 2.1 beta2 Attachments: 2.0.system.log, 2.1.system.log
[jira] [Updated] (CASSANDRA-6949) Performance regression in tombstone heavy workloads
[ https://issues.apache.org/jira/browse/CASSANDRA-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6949: -- Fix Version/s: 2.0.8 Performance regression in tombstone heavy workloads --- Key: CASSANDRA-6949 URL: https://issues.apache.org/jira/browse/CASSANDRA-6949 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Assignee: Sam Tunnicliffe Fix For: 2.0.8 Attachments: 0001-Remove-expansion-of-RangeTombstones-to-delete-from-2.patch, 6949.txt
CASSANDRA-5614 causes a huge performance regression in tombstone heavy workloads. The isDeleted checks here cause a huge CPU overhead: https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/db/AtomicSortedColumns.java#L189-L196 An insert workload which does perfectly fine on 1.2, pegs CPU use at 100% on 2.0, with all of the mutation threads sitting in that loop. For example:
{noformat}
MutationStage:20 daemon prio=10 tid=0x7fb1c4c72800 nid=0x2249 runnable [0x7fb1b033]
   java.lang.Thread.State: RUNNABLE
    at org.apache.cassandra.db.marshal.BytesType.bytesCompare(BytesType.java:45)
    at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:34)
    at org.apache.cassandra.db.marshal.UTF8Type.compare(UTF8Type.java:26)
    at org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:267)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
    at org.apache.cassandra.db.RangeTombstoneList.searchInternal(RangeTombstoneList.java:253)
    at org.apache.cassandra.db.RangeTombstoneList.isDeleted(RangeTombstoneList.java:210)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:136)
    at org.apache.cassandra.db.DeletionInfo.isDeleted(DeletionInfo.java:123)
    at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:193)
    at org.apache.cassandra.db.Memtable.resolve(Memtable.java:194)
    at org.apache.cassandra.db.Memtable.put(Memtable.java:158)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:890)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:201)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
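To make the cost in that trace concrete: every cell added to the memtable appears to trigger an isDeleted lookup, which searches the range tombstone list and calls the comparator at each step. Below is a minimal sketch of such a lookup, not the real RangeTombstoneList. The class name, the use of plain longs in place of composite cell names, and the "timestamp <= markedForDeleteAt" rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a range tombstone list; not Cassandra code.
final class RangeTombstoneSketch
{
    // Each entry is {start, end, markedForDeleteAt}. Entries are assumed to be
    // added in sorted order and to cover non-overlapping, inclusive ranges.
    private final List<long[]> ranges = new ArrayList<long[]>();

    void add(long start, long end, long markedForDeleteAt)
    {
        ranges.add(new long[]{ start, end, markedForDeleteAt });
    }

    // Binary search for the range covering 'name'. With real composite names,
    // every comparison below would be a full comparator call.
    boolean isDeleted(long name, long timestamp)
    {
        int lo = 0;
        int hi = ranges.size() - 1;
        while (lo <= hi)
        {
            int mid = (lo + hi) >>> 1;
            long[] r = ranges.get(mid);
            if (name < r[0])
                hi = mid - 1;
            else if (name > r[1])
                lo = mid + 1;
            else
                return timestamp <= r[2]; // assumed rule: shadowed if not newer than the tombstone
        }
        return false; // no range covers this name
    }
}
```

With n range tombstones in a partition, each inserted cell pays roughly log2(n) comparisons under this model, so a tombstone heavy partition makes every write proportionally more expensive.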
[jira] [Commented] (CASSANDRA-6551) Rack-aware batchlog replication
[ https://issues.apache.org/jira/browse/CASSANDRA-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976827#comment-13976827 ] Aleksey Yeschenko commented on CASSANDRA-6551: -- [~mishail] Want to give this a try? Rack-aware batchlog replication --- Key: CASSANDRA-6551 URL: https://issues.apache.org/jira/browse/CASSANDRA-6551 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Rick Branson Assignee: Aleksey Yeschenko Priority: Minor Fix For: 2.0.8 Right now the batchlog replication code just randomly picks 2 other nodes in the same DC, regardless of rack. Ideally we'd pick 2 replicas in other racks to achieve higher fault tolerance. -- This message was sent by Atlassian JIRA (v6.2#6252)
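For anyone picking this up, here is a rough sketch of the selection the description asks for: prefer two endpoints in racks other than the coordinator's, and fall back to any remaining endpoint when fewer than two other racks are available. RackAwareBatchlogPicker, its pick method, and the endpoint-to-rack map are hypothetical names for illustration; none of this is the existing batchlog code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not the actual Cassandra implementation.
final class RackAwareBatchlogPicker
{
    // endpointRacks maps endpoint -> rack for live nodes in the local DC,
    // excluding the coordinator itself (an assumption of this sketch).
    static List<String> pick(String localRack, Map<String, String> endpointRacks, Random random)
    {
        // Group candidate endpoints by rack.
        Map<String, List<String>> byRack = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : endpointRacks.entrySet())
            byRack.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());

        // Consider racks other than the local one, in random order.
        List<String> otherRacks = new ArrayList<>(byRack.keySet());
        otherRacks.remove(localRack);
        Collections.shuffle(otherRacks, random);

        // Take one random endpoint from each of up to two distinct other racks.
        List<String> chosen = new ArrayList<>();
        for (String rack : otherRacks)
        {
            if (chosen.size() == 2)
                break;
            List<String> nodes = byRack.get(rack);
            chosen.add(nodes.get(random.nextInt(nodes.size())));
        }

        // Fall back to any remaining endpoints if fewer than two were found.
        if (chosen.size() < 2)
        {
            List<String> rest = new ArrayList<>(endpointRacks.keySet());
            rest.removeAll(chosen);
            Collections.shuffle(rest, random);
            for (String node : rest)
            {
                if (chosen.size() == 2)
                    break;
                chosen.add(node);
            }
        }
        return chosen;
    }
}
```

Picking from two distinct racks rather than two nodes in one other rack is a design choice of the sketch; it means losing any single rack leaves at least one batchlog copy reachable.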
[jira] [Commented] (CASSANDRA-6949) Performance regression in tombstone heavy workloads
[ https://issues.apache.org/jira/browse/CASSANDRA-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976847#comment-13976847 ] Aleksey Yeschenko commented on CASSANDRA-6949: -- bq. Ah, right... but that only works on values that actually make it as far as the sstables - anything that is overwritten in memory will never be removed by that code path if I'm reading it right? Yes and no. If it's been rewritten by another regular cell or a regular tombstone, then it will be removed upon updating the memtable itself. This will always be true, and this is why we don't purge deleted cells on memtable flush, even if covered by range or partition tombstones. If overwritten by a range tombstone or a partition tombstone though, then yeah, that code path will not handle it. But that code can be extended trivially enough to handle range and partition tombstones on compaction to clean up stale 2i entries. Performance regression in tombstone heavy workloads --- Key: CASSANDRA-6949 URL: https://issues.apache.org/jira/browse/CASSANDRA-6949 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Assignee: Sam Tunnicliffe Fix For: 2.0.8 Attachments: 0001-Remove-expansion-of-RangeTombstones-to-delete-from-2.patch, 6949.txt
[jira] [Updated] (CASSANDRA-7058) HHOM and BM direct delivery should not cause hints to be written on timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7058: -- Attachment: 7058-simplified.txt simplified version attached that doesn't do extra refactoring since this is going into 1.2. otherwise LGTM. HHOM and BM direct delivery should not cause hints to be written on timeout --- Key: CASSANDRA-7058 URL: https://issues.apache.org/jira/browse/CASSANDRA-7058 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Fix For: 1.2.17, 2.0.8, 2.1 beta2 Attachments: 7058-simplified.txt, 7058.txt, 7058.txt Currently, a timed out HHOM hint delivery would create a further hint, with a wrong TTL. BM direct delivery code is using the same code snippet basically, so is also affected (with slightly worse consequences). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6949) Performance regression in tombstone heavy workloads
[ https://issues.apache.org/jira/browse/CASSANDRA-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976853#comment-13976853 ] Benedict commented on CASSANDRA-6949: - But then all we're doing is moving this cost into the compaction workload - it will be just as expensive there, and is actually much harder to optimise algorithmically. Is that really preferable? Performance regression in tombstone heavy workloads --- Key: CASSANDRA-6949 URL: https://issues.apache.org/jira/browse/CASSANDRA-6949 Project: Cassandra Issue Type: Bug Reporter: Jeremiah Jordan Assignee: Sam Tunnicliffe Fix For: 2.0.8 Attachments: 0001-Remove-expansion-of-RangeTombstones-to-delete-from-2.patch, 6949.txt
[07/10] git commit: Don't NPE when username is supplied but password isn't.
Don't NPE when username is supplied but password isn't. Patch by Mike Adamson, reviewed by brandonwilliams for CASSANDRA-7050
Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8b8042b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8b8042b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8b8042b0
Branch: refs/heads/trunk
Commit: 8b8042b032fd93103fa6c74fc4b751e0dd9a207b
Parents: 3dad8ca
Author: Brandon Williams brandonwilli...@apache.org
Authored: Tue Apr 22 09:42:04 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Tue Apr 22 09:42:04 2014 -0500
--
 .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +-
 .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index 03b1576..73bc25c 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -103,7 +103,7 @@ public abstract class AbstractColumnFamilyInputFormat<K, Y> extends InputFormat
         // log in
         client.set_keyspace(ConfigHelper.getInputKeyspace(conf));
-        if (ConfigHelper.getInputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getInputKeyspaceUserName(conf) != null) && (ConfigHelper.getInputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getInputKeyspaceUserName(conf));
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index 3041829..96ca65d 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
@@ -124,7 +124,7 @@ public abstract class AbstractColumnFamilyOutputFormat<K, Y> extends OutputForma
         TProtocol binaryProtocol = new TBinaryProtocol(transport, true, true);
         Cassandra.Client client = new Cassandra.Client(binaryProtocol);
         client.set_keyspace(ConfigHelper.getOutputKeyspace(conf));
-        if (ConfigHelper.getOutputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getOutputKeyspaceUserName(conf) != null) && (ConfigHelper.getOutputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getOutputKeyspaceUserName(conf));
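The change above only attempts the login when both the username and the password are configured. A standalone sketch of the same guard follows. CredentialGuard, credentialsOrNull, and the two key strings are hypothetical names chosen for illustration; they are not ConfigHelper or IAuthenticator.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only; not part of the Cassandra hadoop package.
final class CredentialGuard
{
    // Assumed key names for this sketch.
    static final String USERNAME_KEY = "username";
    static final String PASSWORD_KEY = "password";

    // Returns a credentials map only when both values are present.
    // A null result means the caller should skip the login step entirely.
    static Map<String, String> credentialsOrNull(String username, String password)
    {
        if (username == null || password == null)
            return null;

        Map<String, String> creds = new HashMap<String, String>();
        creds.put(USERNAME_KEY, username);
        creds.put(PASSWORD_KEY, password);
        return creds;
    }
}
```

Centralising the check in one helper would keep the input and output format classes from drifting apart, though the actual patch simply repeats the condition in both places.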
[03/10] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3dad8ca6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3dad8ca6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3dad8ca6 Branch: refs/heads/cassandra-2.1 Commit: 3dad8ca60c14a6c57a1a2830310b14d50d36b0c0 Parents: de720b4 0547d16 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 09:17:31 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 09:17:31 2014 -0500 -- --
[08/10] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c531f537 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c531f537 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c531f537 Branch: refs/heads/trunk Commit: c531f537fe68610430218904876e30c9ceba21ee Parents: 44f4e79 8b8042b Author: Brandon Williams brandonwilli...@apache.org Authored: Tue Apr 22 09:43:01 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Tue Apr 22 09:43:01 2014 -0500 -- .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +- .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c531f537/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java --
[05/10] git commit: Don't NPE when username is supplied but password isn't.
Don't NPE when username is supplied but password isn't. Patch by Mike Adamson, reviewed by brandonwilliams for CASSANDRA-7050
Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8b8042b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8b8042b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8b8042b0
Branch: refs/heads/cassandra-2.1
Commit: 8b8042b032fd93103fa6c74fc4b751e0dd9a207b
Parents: 3dad8ca
Author: Brandon Williams brandonwilli...@apache.org
Authored: Tue Apr 22 09:42:04 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Tue Apr 22 09:42:04 2014 -0500
--
 .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +-
 .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index 03b1576..73bc25c 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -103,7 +103,7 @@ public abstract class AbstractColumnFamilyInputFormat<K, Y> extends InputFormat
         // log in
         client.set_keyspace(ConfigHelper.getInputKeyspace(conf));
-        if (ConfigHelper.getInputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getInputKeyspaceUserName(conf) != null) && (ConfigHelper.getInputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getInputKeyspaceUserName(conf));
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index 3041829..96ca65d 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
@@ -124,7 +124,7 @@ public abstract class AbstractColumnFamilyOutputFormat<K, Y> extends OutputForma
         TProtocol binaryProtocol = new TBinaryProtocol(transport, true, true);
         Cassandra.Client client = new Cassandra.Client(binaryProtocol);
         client.set_keyspace(ConfigHelper.getOutputKeyspace(conf));
-        if (ConfigHelper.getOutputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getOutputKeyspaceUserName(conf) != null) && (ConfigHelper.getOutputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getOutputKeyspaceUserName(conf));
[09/10] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c531f537 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c531f537 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c531f537 Branch: refs/heads/cassandra-2.1 Commit: c531f537fe68610430218904876e30c9ceba21ee Parents: 44f4e79 8b8042b Author: Brandon Williams brandonwilli...@apache.org Authored: Tue Apr 22 09:43:01 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Tue Apr 22 09:43:01 2014 -0500 -- .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +- .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c531f537/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java --
[10/10] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b7a1eb4c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b7a1eb4c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b7a1eb4c Branch: refs/heads/trunk Commit: b7a1eb4ce846e39bb834ebe5b094f85cd8762166 Parents: 47e81bf c531f53 Author: Brandon Williams brandonwilli...@apache.org Authored: Tue Apr 22 09:43:11 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Tue Apr 22 09:43:11 2014 -0500 -- .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +- .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --
[06/10] git commit: Don't NPE when username is supplied but password isn't.
Don't NPE when username is supplied but password isn't. Patch by Mike Adamson, reviewed by brandonwilliams for CASSANDRA-7050
Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8b8042b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8b8042b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8b8042b0
Branch: refs/heads/cassandra-2.0
Commit: 8b8042b032fd93103fa6c74fc4b751e0dd9a207b
Parents: 3dad8ca
Author: Brandon Williams brandonwilli...@apache.org
Authored: Tue Apr 22 09:42:04 2014 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Tue Apr 22 09:42:04 2014 -0500
--
 .../apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java | 2 +-
 .../apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
index 03b1576..73bc25c 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyInputFormat.java
@@ -103,7 +103,7 @@ public abstract class AbstractColumnFamilyInputFormat<K, Y> extends InputFormat
         // log in
         client.set_keyspace(ConfigHelper.getInputKeyspace(conf));
-        if (ConfigHelper.getInputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getInputKeyspaceUserName(conf) != null) && (ConfigHelper.getInputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getInputKeyspaceUserName(conf));
http://git-wip-us.apache.org/repos/asf/cassandra/blob/8b8042b0/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
--
diff --git a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
index 3041829..96ca65d 100644
--- a/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/AbstractColumnFamilyOutputFormat.java
@@ -124,7 +124,7 @@ public abstract class AbstractColumnFamilyOutputFormat<K, Y> extends OutputForma
         TProtocol binaryProtocol = new TBinaryProtocol(transport, true, true);
         Cassandra.Client client = new Cassandra.Client(binaryProtocol);
         client.set_keyspace(ConfigHelper.getOutputKeyspace(conf));
-        if (ConfigHelper.getOutputKeyspaceUserName(conf) != null)
+        if ((ConfigHelper.getOutputKeyspaceUserName(conf) != null) && (ConfigHelper.getOutputKeyspacePassword(conf) != null))
         {
             Map<String, String> creds = new HashMap<String, String>();
             creds.put(IAuthenticator.USERNAME_KEY, ConfigHelper.getOutputKeyspaceUserName(conf));
[04/10] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0
Merge branch 'cassandra-1.2' into cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3dad8ca6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3dad8ca6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3dad8ca6 Branch: refs/heads/trunk Commit: 3dad8ca60c14a6c57a1a2830310b14d50d36b0c0 Parents: de720b4 0547d16 Author: Jonathan Ellis jbel...@apache.org Authored: Tue Apr 22 09:17:31 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Tue Apr 22 09:17:31 2014 -0500 -- --
[01/10] git commit: Fix schema concurrency exceptions (backport of #6841)
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 3dad8ca60 -> 8b8042b03
  refs/heads/cassandra-2.1 44f4e7901 -> c531f537f
  refs/heads/trunk 47e81bf37 -> b7a1eb4ce
Fix schema concurrency exceptions (backport of #6841)
Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0547d16d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0547d16d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0547d16d
Branch: refs/heads/cassandra-2.1
Commit: 0547d16d5f5475e66c339ed779cf561c52869445
Parents: 8d1acd9
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Apr 22 09:15:29 2014 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Apr 22 09:17:00 2014 -0500
--
 CHANGES.txt | 1 +
 .../cassandra/db/commitlog/CommitLogAllocator.java | 2 +-
 .../cassandra/db/commitlog/CommitLogSegment.java | 15 +--
 3 files changed, 11 insertions(+), 7 deletions(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8cfffad..dc48131 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.2.17
+ * Fix schema concurrency exceptions (CASSANDRA-6841)
  * Fix BatchlogManager#deleteBatch() use of millisecond timsestamps
    (CASSANDRA-6822)
  * Continue assassinating even if the endpoint vanishes (CASSANDRA-6787)
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
index d62d7ca..c668377 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogAllocator.java
@@ -293,7 +293,7 @@ public class CommitLogAllocator
         {
             CommitLogSegment oldestSegment = activeSegments.peek();
-            if (oldestSegment != null)
+            if (oldestSegment != null && oldestSegment != CommitLog.instance.activeSegment)
             {
                 for (UUID dirtyCFId : oldestSegment.getDirtyCFIDs())
                 {
http://git-wip-us.apache.org/repos/asf/cassandra/blob/0547d16d/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
--
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index c0c7918..bd50b60 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -22,10 +22,13 @@ import java.io.IOException;
 import java.io.RandomAccessFile;
 import java.nio.MappedByteBuffer;
 import java.nio.channels.FileChannel;
+import java.util.ArrayList;
 import java.util.Collection;
 import java.util.Comparator;
 import java.util.HashMap;
+import java.util.Map;
 import java.util.UUID;
+import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.zip.Checksum;
@@ -59,7 +62,7 @@ public class CommitLogSegment
     static final int ENTRY_OVERHEAD_SIZE = 4 + 8 + 8;
     // cache which cf is dirty in this segment to avoid having to lookup all ReplayPositions to decide if we can delete this segment
-    private final HashMap<UUID, Integer> cfLastWrite = new HashMap<UUID, Integer>();
+    private final Map<UUID, Integer> cfLastWrite = new HashMap<UUID, Integer>();
     public final long id;
@@ -316,7 +319,7 @@ public class CommitLogSegment
      * @param cfId    the column family ID that is now clean
      * @param context the optional clean offset
      */
-    public void markClean(UUID cfId, ReplayPosition context)
+    public synchronized void markClean(UUID cfId, ReplayPosition context)
     {
         Integer lastWritten = cfLastWrite.get(cfId);
@@ -329,15 +332,15 @@ public class CommitLogSegment
     /**
      * @return a collection of dirty CFIDs for this segment file.
      */
-    public Collection<UUID> getDirtyCFIDs()
+    public synchronized Collection<UUID> getDirtyCFIDs()
     {
-        return cfLastWrite.keySet();
+        return new ArrayList<UUID>(cfLastWrite.keySet());
     }
     /**
      * @return true if this segment is unused and safe to recycle or delete
      */
-    public boolean isUnused()
+    public synchronized boolean isUnused()
     {
         return cfLastWrite.isEmpty();
     }
@@ -357,7 +360,7 @@ public class CommitLogSegment
     public String dirtyString()
     {
         StringBuilder sb