date:20171002


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-12961:
---
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Thanks [~vusal.ahmadoglu] ! Apologies for the delay in committing, it's up now 
as {{48562536f17a6e88aaf18d46e5ffa0a54c6b5be6}} . Congrats on your first commit 
to the project!.


> LCS needlessly checks for L0 STCS candidates multiple times
> ---
>
> Key: CASSANDRA-12961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12961
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Vusal Ahmadoglu
>Priority: Trivial
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-CASSANDRA-12961-Moving-getSTCSInL0CompactionCandidat.patch
>
>
> It's very likely that the check for L0 STCS candidates (if L0 is falling 
> behind) can be moved outside of the loop, or at very least made so that it's 
> not called on each loop iteration:
> {code}
> for (int i = generations.length - 1; i > 0; i--)
> {
> List sstables = getLevel(i);
> if (sstables.isEmpty())
> continue; // mostly this just avoids polluting the debug log 
> with zero scores
> // we want to calculate score excluding compacting ones
> Set sstablesInLevel = Sets.newHashSet(sstables);
> Set remaining = Sets.difference(sstablesInLevel, 
> cfs.getTracker().getCompacting());
> double score = (double) SSTableReader.getTotalBytes(remaining) / 
> (double)maxBytesForLevel(i, maxSSTableSizeInBytes);
> logger.trace("Compaction score for level {} is {}", i, score);
> if (score > 1.001)
> {
> // before proceeding with a higher level, let's see if L0 is 
> far enough behind to warrant STCS
> CompactionCandidate l0Compaction = 
> getSTCSInL0CompactionCandidate();
> if (l0Compaction != null)
> return l0Compaction;
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-12961) LCS needlessly checks for L0 STCS candidates multiple times


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-12961:
---
Status: Ready to Commit  (was: Patch Available)

> LCS needlessly checks for L0 STCS candidates multiple times
> ---
>
> Key: CASSANDRA-12961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12961
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Vusal Ahmadoglu
>Priority: Trivial
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 
> 0001-CASSANDRA-12961-Moving-getSTCSInL0CompactionCandidat.patch
>
>
> It's very likely that the check for L0 STCS candidates (if L0 is falling 
> behind) can be moved outside of the loop, or at very least made so that it's 
> not called on each loop iteration:
> {code}
> for (int i = generations.length - 1; i > 0; i--)
> {
> List sstables = getLevel(i);
> if (sstables.isEmpty())
> continue; // mostly this just avoids polluting the debug log 
> with zero scores
> // we want to calculate score excluding compacting ones
> Set sstablesInLevel = Sets.newHashSet(sstables);
> Set remaining = Sets.difference(sstablesInLevel, 
> cfs.getTracker().getCompacting());
> double score = (double) SSTableReader.getTotalBytes(remaining) / 
> (double)maxBytesForLevel(i, maxSSTableSizeInBytes);
> logger.trace("Compaction score for level {} is {}", i, score);
> if (score > 1.001)
> {
> // before proceeding with a higher level, let's see if L0 is 
> far enough behind to warrant STCS
> CompactionCandidate l0Compaction = 
> getSTCSInL0CompactionCandidate();
> if (l0Compaction != null)
> return l0Compaction;
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

cassandra git commit: Moving getSTCSInL0CompactionCandidate out of loop to prevent useless multiple retrievals

2017-10-02 Thread jjirsa

Repository: cassandra
Updated Branches:
  refs/heads/trunk 0c570c058 -> 48562536f


Moving getSTCSInL0CompactionCandidate out of loop to prevent useless multiple 
retrievals

Patch by Vusal Ahmadoglu; Reviewed by Jirsa for CASSANDRA-12961


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/48562536
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/48562536
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/48562536

Branch: refs/heads/trunk
Commit: 48562536f17a6e88aaf18d46e5ffa0a54c6b5be6
Parents: 0c570c0
Author: vusal-ahmadoglu 
Authored: Wed Sep 20 21:09:45 2017 +0200
Committer: Jeff Jirsa 
Committed: Mon Oct 2 17:50:13 2017 -0700

--
 CHANGES.txt  | 1 +
 .../org/apache/cassandra/db/compaction/LeveledManifest.java  | 8 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/48562536/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2f3b106..1304f34 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * LCS needlessly checks for L0 STCS candidates multiple times 
(CASSANDRA-12961)
  * Correctly close netty channels when a stream session ends (CASSANDRA-13905)
  * Update lz4 to 1.4.0 (CASSANDRA-13741)
  * Optimize Paxos prepare and propose stage for local requests 
(CASSANDRA-13862)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/48562536/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java 
b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
index bafb6ee..5d1169a 100644
--- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
+++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
@@ -345,6 +345,11 @@ public class LeveledManifest
 // This isn't a magic wand -- if you are consistently writing too fast 
for LCS to keep
 // up, you're still screwed.  But if instead you have intermittent 
bursts of activity,
 // it can help a lot.
+
+// Let's check that L0 is far enough behind to warrant STCS.
+// If it is, it will be used before proceeding any of higher level
+CompactionCandidate l0Compaction = getSTCSInL0CompactionCandidate();
+
 for (int i = generations.length - 1; i > 0; i--)
 {
 List sstables = getLevel(i);
@@ -359,7 +364,6 @@ public class LeveledManifest
 if (score > 1.001)
 {
 // before proceeding with a higher level, let's see if L0 is 
far enough behind to warrant STCS
-CompactionCandidate l0Compaction = 
getSTCSInL0CompactionCandidate();
 if (l0Compaction != null)
 return l0Compaction;
 
@@ -389,7 +393,7 @@ public class LeveledManifest
 // Since we don't have any other compactions to do, see if there 
is a STCS compaction to perform in L0; if
 // there is a long running compaction, we want to make sure that 
we continue to keep the number of SSTables
 // small in L0.
-return getSTCSInL0CompactionCandidate();
+return l0Compaction;
 }
 return new CompactionCandidate(candidates, getNextLevel(candidates), 
maxSSTableSizeInBytes);
 }


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13926) Starting and stopping quickly on Windows results in "port already in use" error

2017-10-02 Thread Jason Rust (JIRA)

Jason Rust created CASSANDRA-13926:
--

 Summary: Starting and stopping quickly on Windows results in "port 
already in use" error
 Key: CASSANDRA-13926
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13926
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
 Environment: Windows
Reporter: Jason Rust
Priority: Minor


If I stop/start Cassandra within a minute on Windows, using the included 
Powershell script it can fail to start with the error message "Found a port 
already in use. Aborting startup."

This is because the Powershell script uses netstat to find ports are in use, 
and even if Cassandra is stopped it is still listed for a short time (reported 
as TIME_WAIT). See 
https://superuser.com/questions/173535/what-are-close-wait-and-time-wait-states

A change to the Powershell script to ensure that only ESTABLISHED ports are 
searched solves the problem for me and involves changing from:
{code} if ($line -match "TCP" -and $line -match $portRegex){code}
to
{code} if ($line -match "TCP" -and $line -match $portRegex -and $line -match 
"ESTABLISHED"){code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13925) Add SERIAL and LOCAL_SERIAL support for cassandra-stress

2017-10-02 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13925:
---
Labels: tools  (was: )
Status: Patch Available  (was: Open)

> Add SERIAL and LOCAL_SERIAL support for cassandra-stress
> 
>
> Key: CASSANDRA-13925
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13925
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: tools
>
> Follow-up for {{CASSANDRA-7960: Add stress profile yaml with LWT}}
> As {{cassandra-stress}} supports LWT, but cannot set CL to SERIAL.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13925) Add SERIAL and LOCAL_SERIAL support for cassandra-stress

2017-10-02 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189051#comment-16189051
 ] 

Jay Zhuang commented on CASSANDRA-13925:


Here is the patch, please review:
| Branch | uTest |
| [13845-trunk|https://github.com/cooldoger/cassandra/tree/13925-trunk] | 
[!https://circleci.com/gh/cooldoger/cassandra/tree/13925-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/13925-trunk]
 |

Test command example:
{{$ tools/bin/cassandra-stress user profile=tools/cqlstress-lwt-example.yaml 
cl=LOCAL_SERIAL truncate=never ops\(insert=1\) n=2 no-warmup -rate threads=1}}

> Add SERIAL and LOCAL_SERIAL support for cassandra-stress
> 
>
> Key: CASSANDRA-13925
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13925
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> Follow-up for {{CASSANDRA-7960: Add stress profile yaml with LWT}}
> As {{cassandra-stress}} supports LWT, but cannot set CL to SERIAL.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13925) Add SERIAL and LOCAL_SERIAL support for cassandra-stress

2017-10-02 Thread Jay Zhuang (JIRA)

Jay Zhuang created CASSANDRA-13925:
--

 Summary: Add SERIAL and LOCAL_SERIAL support for cassandra-stress
 Key: CASSANDRA-13925
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13925
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jay Zhuang
Assignee: Jay Zhuang
Priority: Minor


Follow-up for {{CASSANDRA-7960: Add stress profile yaml with LWT}}
As {{cassandra-stress}} supports LWT, but cannot set CL to SERIAL.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-6936) Make all byte representations of types comparable by their unsigned byte representation only


[ 
https://issues.apache.org/jira/browse/CASSANDRA-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188967#comment-16188967
 ] 

Jeff Jirsa commented on CASSANDRA-6936:
---

Realize this ticket is mostly idle, but CASSANDRA-13553 has a similar need 
(byte order comparable types to map into rocksdb storage model)  and in that 
Dikang found this, whic seems interesting: https://github.com/ndimiduk/orderly



> Make all byte representations of types comparable by their unsigned byte 
> representation only
> 
>
> Key: CASSANDRA-6936
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6936
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Branimir Lambov
>  Labels: compaction, performance
> Fix For: 4.x
>
>
> This could be a painful change, but is necessary for implementing a 
> trie-based index, and settling for less would be suboptimal; it also should 
> make comparisons cheaper all-round, and since comparison operations are 
> pretty much the majority of C*'s business, this should be easily felt (see 
> CASSANDRA-6553 and CASSANDRA-6934 for an example of some minor changes with 
> major performance impacts). No copying/special casing/slicing should mean 
> fewer opportunities to introduce performance regressions as well.
> Since I have slated for 3.0 a lot of non-backwards-compatible sstable 
> changes, hopefully this shouldn't be too much more of a burden.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf

2017-10-02 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188968#comment-16188968
 ] 

Ariel Weisberg commented on CASSANDRA-13906:


I agree reference counting in Java is fraught due to the lack of Destructors 
and other plumbing.
 
So do we always expect the refcnt to be one or some known number? Then we 
should assert that at runtime and take an error path if it's not true (after 
releasing the resources) or maybe just do rate limited logging.

If use after free is a concern we might want to make sure the buffer is set to 
null so we get an NPE instead of a segfault. Then the  code incorrectly using 
the buffer will signal an error as well.

> Properly close StreamCompressionInputStream to release any ByteBuf
> --
>
> Key: CASSANDRA-13906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13906
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jason Brown
>Assignee: Jason Brown
>
> When running dtests for trunk (4.x) that perform some streaming, sometimes a 
> {{ByteBuf}} is not released properly, and we get this error in the logs 
> (causing the dtest to fail):
> {code}
> ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 
> Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's 
> garbage-collected. Enable advanced leak reporting to find out where the leak 
> occurred. To enable advanced leak reporting, specify the JVM option 
> '-Dio.netty.leakDetection.level=advanced' or call 
> ResourceLeakDetector.setLevel() See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13924) Continuous/Infectious Repair

2017-10-02 Thread Joseph Lynch (JIRA)

Joseph Lynch created CASSANDRA-13924:

Summary: Continuous/Infectious Repair
Key: CASSANDRA-13924
URL: https://issues.apache.org/jira/browse/CASSANDRA-13924
Project: Cassandra
Issue Type: Improvement
Components: Repair
Reporter: Joseph Lynch
Priority: Minor

I've been working on a way to keep data consistent without
scheduled/external/manual repair, because for large datasets repair is
extremely expensive. The basic gist is to introduce a new kind of hint that
keeps just the primary key of the mutation (indicating that PK needs repair)
and is recorded on replicas instead of coordinators during write time. Then a
periodic background task can issue read repairs to just the PKs that were
mutated. The initial performance degradation of this approach is non trivial,
but I believe that I can optimize it so that we are doing very little
additional work (see below in the design doc for some proposed optimizations).

My extremely rough proof of concept (uses a local table instead of HintStorage,
etc) so far is [in a
branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair]
and has a rough [design
document|https://github.com/jolynch/cassandra/blob/c597c0fc6415e00fa8db180be5034214d148822d/doc/source/architecture/continuous_repair.rst].
I'm working on getting benchmarks of the various optimizations, but I figured
I should start this ticket before I got too deep into it.

I believe this approach is particularly good for high read rate clusters
requiring consistent low latency, and for clusters that mutate a relatively
small proportion of their data (since you never have to read the whole dataset,
just what's being mutated). I view this as something that works _with_
incremental repair to reduce work required because with this technique we could
potentially flush repaired + unrepaired sstables directly from the memtable. I
also see this as something that would be enabled or disabled per table since it
is so use case specific (e.g. some tables don't need repair at all). I think
this is somewhat of a hybrid approach based on incremental repair, ticklers
(read all partitions @ ALL), mutation based repair (CASSANDRA-8911), and hinted
handoff. There are lots of tradeoffs, but I think it's worth talking about.

If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston],
[~aweisberg] I chatted with you guys a bit about this at NGCC; if you have time
I'd love to continue that conversation here.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13553) Map C* table schema to RocksDB key value data model


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-13553:
--
Summary: Map C* table schema to RocksDB key value data model  (was: Map C* 
Wide column to RocksDB key value data model)

> Map C* table schema to RocksDB key value data model
> ---
>
> Key: CASSANDRA-13553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13553
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> The goal for this ticket is to find a way to map Cassandra's wide column data 
> model to RocksDB's key value data model.
> To support most common C* queries on top of RocksDB, we plan to use this 
> strategy, for each row in Cassandra:
> 1. Encode Cassandra partition key + clustering keys into RocksDB key.
> 2. Encode rest of Cassandra columns into RocksDB value.
> With this approach, there are two major problems we need to solve:
> 1. After we encode C* keys into RocksDB key, we need to preserve the same 
> sorting order in RocksDB byte comparator, as in original data type.
> 2. Support timestamp, ttl, and tombestone on the values.
> To solve problem 1, we need to carefully design the encoding algorithm for 
> each data type. Fortunately, there are some existing libraries we can play 
> with, such as orderly (https://github.com/ndimiduk/orderly), which is used by 
> HBase. Or flatbuffer (https://github.com/google/flatbuffers)
> To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
> together with the values, and then use RocksDB's merge operator/compaction 
> filter to merge different version of data, and handle ttl/tombestones. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13754) BTree.Builder memory leak

2017-10-02 Thread Robert Stupp (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188904#comment-16188904
 ] 

Robert Stupp commented on CASSANDRA-13754:
--

I'll note that the fixed issue and what you're describing are probably two 
different things.
It might also be that the combination of recycling the btree-builders _and_ 
many cells in a partition. This is technically different from what's been fixed.
It would help a lot, if someone can come up with steps (ideally some code) to 
reproduce the issue as mat screenshots show that something happened but not why.

> BTree.Builder memory leak
> -
>
> Key: CASSANDRA-13754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13754
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15
>Reporter: Eric Evans
>Assignee: Robert Stupp
> Fix For: 3.11.1
>
> Attachments: cassandra_3.11.1_Recycler_memleak.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, Screenshot from 2017-09-11 
> 16-54-43.png, Screenshot from 2017-09-13 10-39-58.png
>
>
> After a chronic bout of {{OutOfMemoryError}} in our development environment, 
> a heap analysis is showing that more than 10G of our 12G heaps are consumed 
> by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) 
> of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances.  
> Reverting 
> [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54]
>  fixes the issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-13476) RocksDB based storage engine


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188902#comment-16188902
 ] 

Dikang Gu edited comment on CASSANDRA-13476 at 10/2/17 9:44 PM:


I gave an update of the progress on NGCC last week, the features we already 
supported at this moment include: 

* Most of non-nested data types
* Table schema
* Point query
* Range query
* Mutations
* Timestamp
* TTL
* Deletions/Cell tombstones
* Streaming

Features we do not support yet:
* Multi-partition query
* Nested data types
* Counters
* Range tombstone
* Materialized views
* Secondary indexes
* SASI
* Repair


was (Author: dikanggu):
I gave an update of the progress on NGCC last week, the features we already 
supported at this moment include: 

Features supported in V1:
* Most of non-nested data types
* Table schema
* Point query
* Range query
* Mutations
* Timestamp
* TTL
* Deletions/Cell tombstones
* Streaming

Features we do not support yet:
* Multi-partition query
* Nested data types
* Counters
* Range tombstone
* Materialized views
* Secondary indexes
* SASI
* Repair

> RocksDB based storage engine
> 
>
> Key: CASSANDRA-13476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13476
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>
> As I mentioned in CASSANDRA-13474, we got huge P99 read latency gain from the 
> RocksDB integration experiment.
> After we make the existing storage engine to be pluggable, we want to 
> implement a RocksDB based storage engine, which can support existing 
> Cassandra data model, and provide better and more predictable performance.
> The effort will include but not limited to:
> 1. Wide column support on RocksDB
> 2. Streaming support on RocksDB
> 3. RocksDB instances management
> 4. Nodetool support
> 5. Metrics and monitoring
> 6. Counter support on RocksDB



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13553) Map C* table schema to RocksDB key value data model


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188906#comment-16188906
 ] 

Dikang Gu commented on CASSANDRA-13553:
---

[~doanduyhai], we do not need to do read-before-write, any mutation will be a 
new RocksDB row, and we will merge the data in read path or during compaction, 
through RocksDB's merge operator.

> Map C* table schema to RocksDB key value data model
> ---
>
> Key: CASSANDRA-13553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13553
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> The goal for this ticket is to find a way to map Cassandra's table data model 
> to RocksDB's key value data model.
> To support most common C* queries on top of RocksDB, we plan to use this 
> strategy, for each row in Cassandra:
> 1. Encode Cassandra partition key + clustering keys into RocksDB key.
> 2. Encode rest of Cassandra columns into RocksDB value.
> With this approach, there are two major problems we need to solve:
> 1. After we encode C* keys into RocksDB key, we need to preserve the same 
> sorting order in RocksDB byte comparator, as in original data type.
> 2. Support timestamp, ttl, and tombestone on the values.
> To solve problem 1, we need to carefully design the encoding algorithm for 
> each data type. Fortunately, there are some existing libraries we can play 
> with, such as orderly (https://github.com/ndimiduk/orderly), which is used by 
> HBase. Or flatbuffer (https://github.com/google/flatbuffers)
> To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
> together with the values, and then use RocksDB's merge operator/compaction 
> filter to merge different version of data, and handle ttl/tombestones. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13476) RocksDB based storage engine


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188902#comment-16188902
 ] 

Dikang Gu commented on CASSANDRA-13476:
---

I gave an update of the progress on NGCC last week, the features we already 
supported at this moment include: 

Features supported in V1:
* Most of non-nested data types
* Table schema
* Point query
* Range query
* Mutations
* Timestamp
* TTL
* Deletions/Cell tombstones
* Streaming

Features we do not support yet:
* Multi-partition query
* Nested data types
* Counters
* Range tombstone
* Materialized views
* Secondary indexes
* SASI
* Repair

> RocksDB based storage engine
> 
>
> Key: CASSANDRA-13476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13476
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>
> As I mentioned in CASSANDRA-13474, we got huge P99 read latency gain from the 
> RocksDB integration experiment.
> After we make the existing storage engine to be pluggable, we want to 
> implement a RocksDB based storage engine, which can support existing 
> Cassandra data model, and provide better and more predictable performance.
> The effort will include but not limited to:
> 1. Wide column support on RocksDB
> 2. Streaming support on RocksDB
> 3. RocksDB instances management
> 4. Nodetool support
> 5. Metrics and monitoring
> 6. Counter support on RocksDB



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13553) Map C* table schema to RocksDB key value data model


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-13553:
--
Description: 
The goal for this ticket is to find a way to map Cassandra's table data model 
to RocksDB's key value data model.

To support most common C* queries on top of RocksDB, we plan to use this 
strategy, for each row in Cassandra:
1. Encode Cassandra partition key + clustering keys into RocksDB key.
2. Encode rest of Cassandra columns into RocksDB value.

With this approach, there are two major problems we need to solve:
1. After we encode C* keys into RocksDB key, we need to preserve the same 
sorting order in RocksDB byte comparator, as in original data type.
2. Support timestamp, ttl, and tombestone on the values.

To solve problem 1, we need to carefully design the encoding algorithm for each 
data type. Fortunately, there are some existing libraries we can play with, 
such as orderly (https://github.com/ndimiduk/orderly), which is used by HBase. 
Or flatbuffer (https://github.com/google/flatbuffers)

To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
together with the values, and then use RocksDB's merge operator/compaction 
filter to merge different version of data, and handle ttl/tombestones. 

  was:
The goal for this ticket is to find a way to map Cassandra's wide column data 
model to RocksDB's key value data model.

To support most common C* queries on top of RocksDB, we plan to use this 
strategy, for each row in Cassandra:
1. Encode Cassandra partition key + clustering keys into RocksDB key.
2. Encode rest of Cassandra columns into RocksDB value.

With this approach, there are two major problems we need to solve:
1. After we encode C* keys into RocksDB key, we need to preserve the same 
sorting order in RocksDB byte comparator, as in original data type.
2. Support timestamp, ttl, and tombestone on the values.

To solve problem 1, we need to carefully design the encoding algorithm for each 
data type. Fortunately, there are some existing libraries we can play with, 
such as orderly (https://github.com/ndimiduk/orderly), which is used by HBase. 
Or flatbuffer (https://github.com/google/flatbuffers)

To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
together with the values, and then use RocksDB's merge operator/compaction 
filter to merge different version of data, and handle ttl/tombestones. 


> Map C* table schema to RocksDB key value data model
> ---
>
> Key: CASSANDRA-13553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13553
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> The goal for this ticket is to find a way to map Cassandra's table data model 
> to RocksDB's key value data model.
> To support most common C* queries on top of RocksDB, we plan to use this 
> strategy, for each row in Cassandra:
> 1. Encode Cassandra partition key + clustering keys into RocksDB key.
> 2. Encode rest of Cassandra columns into RocksDB value.
> With this approach, there are two major problems we need to solve:
> 1. After we encode C* keys into RocksDB key, we need to preserve the same 
> sorting order in RocksDB byte comparator, as in original data type.
> 2. Support timestamp, ttl, and tombestone on the values.
> To solve problem 1, we need to carefully design the encoding algorithm for 
> each data type. Fortunately, there are some existing libraries we can play 
> with, such as orderly (https://github.com/ndimiduk/orderly), which is used by 
> HBase. Or flatbuffer (https://github.com/google/flatbuffers)
> To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
> together with the values, and then use RocksDB's merge operator/compaction 
> filter to merge different version of data, and handle ttl/tombestones. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13923) Flushers blocked due to many SSTables

2017-10-02 Thread Dan Kinder (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188831#comment-16188831
 ] 

Dan Kinder commented on CASSANDRA-13923:


>From my own poking around, 
>https://issues.apache.org/jira/browse/CASSANDRA-9882 seemed related.

> Flushers blocked due to many SSTables
> -
>
> Key: CASSANDRA-13923
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13923
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction, Local Write-Read Paths
> Environment: Cassandra 3.11.0
> Centos 6 (downgraded JNA)
> 64GB RAM
> 12-disk JBOD
>Reporter: Dan Kinder
> Attachments: cassandra-jstack-readstage.txt, cassandra-jstack.txt
>
>
> This started on the mailing list and I'm not 100% sure of the root cause, 
> feel free to re-title if needed.
> I just upgraded Cassandra from 2.2.6 to 3.11.0. Within a few hours of serving 
> traffic, thread pools begin to back up and grow pending tasks indefinitely. 
> This happens to multiple different stages (Read, Mutation) and consistently 
> builds pending tasks for MemtablePostFlush and MemtableFlushWriter.
> Using jstack shows that there is blocking going on when trying to call 
> getCompactionCandidates, which seems to happen on flush. We have fairly large 
> nodes that have ~15,000 SSTables per node, all LCS.
> I seems like this can cause reads to get blocked because they try to acquire 
> a read lock when calling shouldDefragment.
> And writes, of course, block once we can't allocate anymore memtables, 
> because flushes are backed up.
> We did not have this problem in 2.2.6, so it seems like there is some 
> regression causing it to be incredibly slow trying to do calls like 
> getCompactionCandidates that list out the SSTables.
> In our case this causes nodes to build up pending tasks and simply stop 
> responding to requests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13923) Flushers blocked due to many SSTables

2017-10-02 Thread Dan Kinder (JIRA)

Dan Kinder created CASSANDRA-13923:
--

 Summary: Flushers blocked due to many SSTables
 Key: CASSANDRA-13923
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13923
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction, Local Write-Read Paths
 Environment: Cassandra 3.11.0
Centos 6 (downgraded JNA)
64GB RAM
12-disk JBOD
Reporter: Dan Kinder
 Attachments: cassandra-jstack-readstage.txt, cassandra-jstack.txt

This started on the mailing list and I'm not 100% sure of the root cause, feel 
free to re-title if needed.

I just upgraded Cassandra from 2.2.6 to 3.11.0. Within a few hours of serving 
traffic, thread pools begin to back up and grow pending tasks indefinitely. 
This happens to multiple different stages (Read, Mutation) and consistently 
builds pending tasks for MemtablePostFlush and MemtableFlushWriter.

Using jstack shows that there is blocking going on when trying to call 
getCompactionCandidates, which seems to happen on flush. We have fairly large 
nodes that have ~15,000 SSTables per node, all LCS.

I seems like this can cause reads to get blocked because they try to acquire a 
read lock when calling shouldDefragment.

And writes, of course, block once we can't allocate anymore memtables, because 
flushes are backed up.

We did not have this problem in 2.2.6, so it seems like there is some 
regression causing it to be incredibly slow trying to do calls like 
getCompactionCandidates that list out the SSTables.

In our case this causes nodes to build up pending tasks and simply stop 
responding to requests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-13754) BTree.Builder memory leak

2017-10-02 Thread Thomas Steinmaurer (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188758#comment-16188758
 ] 

Thomas Steinmaurer edited comment on CASSANDRA-13754 at 10/2/17 8:28 PM:
-

72hrs heap utilization increase with 3.11.1 snapshot build from Sept. 25, 2017 
+ cluster rolling restart marker => 
cassandra_3.11.1_snapshot_heaputilization.png


was (Author: tsteinmaurer):
72hrs heap utilization increase with 3.11.1 snapshot build from Sept. 25, 2017 
+ cluster rolling restart marker: 
!cassandra_3.11.1_snapshot_heaputilization.png|thumbnail!

> BTree.Builder memory leak
> -
>
> Key: CASSANDRA-13754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13754
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15
>Reporter: Eric Evans
>Assignee: Robert Stupp
> Fix For: 3.11.1
>
> Attachments: cassandra_3.11.1_Recycler_memleak.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, Screenshot from 2017-09-11 
> 16-54-43.png, Screenshot from 2017-09-13 10-39-58.png
>
>
> After a chronic bout of {{OutOfMemoryError}} in our development environment, 
> a heap analysis is showing that more than 10G of our 12G heaps are consumed 
> by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) 
> of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances.  
> Reverting 
> [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54]
>  fixes the issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13754) BTree.Builder memory leak

2017-10-02 Thread Thomas Steinmaurer (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Steinmaurer updated CASSANDRA-13754:
---
Attachment: cassandra_3.11.1_snapshot_heaputilization.png

72hrs heap utilization increase with 3.11.1 snapshot build from Sept. 25, 2017 
+ cluster rolling restart marker: 
!cassandra_3.11.1_snapshot_heaputilization.png|thumbnail!

> BTree.Builder memory leak
> -
>
> Key: CASSANDRA-13754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13754
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15
>Reporter: Eric Evans
>Assignee: Robert Stupp
> Fix For: 3.11.1
>
> Attachments: cassandra_3.11.1_Recycler_memleak.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, Screenshot from 2017-09-11 
> 16-54-43.png, Screenshot from 2017-09-13 10-39-58.png
>
>
> After a chronic bout of {{OutOfMemoryError}} in our development environment, 
> a heap analysis is showing that more than 10G of our 12G heaps are consumed 
> by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) 
> of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances.  
> Reverting 
> [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54]
>  fixes the issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance

2017-10-02 Thread Ben Bromhead (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188753#comment-16188753
 ] 

Ben Bromhead commented on CASSANDRA-13910:
--

read_repair_chance has caused more pain that it's solved imho. 

* Inexperienced users sometimes increase this value to insane values when 
regular repairs are not working, or they are not doing the right thing with 
their CL. 
* Doesn't ever make a significant difference in consistency given most real 
world workload and query distributions - this generally only ends up triggering 
on the hottest partitions and as mentioned in the issue description will get 
hit by read_repair anyway. 

The only good thing about read_repair_chance is it's something you can turn off 
on a cluster that is under load pressure. This backed up by my many experiences 
of turning it off :)

+1 to creating a thread on @dev / @user for feed back 

> Consider deprecating (then removing) 
> read_repair_chance/dclocal_read_repair_chance
> --
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Priority: Minor
>  Labels: CommunityFeedbackRequested
>
> First, let me clarify so this is not misunderstood that I'm not *at all* 
> suggesting to remove the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> have never been about _enabling_ that mechanism, they are about querying all 
> replicas (even when this is not required by the consistency level) for the 
> sole purpose of maybe read-repairing some of the replica that wouldn't have 
> been queried otherwise. Which btw, bring me to reason 1 for considering their 
> removal: their naming/behavior is super confusing. Over the years, I've seen 
> countless users (and not only newbies) misunderstanding what those options 
> do, and as a consequence misunderstand when read-repair itself was happening.
> But my 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those option 
> kick in, what you trade-off is additional resources consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replica _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stay dead for a long time so that hints ends up timing-out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrapping it).  Read-repair probably don't fix _that_ much stuff in 
> the first place.
> # again, read-repair do happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same.  Just a tiny bit less quickly.
> # I suspect almost everyone use a low "chance" for those options at best 
> (because the extra resources consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really make sense. Don't get me wrong, those options had their places a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefits now.
> And I think it's sane to reconsider stuffs every once in a while, and to 
> clean up anything that may not make all that much sense anymore, which I 
> think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best and 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance).
> Lastly, if the consensus here ends up being that they can have their use in 
> weird case and that we fill supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> totally by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13476) RocksDB based storage engine

2017-10-02 Thread DOAN DuyHai (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188636#comment-16188636
 ] 

DOAN DuyHai commented on CASSANDRA-13476:
-

Thanks [~jjirsa]

 The bleeding-edge part of me get very excited about new storage engine

 The conservative part of me is afraid about the impact on production, 
therefore my list of questions

> RocksDB based storage engine
> 
>
> Key: CASSANDRA-13476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13476
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>
> As I mentioned in CASSANDRA-13474, we got huge P99 read latency gain from the 
> RocksDB integration experiment.
> After we make the existing storage engine to be pluggable, we want to 
> implement a RocksDB based storage engine, which can support existing 
> Cassandra data model, and provide better and more predictable performance.
> The effort will include but not limited to:
> 1. Wide column support on RocksDB
> 2. Streaming support on RocksDB
> 3. RocksDB instances management
> 4. Nodetool support
> 5. Metrics and monitoring
> 6. Counter support on RocksDB



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13553) Map C* Wide column to RocksDB key value data model

2017-10-02 Thread DOAN DuyHai (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188632#comment-16188632
 ] 

DOAN DuyHai commented on CASSANDRA-13553:
-

I like the idea of having special encoding to preserve the orders of clustering 
columns. Neat!

About point 2 of storing all normal columns of C* into RocksDB value, how would 
we deal with CQL mutation on a single column of the row ? Would it require 
necessarily a read-before-write or do we have smarter alternatives ?

If the read-before-write is not avoidable, maybe clearly document the 
performance trade-off when using RockDBs for write so that users don't get 
surprised.

> Map C* Wide column to RocksDB key value data model
> --
>
> Key: CASSANDRA-13553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13553
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> The goal for this ticket is to find a way to map Cassandra's wide column data 
> model to RocksDB's key value data model.
> To support most common C* queries on top of RocksDB, we plan to use this 
> strategy, for each row in Cassandra:
> 1. Encode Cassandra partition key + clustering keys into RocksDB key.
> 2. Encode rest of Cassandra columns into RocksDB value.
> With this approach, there are two major problems we need to solve:
> 1. After we encode C* keys into RocksDB key, we need to preserve the same 
> sorting order in RocksDB byte comparator, as in original data type.
> 2. Support timestamp, ttl, and tombestone on the values.
> To solve problem 1, we need to carefully design the encoding algorithm for 
> each data type. Fortunately, there are some existing libraries we can play 
> with, such as orderly (https://github.com/ndimiduk/orderly), which is used by 
> HBase. Or flatbuffer (https://github.com/google/flatbuffers)
> To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
> together with the values, and then use RocksDB's merge operator/compaction 
> filter to merge different version of data, and handle ttl/tombestones. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

cassandra git commit: Add link to driver for the Dart programming language

2017-10-02 Thread jjirsa

Repository: cassandra
Updated Branches:
  refs/heads/trunk 694b3c401 -> 0c570c058


Add link to driver for the Dart programming language

Closes #153

Patch by Achilleas Anagnostopoulos; Reviewed by Jeff Jirsa


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0c570c05
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0c570c05
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0c570c05

Branch: refs/heads/trunk
Commit: 0c570c0585e5177f8a0298cd21136d44a8842ea6
Parents: 694b3c4
Author: Achilleas Anagnostopoulos 
Authored: Thu Sep 28 11:02:02 2017 +0100
Committer: Jeff Jirsa 
Committed: Mon Oct 2 11:44:12 2017 -0700

--
 doc/source/getting_started/drivers.rst | 5 +
 1 file changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0c570c05/doc/source/getting_started/drivers.rst
--
diff --git a/doc/source/getting_started/drivers.rst 
b/doc/source/getting_started/drivers.rst
index c8d2e1f..9a2c156 100644
--- a/doc/source/getting_started/drivers.rst
+++ b/doc/source/getting_started/drivers.rst
@@ -116,3 +116,8 @@ Elixir
 
 - `Xandra `__
 - `CQEx `__
+
+Dart
+
+
+- `dart_cassandra_cql `__


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13918) Header only commit logs should be filtered before recovery


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13918:
---
Fix Version/s: (was: 3.11.x)
   3.11.1

> Header only commit logs should be filtered before recovery
> --
>
> Key: CASSANDRA-13918
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13918
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> Commit log recovery will tolerate commit log truncation in the most recent 
> log file found on disk, but will abort startup if problems are detected in 
> others. 
> Since we allocate commit log segments before they're used though, it's 
> possible to get into a state where the last commit log file actually written 
> to is not the same file that was most recently allocated, preventing startup 
> for what should otherwise be allowable incomplete final segments.
> Excluding header only files on recovery should prevent this from happening.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13553) Map C* Wide column to RocksDB key value data model


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188516#comment-16188516
 ] 

Aleksey Yeschenko commented on CASSANDRA-13553:
---

Don't forget row and range tombstones, too.

> Map C* Wide column to RocksDB key value data model
> --
>
> Key: CASSANDRA-13553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13553
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> The goal for this ticket is to find a way to map Cassandra's wide column data 
> model to RocksDB's key value data model.
> To support most common C* queries on top of RocksDB, we plan to use this 
> strategy, for each row in Cassandra:
> 1. Encode Cassandra partition key + clustering keys into RocksDB key.
> 2. Encode rest of Cassandra columns into RocksDB value.
> With this approach, there are two major problems we need to solve:
> 1. After we encode C* keys into RocksDB key, we need to preserve the same 
> sorting order in RocksDB byte comparator, as in original data type.
> 2. Support timestamp, ttl, and tombestone on the values.
> To solve problem 1, we need to carefully design the encoding algorithm for 
> each data type. Fortunately, there are some existing libraries we can play 
> with, such as orderly (https://github.com/ndimiduk/orderly), which is used by 
> HBase. Or flatbuffer (https://github.com/google/flatbuffers)
> To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone 
> together with the values, and then use RocksDB's merge operator/compaction 
> filter to merge different version of data, and handle ttl/tombestones. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13476) RocksDB based storage engine


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188509#comment-16188509
 ] 

Jeff Jirsa commented on CASSANDRA-13476:


{quote} how can we support all the CQL semantics (like collections, clustering, 
TTL, LWT, SASI, MV, UDF/UDA ...) {quote}

[~doanduyhai] - for clustering, there's a blocking sub-ticket specifically 
addressing it CASSANDRA-13553



> RocksDB based storage engine
> 
>
> Key: CASSANDRA-13476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13476
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>
> As I mentioned in CASSANDRA-13474, we got huge P99 read latency gain from the 
> RocksDB integration experiment.
> After we make the existing storage engine to be pluggable, we want to 
> implement a RocksDB based storage engine, which can support existing 
> Cassandra data model, and provide better and more predictable performance.
> The effort will include but not limited to:
> 1. Wide column support on RocksDB
> 2. Streaming support on RocksDB
> 3. RocksDB instances management
> 4. Nodetool support
> 5. Metrics and monitoring
> 6. Counter support on RocksDB



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13476) RocksDB based storage engine

2017-10-02 Thread DOAN DuyHai (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188503#comment-16188503
 ] 

DOAN DuyHai commented on CASSANDRA-13476:
-

A stupid question but when using RockDB, how can we support all the CQL 
semantics (like collections, clustering, TTL, LWT, SASI, MV, UDF/UDA ...) ?

I don't feel like having different storage engines supporting different subsets 
of CQL features, ending up with a matrix of supported/unsupported features.

Also, I guess that implementing some features like collections or counters will 
yield different performance characteristics when switching between different 
storage engines. It would be nice that it is clearly indicated not to get the 
users by surprise

Also, compaction strategies will also be per-storage engine specific right ? 

> RocksDB based storage engine
> 
>
> Key: CASSANDRA-13476
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13476
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>
> As I mentioned in CASSANDRA-13474, we got huge P99 read latency gain from the 
> RocksDB integration experiment.
> After we make the existing storage engine to be pluggable, we want to 
> implement a RocksDB based storage engine, which can support existing 
> Cassandra data model, and provide better and more predictable performance.
> The effort will include but not limited to:
> 1. Wide column support on RocksDB
> 2. Streaming support on RocksDB
> 3. RocksDB instances management
> 4. Nodetool support
> 5. Metrics and monitoring
> 6. Counter support on RocksDB



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[cassandra] Git Push Summary

2017-10-02 Thread mshuler

Repository: cassandra
Updated Tags:  refs/tags/3.11.1-tentative [created] 983c72a84

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[cassandra] Git Push Summary

2017-10-02 Thread mshuler

Repository: cassandra
Updated Tags:  refs/tags/3.0.15-tentative [created] b32a9e645

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13797) RepairJob blocks on syncTasks


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188407#comment-16188407
 ] 

Jeff Jirsa commented on CASSANDRA-13797:


Please keep an eye on fixvers.


> RepairJob blocks on syncTasks
> -
>
> Key: CASSANDRA-13797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
> Project: Cassandra
>  Issue Type: Bug
>  Components: Repair
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> The thread running {{RepairJob}} blocks while it waits for the validations it 
> starts to complete ([see 
> here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
>  However, the downstream callbacks (ie: the post-repair cleanup stuff) aren't 
> waiting for {{RepairJob#run}} to return, they're waiting for a result to be 
> set on RepairJob the future, which happens after the sync tasks have 
> completed. This post repair cleanup stuff also immediately shuts down the 
> executor {{RepairJob#run}} is running in. So in noop repair sessions, where 
> there's nothing to stream, I'm seeing the callbacks sometimes fire before 
> {{RepairJob#run}} wakes up, and causing an {{InterruptedException}} is thrown.
> I'm pretty sure this can just be removed, but I'd like a second opinion. This 
> appears to just be a holdover from before repair coordination became async. I 
> thought it might be doing some throttling by blocking, but each repair 
> session gets it's own executor, and validation is  throttled by the fixed 
> size executors doing the actual work of validation, so I don't think we need 
> to keep this around.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13797) RepairJob blocks on syncTasks


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13797:
---
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 3.0.x)
   4.0
   3.11.1
   3.0.15

> RepairJob blocks on syncTasks
> -
>
> Key: CASSANDRA-13797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
> Project: Cassandra
>  Issue Type: Bug
>  Components: Repair
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> The thread running {{RepairJob}} blocks while it waits for the validations it 
> starts to complete ([see 
> here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
>  However, the downstream callbacks (ie: the post-repair cleanup stuff) aren't 
> waiting for {{RepairJob#run}} to return, they're waiting for a result to be 
> set on RepairJob the future, which happens after the sync tasks have 
> completed. This post repair cleanup stuff also immediately shuts down the 
> executor {{RepairJob#run}} is running in. So in noop repair sessions, where 
> there's nothing to stream, I'm seeing the callbacks sometimes fire before 
> {{RepairJob#run}} wakes up, and causing an {{InterruptedException}} is thrown.
> I'm pretty sure this can just be removed, but I'd like a second opinion. This 
> appears to just be a holdover from before repair coordination became async. I 
> thought it might be doing some throttling by blocking, but each repair 
> session gets it's own executor, and validation is  throttled by the fixed 
> size executors doing the actual work of validation, so I don't think we need 
> to keep this around.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13797) RepairJob blocks on syncTasks


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13797:
---
Component/s: Repair

> RepairJob blocks on syncTasks
> -
>
> Key: CASSANDRA-13797
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
> Project: Cassandra
>  Issue Type: Bug
>  Components: Repair
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> The thread running {{RepairJob}} blocks while it waits for the validations it 
> starts to complete ([see 
> here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
>  However, the downstream callbacks (ie: the post-repair cleanup stuff) aren't 
> waiting for {{RepairJob#run}} to return, they're waiting for a result to be 
> set on RepairJob the future, which happens after the sync tasks have 
> completed. This post repair cleanup stuff also immediately shuts down the 
> executor {{RepairJob#run}} is running in. So in noop repair sessions, where 
> there's nothing to stream, I'm seeing the callbacks sometimes fire before 
> {{RepairJob#run}} wakes up, and causing an {{InterruptedException}} is thrown.
> I'm pretty sure this can just be removed, but I'd like a second opinion. This 
> appears to just be a holdover from before repair coordination became async. I 
> thought it might be doing some throttling by blocking, but each repair 
> session gets it's own executor, and validation is  throttled by the fixed 
> size executors doing the actual work of validation, so I don't think we need 
> to keep this around.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Resolved] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-13892.
-
Resolution: Won't Fix

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Reopened] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reopened CASSANDRA-13892:
-

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Resolved] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson resolved CASSANDRA-13892.
-
   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 3.0.x)

bq.  I'm not certain it's important enough to warrant committing to 3.0
agreed

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Comment Edited] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188276#comment-16188276
 ] 

Aleksey Yeschenko edited comment on CASSANDRA-13892 at 10/2/17 3:09 PM:


I would probably write the second commit as
{code}
values = Arrays.copyOf(values, Math.max(count, 1) * 2);
{code}

It's slightly less branchy (sometimes) and never creates an array with one 
element, which just wastes 4 bytes to padding with compressed OOPS enabled.

That said,
1. I'm not certain it's important enough to warrant committing to 3.0. The risk 
is low, so I don't mind, but in spirit it might not belong to a stabilisation 
release. Will leave the decision up to you.

2. What do you do with 3.11 and trunk? B/c of recycling, providing 
{{initialCapacity}} doesn't quite work and fixing it requires some refactoring, 
which warrants extra review.

Edit: 3.0 version LGTM though, if you decide to proceed with it.


was (Author: iamaleksey):
I would probably write the second commit as
{code}
values = Arrays.copyOf(values, Math.max(count, 1) * 2);
{code}

It's slightly less branchy (sometimes) and never creates an array with one 
element, which just wastes 4 bytes to padding with compressed OOPS enabled.

That said,
1. I'm not certain it's important enough to warrant committing to 3.0. The risk 
is low, so I don't mind, but in spirit it might not belong to a stabilisation 
release. Will leave the decision up to you.

2. What do you do with 3.11 and trunk? B/c of recycling, providing 
{{initialCapacity}} doesn't quite work and fixing it requires some refactoring, 
which warrants extra review.

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13892:
--
Status: Open  (was: Patch Available)

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188276#comment-16188276
 ] 

Aleksey Yeschenko commented on CASSANDRA-13892:
---

I would probably write the second commit as
{code}
values = Arrays.copyOf(values, Math.max(count, 1) * 2);
{code}

It's slightly less branchy (sometimes) and never creates an array with one 
element, which just wastes 4 bytes to padding with compressed OOPS enabled.

That said,
1. I'm not certain it's important enough to warrant committing to 3.0. The risk 
is low, so I don't mind, but in spirit it might not belong to a stabilisation 
release. Will leave the decision up to you.

2. What do you do with 3.11 and trunk? B/c of recycling, providing 
{{initialCapacity}} doesn't quite work and fixing it requires some refactoring, 
which warrants extra review.

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13892:
--
Priority: Minor  (was: Major)

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13892) Make BTree.Builder use the initialCapacity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13892:
--
Issue Type: Improvement  (was: Bug)

> Make BTree.Builder use the initialCapacity
> --
>
> Key: CASSANDRA-13892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13892
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> The initialCapacity passed to {{BTree.builder(comparator, initialCapacity)}} 
> is unused, start using it



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13922) nodetool verify should also verify sstable metadata


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13922:

Status: Patch Available  (was: Open)

https://github.com/krummas/cassandra/commits/marcuse/13922
https://github.com/krummas/cassandra/commits/marcuse/13922-3.11
https://github.com/krummas/cassandra/commits/marcuse/13922-trunk

https://circleci.com/gh/krummas/cassandra/134
https://circleci.com/gh/krummas/cassandra/132
https://circleci.com/gh/krummas/cassandra/135

https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/355/

> nodetool verify should also verify sstable metadata
> ---
>
> Key: CASSANDRA-13922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13922
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> nodetool verify should also try to deserialize the sstable metadata (and once 
> CASSANDRA-13321 makes it in, verify the checksums)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar

2017-10-02 Thread Amitkumar Ghatwal (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187883#comment-16187883
 ] 

Amitkumar Ghatwal commented on CASSANDRA-13741:
---

thanks !!!

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.0
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to maven central . 
> Please replace in mainline the current version ( 1.3.0) with the latest one ( 
> 1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-13922) nodetool verify should also verify sstable metadata

Marcus Eriksson created CASSANDRA-13922:
---

 Summary: nodetool verify should also verify sstable metadata
 Key: CASSANDRA-13922
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13922
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
 Fix For: 3.0.x, 3.11.x, 4.x


nodetool verify should also try to deserialize the sstable metadata (and once 
CASSANDRA-13321 makes it in, verify the checksums)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Updated] (CASSANDRA-13909) Improve TRUNCATE performance with many sstables


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13909:

   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 3.0.x)
   4.0
   3.11.1
   3.0.15
   Status: Resolved  (was: Ready to Commit)

Thanks for the review, didn't see anything suspicious in the test results, 
committed as {{b32a9e6452c78e6ad08e371314bf1ab7492d0773}} to 3.0 and merged up

> Improve TRUNCATE performance with many sstables
> ---
>
> Key: CASSANDRA-13909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13909
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.15, 3.11.1, 4.0
>
>
> Truncate is very slow in 3.0, mostly due to {{LogRecord.make}} listing all 
> files in the directory for every sstable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/983c72a8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/983c72a8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/983c72a8

Branch: refs/heads/cassandra-3.11
Commit: 983c72a84ab6628e09a78ead9e20a0c323a005af
Parents: 9d56132 b32a9e6
Author: Marcus Eriksson 
Authored: Mon Oct 2 09:34:49 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Oct 2 09:34:49 2017 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Helpers.java  | 15 +
 .../apache/cassandra/db/lifecycle/LogFile.java  | 26 
 .../cassandra/db/lifecycle/LogRecord.java   | 65 +++-
 .../cassandra/db/lifecycle/LogTransaction.java  | 16 +
 .../apache/cassandra/db/lifecycle/Tracker.java  |  2 +-
 .../cassandra/db/lifecycle/HelpersTest.java | 58 -
 7 files changed, 180 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/983c72a8/CHANGES.txt
--
diff --cc CHANGES.txt
index 264887b,d6423b4..aca219e
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,5 +1,18 @@@
 -3.0.15
 +3.11.1
 + * Fix the computation of cdc_total_space_in_mb for exabyte filesystems 
(CASSANDRA-13808)
 + * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is 
a single leaf and overflow collisions (CASSANDRA-13869)
 + * Add a compaction option to TWCS to ignore sstables overlapping checks 
(CASSANDRA-13418)
 + * BTree.Builder memory leak (CASSANDRA-13754)
 + * Revert CASSANDRA-10368 of supporting non-pk column filtering due to 
correctness (CASSANDRA-13798)
 + * Add a skip read validation flag to cassandra-stress (CASSANDRA-13772)
 + * Fix cassandra-stress hang issues when an error during cluster connection 
happens (CASSANDRA-12938)
 + * Better bootstrap failure message when blocked by (potential) range 
movement (CASSANDRA-13744)
 + * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
 + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
 + * Duplicate the buffer before passing it to analyser in SASI operation 
(CASSANDRA-13512)
 + * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 +Merged from 3.0:
+  * Improve TRUNCATE performance (CASSANDRA-13909)
   * Implement short read protection on partition boundaries (CASSANDRA-13595)
   * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries 
(CASSANDRA-13911)
   * Filter header only commit logs before recovery (CASSANDRA-13918)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/983c72a8/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
--
diff --cc src/java/org/apache/cassandra/db/lifecycle/LogFile.java
index 9691ee9,be26163..123dd8a
--- a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
@@@ -285,6 -285,25 +286,26 @@@ final class LogFile implements AutoClos
  throw new IllegalStateException();
  }
  
+ public void addAll(Type type, Iterable toBulkAdd)
+ {
+ for (LogRecord record : makeRecords(type, toBulkAdd))
+ if (!addRecord(record))
+ throw new IllegalStateException();
+ }
+ 
+ private Collection makeRecords(Type type, 
Iterable tables)
+ {
+ assert type == Type.ADD || type == Type.REMOVE;
+ 
+ for (SSTableReader sstable : tables)
+ {
 -File folder = sstable.descriptor.directory;
 -replicas.maybeCreateReplica(folder, getFileName(folder), records);
++File directory = sstable.descriptor.directory;
++String fileName = StringUtils.join(directory, File.separator, 
getFileName());
++replicas.maybeCreateReplica(directory, fileName, records);
+ }
+ return LogRecord.make(type, tables);
+ }
+ 
  private LogRecord makeRecord(Type type, SSTable table)
  {
  assert type == Type.ADD || type == Type.REMOVE;
@@@ -417,15 -417,26 +438,20 @@@
  return replicas.getFilePaths();
  }
  
 -private String getFileName(File folder)
 -{
 -String fileName = StringUtils.join(BigFormat.latestVersion,
 -   LogFile.SEP,
 -   "txn",
 -   LogFile.SEP,
 -   type.fileName,
 -   LogFile.SEP,
 -   id.toString(),
 -

[2/6] cassandra git commit: Improve TRUNCATE performance

Improve TRUNCATE performance

Patch by marcuse; reviewed by Stefania Alborghetti for CASSANDRA-13909


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b32a9e64
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b32a9e64
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b32a9e64

Branch: refs/heads/cassandra-3.11
Commit: b32a9e6452c78e6ad08e371314bf1ab7492d0773
Parents: 15cee48
Author: Marcus Eriksson 
Authored: Mon Sep 25 14:44:37 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Oct 2 09:29:22 2017 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Helpers.java  | 15 +
 .../apache/cassandra/db/lifecycle/LogFile.java  | 25 
 .../cassandra/db/lifecycle/LogRecord.java   | 65 +++-
 .../cassandra/db/lifecycle/LogTransaction.java  | 16 +
 .../apache/cassandra/db/lifecycle/Tracker.java  |  2 +-
 .../cassandra/db/lifecycle/HelpersTest.java | 58 -
 7 files changed, 179 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4a45469..d6423b4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Improve TRUNCATE performance (CASSANDRA-13909)
  * Implement short read protection on partition boundaries (CASSANDRA-13595)
  * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries 
(CASSANDRA-13911)
  * Filter header only commit logs before recovery (CASSANDRA-13918)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Helpers.java 
b/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
index f9555f4..b9adc4b 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
@@ -141,6 +141,21 @@ class Helpers
 return accumulate;
 }
 
+static Throwable prepareForBulkObsoletion(Iterable readers, 
LogTransaction txnLogs, List obsoletions, Throwable 
accumulate)
+{
+try
+{
+for (Map.Entry entry 
: txnLogs.bulkObsoletion(readers).entrySet())
+obsoletions.add(new LogTransaction.Obsoletion(entry.getKey(), 
entry.getValue()));
+}
+catch (Throwable t)
+{
+accumulate = Throwables.merge(accumulate, t);
+}
+
+return accumulate;
+}
+
 static Throwable abortObsoletion(List 
obsoletions, Throwable accumulate)
 {
 if (obsoletions == null || obsoletions.isEmpty())

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java 
b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
index da5bb39..be26163 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
@@ -37,6 +37,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.db.lifecycle.LogRecord.Type;
 import org.apache.cassandra.io.sstable.SSTable;
+import org.apache.cassandra.io.sstable.format.SSTableReader;
 import org.apache.cassandra.io.sstable.format.big.BigFormat;
 import org.apache.cassandra.utils.Throwables;
 
@@ -284,6 +285,25 @@ final class LogFile implements AutoCloseable
 throw new IllegalStateException();
 }
 
+public void addAll(Type type, Iterable toBulkAdd)
+{
+for (LogRecord record : makeRecords(type, toBulkAdd))
+if (!addRecord(record))
+throw new IllegalStateException();
+}
+
+private Collection makeRecords(Type type, 
Iterable tables)
+{
+assert type == Type.ADD || type == Type.REMOVE;
+
+for (SSTableReader sstable : tables)
+{
+File folder = sstable.descriptor.directory;
+replicas.maybeCreateReplica(folder, getFileName(folder), records);
+}
+return LogRecord.make(type, tables);
+}
+
 private LogRecord makeRecord(Type type, SSTable table)
 {
 assert type == Type.ADD || type == Type.REMOVE;
@@ -414,4 +434,9 @@ final class LogFile implements AutoCloseable
 {
 return records;
 }
+
+public boolean isEmpty()
+{
+return records.isEmpty();
+

[1/6] cassandra git commit: Improve TRUNCATE performance

Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 15cee48bb -> b32a9e645
  refs/heads/cassandra-3.11 9d56132ae -> 983c72a84
  refs/heads/trunk cecb2de05 -> 694b3c401


Improve TRUNCATE performance

Patch by marcuse; reviewed by Stefania Alborghetti for CASSANDRA-13909


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b32a9e64
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b32a9e64
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b32a9e64

Branch: refs/heads/cassandra-3.0
Commit: b32a9e6452c78e6ad08e371314bf1ab7492d0773
Parents: 15cee48
Author: Marcus Eriksson 
Authored: Mon Sep 25 14:44:37 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Oct 2 09:29:22 2017 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Helpers.java  | 15 +
 .../apache/cassandra/db/lifecycle/LogFile.java  | 25 
 .../cassandra/db/lifecycle/LogRecord.java   | 65 +++-
 .../cassandra/db/lifecycle/LogTransaction.java  | 16 +
 .../apache/cassandra/db/lifecycle/Tracker.java  |  2 +-
 .../cassandra/db/lifecycle/HelpersTest.java | 58 -
 7 files changed, 179 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 4a45469..d6423b4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Improve TRUNCATE performance (CASSANDRA-13909)
  * Implement short read protection on partition boundaries (CASSANDRA-13595)
  * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries 
(CASSANDRA-13911)
  * Filter header only commit logs before recovery (CASSANDRA-13918)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Helpers.java 
b/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
index f9555f4..b9adc4b 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Helpers.java
@@ -141,6 +141,21 @@ class Helpers
 return accumulate;
 }
 
+static Throwable prepareForBulkObsoletion(Iterable readers, 
LogTransaction txnLogs, List obsoletions, Throwable 
accumulate)
+{
+try
+{
+for (Map.Entry entry 
: txnLogs.bulkObsoletion(readers).entrySet())
+obsoletions.add(new LogTransaction.Obsoletion(entry.getKey(), 
entry.getValue()));
+}
+catch (Throwable t)
+{
+accumulate = Throwables.merge(accumulate, t);
+}
+
+return accumulate;
+}
+
 static Throwable abortObsoletion(List 
obsoletions, Throwable accumulate)
 {
 if (obsoletions == null || obsoletions.isEmpty())

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b32a9e64/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java 
b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
index da5bb39..be26163 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
@@ -37,6 +37,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.db.lifecycle.LogRecord.Type;
 import org.apache.cassandra.io.sstable.SSTable;
+import org.apache.cassandra.io.sstable.format.SSTableReader;
 import org.apache.cassandra.io.sstable.format.big.BigFormat;
 import org.apache.cassandra.utils.Throwables;
 
@@ -284,6 +285,25 @@ final class LogFile implements AutoCloseable
 throw new IllegalStateException();
 }
 
+public void addAll(Type type, Iterable toBulkAdd)
+{
+for (LogRecord record : makeRecords(type, toBulkAdd))
+if (!addRecord(record))
+throw new IllegalStateException();
+}
+
+private Collection makeRecords(Type type, 
Iterable tables)
+{
+assert type == Type.ADD || type == Type.REMOVE;
+
+for (SSTableReader sstable : tables)
+{
+File folder = sstable.descriptor.directory;
+replicas.maybeCreateReplica(folder, getFileName(folder), records);
+}
+return LogRecord.make(type, tables);
+}
+
 private LogRecord makeRecord(Type type, SSTable table)
 {
 assert type == Type.ADD || type ==

[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/983c72a8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/983c72a8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/983c72a8

Branch: refs/heads/trunk
Commit: 983c72a84ab6628e09a78ead9e20a0c323a005af
Parents: 9d56132 b32a9e6
Author: Marcus Eriksson 
Authored: Mon Oct 2 09:34:49 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Oct 2 09:34:49 2017 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Helpers.java  | 15 +
 .../apache/cassandra/db/lifecycle/LogFile.java  | 26 
 .../cassandra/db/lifecycle/LogRecord.java   | 65 +++-
 .../cassandra/db/lifecycle/LogTransaction.java  | 16 +
 .../apache/cassandra/db/lifecycle/Tracker.java  |  2 +-
 .../cassandra/db/lifecycle/HelpersTest.java | 58 -
 7 files changed, 180 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/983c72a8/CHANGES.txt
--
diff --cc CHANGES.txt
index 264887b,d6423b4..aca219e
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,5 +1,18 @@@
 -3.0.15
 +3.11.1
 + * Fix the computation of cdc_total_space_in_mb for exabyte filesystems 
(CASSANDRA-13808)
 + * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is 
a single leaf and overflow collisions (CASSANDRA-13869)
 + * Add a compaction option to TWCS to ignore sstables overlapping checks 
(CASSANDRA-13418)
 + * BTree.Builder memory leak (CASSANDRA-13754)
 + * Revert CASSANDRA-10368 of supporting non-pk column filtering due to 
correctness (CASSANDRA-13798)
 + * Add a skip read validation flag to cassandra-stress (CASSANDRA-13772)
 + * Fix cassandra-stress hang issues when an error during cluster connection 
happens (CASSANDRA-12938)
 + * Better bootstrap failure message when blocked by (potential) range 
movement (CASSANDRA-13744)
 + * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
 + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
 + * Duplicate the buffer before passing it to analyser in SASI operation 
(CASSANDRA-13512)
 + * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 +Merged from 3.0:
+  * Improve TRUNCATE performance (CASSANDRA-13909)
   * Implement short read protection on partition boundaries (CASSANDRA-13595)
   * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries 
(CASSANDRA-13911)
   * Filter header only commit logs before recovery (CASSANDRA-13918)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/983c72a8/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
--
diff --cc src/java/org/apache/cassandra/db/lifecycle/LogFile.java
index 9691ee9,be26163..123dd8a
--- a/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/LogFile.java
@@@ -285,6 -285,25 +286,26 @@@ final class LogFile implements AutoClos
  throw new IllegalStateException();
  }
  
+ public void addAll(Type type, Iterable toBulkAdd)
+ {
+ for (LogRecord record : makeRecords(type, toBulkAdd))
+ if (!addRecord(record))
+ throw new IllegalStateException();
+ }
+ 
+ private Collection makeRecords(Type type, 
Iterable tables)
+ {
+ assert type == Type.ADD || type == Type.REMOVE;
+ 
+ for (SSTableReader sstable : tables)
+ {
 -File folder = sstable.descriptor.directory;
 -replicas.maybeCreateReplica(folder, getFileName(folder), records);
++File directory = sstable.descriptor.directory;
++String fileName = StringUtils.join(directory, File.separator, 
getFileName());
++replicas.maybeCreateReplica(directory, fileName, records);
+ }
+ return LogRecord.make(type, tables);
+ }
+ 
  private LogRecord makeRecord(Type type, SSTable table)
  {
  assert type == Type.ADD || type == Type.REMOVE;
@@@ -417,15 -417,26 +438,20 @@@
  return replicas.getFilePaths();
  }
  
 -private String getFileName(File folder)
 -{
 -String fileName = StringUtils.join(BigFormat.latestVersion,
 -   LogFile.SEP,
 -   "txn",
 -   LogFile.SEP,
 -   type.fileName,
 -   LogFile.SEP,
 -   id.toString(),
 -

[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/694b3c40
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/694b3c40
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/694b3c40

Branch: refs/heads/trunk
Commit: 694b3c40137a3c9d9ec5d844ff40db6046882447
Parents: cecb2de 983c72a
Author: Marcus Eriksson 
Authored: Mon Oct 2 09:40:15 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Oct 2 09:40:15 2017 +0200

--
 CHANGES.txt |  1 +
 .../apache/cassandra/db/lifecycle/Helpers.java  | 15 +
 .../apache/cassandra/db/lifecycle/LogFile.java  | 26 
 .../cassandra/db/lifecycle/LogRecord.java   | 66 +++-
 .../cassandra/db/lifecycle/LogTransaction.java  | 16 +
 .../apache/cassandra/db/lifecycle/Tracker.java  |  2 +-
 .../cassandra/db/lifecycle/HelpersTest.java | 58 -
 7 files changed, 181 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/694b3c40/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/694b3c40/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java
--
diff --cc src/java/org/apache/cassandra/db/lifecycle/LogRecord.java
index 9c1ba31,dd3fcde..0a9d73c
--- a/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/LogRecord.java
@@@ -148,7 -152,11 +152,7 @@@ final class LogRecor
  
  public static LogRecord make(Type type, SSTable table)
  {
- String absoluteTablePath = 
FileUtils.getCanonicalPath(table.descriptor.baseFilename());
 -// CASSANDRA-13294: add the sstable component separator because for 
legacy (2.1) files
 -// there is no separator after the generation number, and this would 
cause files of sstables with
 -// a higher generation number that starts with the same number, to be 
incorrectly classified as files
 -// of this record sstable
+ String absoluteTablePath = 
absolutePath(table.descriptor.baseFilename());
  return make(type, getExistingFiles(absoluteTablePath), 
table.getAllFilePaths().size(), absoluteTablePath);
  }
  
@@@ -267,6 -300,41 +296,41 @@@
  return files == null ? Collections.emptyList() : Arrays.asList(files);
  }
  
+ /**
+  * absoluteFilePaths contains full file parts up to the component name
+  *
+  * this method finds all files on disk beginning with any of the paths in 
absoluteFilePaths
+  * @return a map from absoluteFilePath to actual file on disk.
+  */
+ public static Map getExistingFiles(Set 
absoluteFilePaths)
+ {
+ Set uniqueDirectories = absoluteFilePaths.stream().map(path -> 
Paths.get(path).getParent().toFile()).collect(Collectors.toSet());
+ Map fileMap = new HashMap<>();
+ FilenameFilter ff = (dir, name) -> {
+ Descriptor descriptor = null;
+ try
+ {
 -descriptor = Descriptor.fromFilename(dir, name).left;
++descriptor = Descriptor.fromFilename(new File(dir, name));
+ }
+ catch (Throwable t)
+ {// ignored - if we can't parse the filename, just skip the file
+ }
+ 
+ String absolutePath = descriptor != null ? 
absolutePath(descriptor.baseFilename()) : null;
+ if (absolutePath != null && 
absoluteFilePaths.contains(absolutePath))
+ fileMap.computeIfAbsent(absolutePath, k -> new 
ArrayList<>()).add(new File(dir, name));
+ 
+ return false;
+ };
+ 
+ // populate the file map:
+ for (File f : uniqueDirectories)
+ f.listFiles(ff);
+ 
+ return fileMap;
+ }
+ 
+ 
  public boolean isFinal()
  {
  return type.isFinal();

http://git-wip-us.apache.org/repos/asf/cassandra/blob/694b3c40/src/java/org/apache/cassandra/db/lifecycle/LogTransaction.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/694b3c40/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/694b3c40/test/unit/org/apache/cassandra/db/lifecycle/HelpersTest.java
--


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional

[3/6] cassandra git commit: Improve TRUNCATE performance