date:20160613


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328832#comment-15328832
 ] 

Chris Lohfink commented on CASSANDRA-11327:
---

Minor Nitpick/bike shedding: all other metric names are camel case, having one 
with spaces may mess up some tools (ie command line jmx readers).

> Maintain a histogram of times when writes are blocked due to no available 
> memory
> 
>
> Key: CASSANDRA-11327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11327
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>
> I have a theory that part of the reason C* is so sensitive to timeouts during 
> saturating write load is that throughput is basically a sawtooth with valleys 
> at zero. This is something I have observed and it gets worse as you add 2i to 
> a table or do anything that decreases the throughput of flushing.
> I think the fix for this is to incrementally release memory pinned by 
> memtables and 2i during flushing instead of releasing it all at once. I know 
> that's not really possible, but we can fake it with memory accounting that 
> tracks how close to completion flushing is and releases permits for 
> additional memory. This will lead to a bit of a sawtooth in real memory 
> usage, but we can account for that so the peak footprint is the same.
> I think the end result of this change will be a sawtooth, but the valley of 
> the sawtooth will not be zero it will be the rate at which flushing 
> progresses. Optimizing the rate at which flushing progresses and it's 
> fairness with other work can then be tackled separately.
> Before we do this I think we should demonstrate that pinned memory due to 
> flushing is actually the issue by getting better visibility into the 
> distribution of instances of not having any memory by maintaining a histogram 
> of spans of time where no memory is available and a thread is blocked.
> [MemtableAllocatr$SubPool.allocate(long)|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/utils/memory/MemtableAllocator.java#L186]
>  should be a relatively straightforward entry point for this. The first 
> thread to block can mark the start of memory starvation and the last thread 
> out can mark the end. Have a periodic task that tracks the amount of time 
> spent blocked per interval of time and if it is greater than some threshold 
> log with more details, possibly at debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-12002:
--
Attachment: CASSADNRA-12002.txt

> SSTable tools mishandling LocalPartitioner
> --
>
> Key: CASSANDRA-12002
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12002
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSADNRA-12002.txt
>
>
> The sstabledump and sstablemetadata tools use the FBUtilities.newPartitioner 
> from the name of the partitioner in the validation component. This fails on 
> sstables that are created with things that use the LocalPartitioner 
> (secondary indexes, and the system.batches table). The sstabledump had a 
> check for secondary indexes, but still failed for the system table it was 
> failing for all in the metadata tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-12002:
--
Status: Patch Available  (was: Open)

> SSTable tools mishandling LocalPartitioner
> --
>
> Key: CASSANDRA-12002
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12002
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSADNRA-12002.txt
>
>
> The sstabledump and sstablemetadata tools use the FBUtilities.newPartitioner 
> from the name of the partitioner in the validation component. This fails on 
> sstables that are created with things that use the LocalPartitioner 
> (secondary indexes, and the system.batches table). The sstabledump had a 
> check for secondary indexes, but still failed for the system table it was 
> failing for all in the metadata tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner

Chris Lohfink created CASSANDRA-12002:
-

 Summary: SSTable tools mishandling LocalPartitioner
 Key: CASSANDRA-12002
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12002
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Chris Lohfink
Assignee: Chris Lohfink
Priority: Minor


The sstabledump and sstablemetadata tools use the FBUtilities.newPartitioner 
from the name of the partitioner in the validation component. This fails on 
sstables that are created with things that use the LocalPartitioner (secondary 
indexes, and the system.batches table). The sstabledump had a check for 
secondary indexes, but still failed for the system table it was failing for all 
in the metadata tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format

2016-06-13 Thread Michael Kjellman (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328785#comment-15328785
 ] 

Michael Kjellman edited comment on CASSANDRA-11206 at 6/14/16 2:14 AM:
---

going thru the changes and have some questions :)

# RowIndexEntry$serializedSize used to return the size of the index for the 
entire row. As the size of the IndexInfo elements are variable length I'm 
having trouble understanding how the new/current implementation does this:
{code}
private static int serializedSize(DeletionTime deletionTime, long headerLength, 
int columnIndexCount)
{
return TypeSizes.sizeofUnsignedVInt(headerLength)
   + (int) DeletionTime.serializer.serializedSize(deletionTime)
   + TypeSizes.sizeofUnsignedVInt(columnIndexCount);
}
{code}
# In the class level Javadoc for IndexInfo there is a lot of comment about 
serialization format changes and even a comment "Serialization format changed 
in 3.0" yet I don't see any corresponding changes in BigFormat$BigVersion
# I see a class named **Pre_C_11206_RowIndexEntry** in RowIndexEntryTest which 
has a lot of the logic that used to be in RowIndexEntry. I don't see the logic 
outside of the test classes though.


was (Author: mkjellman):
going thru the changes and have some questions :)

# RowIndexEntry$serializedSize used to return the size of the index for the 
entire row. As the size of the IndexInfo elements are variable length I'm 
having trouble understanding how the new/current implementation does this:
{code}
private static int serializedSize(DeletionTime deletionTime, long headerLength, 
int columnIndexCount)
{
return TypeSizes.sizeofUnsignedVInt(headerLength)
   + (int) DeletionTime.serializer.serializedSize(deletionTime)
   + TypeSizes.sizeofUnsignedVInt(columnIndexCount);
}
{code}
# In the class level Javadoc for IndexInfo there is a lot of comment about 
serialization format changes and even a comment "Serialization format changed 
in 3.0" yet I don't see any corresponding changes in BigFormat$BigVersion

> Support large partitions on the 3.0 sstable format
> --
>
> Key: CASSANDRA-11206
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11206
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>  Labels: docs-impacting
> Fix For: 3.6
>
> Attachments: 11206-gc.png, trunk-gc.png
>
>
> Cassandra saves a sample of IndexInfo objects that store the offset within 
> each partition of every 64KB (by default) range of rows.  To find a row, we 
> binary search this sample, then scan the partition of the appropriate range.
> The problem is that this scales poorly as partitions grow: on a cache miss, 
> we deserialize the entire set of IndexInfo, which both creates a lot of GC 
> overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity 
> (relative to reading a single 64KB row range) as partitions get truly large.
> We introduced an "offset map" in CASSANDRA-10314 that allows us to perform 
> the IndexInfo bsearch while only deserializing IndexInfo that we need to 
> compare against, i.e. log(N) deserializations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11327) Maintain a histogram of times when writes are blocked due to no available memory


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328789#comment-15328789
 ] 

Ariel Weisberg commented on CASSANDRA-11327:


||Code|utests|dtests||
|[3.0 
code|https://github.com/apache/cassandra/compare/cassandra-3.0...aweisberg:CASSANDRA-11327-3.0?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-3.0-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-3.0-dtest/]|
|[trunk 
code|https://github.com/apache/cassandra/compare/trunk...aweisberg:CASSANDRA-11327-trunk?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-CASSANDRA-11327-trunk-dtest/]|

> Maintain a histogram of times when writes are blocked due to no available 
> memory
> 
>
> Key: CASSANDRA-11327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11327
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>
> I have a theory that part of the reason C* is so sensitive to timeouts during 
> saturating write load is that throughput is basically a sawtooth with valleys 
> at zero. This is something I have observed and it gets worse as you add 2i to 
> a table or do anything that decreases the throughput of flushing.
> I think the fix for this is to incrementally release memory pinned by 
> memtables and 2i during flushing instead of releasing it all at once. I know 
> that's not really possible, but we can fake it with memory accounting that 
> tracks how close to completion flushing is and releases permits for 
> additional memory. This will lead to a bit of a sawtooth in real memory 
> usage, but we can account for that so the peak footprint is the same.
> I think the end result of this change will be a sawtooth, but the valley of 
> the sawtooth will not be zero it will be the rate at which flushing 
> progresses. Optimizing the rate at which flushing progresses and it's 
> fairness with other work can then be tackled separately.
> Before we do this I think we should demonstrate that pinned memory due to 
> flushing is actually the issue by getting better visibility into the 
> distribution of instances of not having any memory by maintaining a histogram 
> of spans of time where no memory is available and a thread is blocked.
> [MemtableAllocatr$SubPool.allocate(long)|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/utils/memory/MemtableAllocator.java#L186]
>  should be a relatively straightforward entry point for this. The first 
> thread to block can mark the start of memory starvation and the last thread 
> out can mark the end. Have a periodic task that tracks the amount of time 
> spent blocked per interval of time and if it is greater than some threshold 
> log with more details, possibly at debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format

2016-06-13 Thread Michael Kjellman (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328785#comment-15328785
 ] 

Michael Kjellman commented on CASSANDRA-11206:
--

going thru the changes and have some questions :)

# RowIndexEntry$serializedSize used to return the size of the index for the 
entire row. As the size of the IndexInfo elements are variable length I'm 
having trouble understanding how the new/current implementation does this:
{code}
private static int serializedSize(DeletionTime deletionTime, long headerLength, 
int columnIndexCount)
{
return TypeSizes.sizeofUnsignedVInt(headerLength)
   + (int) DeletionTime.serializer.serializedSize(deletionTime)
   + TypeSizes.sizeofUnsignedVInt(columnIndexCount);
}
{code}
# In the class level Javadoc for IndexInfo there is a lot of comment about 
serialization format changes and even a comment "Serialization format changed 
in 3.0" yet I don't see any corresponding changes in BigFormat$BigVersion

> Support large partitions on the 3.0 sstable format
> --
>
> Key: CASSANDRA-11206
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11206
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>  Labels: docs-impacting
> Fix For: 3.6
>
> Attachments: 11206-gc.png, trunk-gc.png
>
>
> Cassandra saves a sample of IndexInfo objects that store the offset within 
> each partition of every 64KB (by default) range of rows.  To find a row, we 
> binary search this sample, then scan the partition of the appropriate range.
> The problem is that this scales poorly as partitions grow: on a cache miss, 
> we deserialize the entire set of IndexInfo, which both creates a lot of GC 
> overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity 
> (relative to reading a single 64KB row range) as partitions get truly large.
> We introduced an "offset map" in CASSANDRA-10314 that allows us to perform 
> the IndexInfo bsearch while only deserializing IndexInfo that we need to 
> compare against, i.e. log(N) deserializations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11933) Cache local ranges when calculating repair neighbors


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11933:

Summary: Cache local ranges when calculating repair neighbors  (was: 
Improve Repair performance)

> Cache local ranges when calculating repair neighbors
> 
>
> Key: CASSANDRA-11933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Cyril Scetbon
>Assignee: Mahdi Mohammadi
>
> During  a full repair on a ~ 60 nodes cluster, I've been able to see that 
> this stage can be significant (up to 60 percent of the whole time) :
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's merely caused by the fact that 
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
>  calls {code}ss.getLocalRanges(keyspaceName){code} everytime and that it 
> takes more than 99% of the time. This call takes 600ms when there is no load 
> on the cluster and more if there is. So for 10k ranges, you can imagine that 
> it takes at least 1.5 hours just to compute ranges. 
> Underneath it calls 
> [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
>  which can get pretty inefficient ([~jbellis]'s 
> [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165])
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend 
> hours on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11933) Improve Repair performance


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328698#comment-15328698
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

Thanks for the update [~mahdix]. The patch looks good, I fixed one minor nit on 
2.1 test, added CHANGES.txt entries, updated commit message (and author 
information that was screwed up on 2.2 and 3.0) and resubmitted tests (still 
running).

||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:11933-2.1]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:11933-2.2]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:11933-3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:11933-trunk]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.2-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-3.0-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-trunk-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.2-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-3.0-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-trunk-dtest/lastCompletedBuild/testReport/]|

> Improve Repair performance
> --
>
> Key: CASSANDRA-11933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Cyril Scetbon
>Assignee: Mahdi Mohammadi
>
> During  a full repair on a ~ 60 nodes cluster, I've been able to see that 
> this stage can be significant (up to 60 percent of the whole time) :
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's merely caused by the fact that 
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
>  calls {code}ss.getLocalRanges(keyspaceName){code} everytime and that it 
> takes more than 99% of the time. This call takes 600ms when there is no load 
> on the cluster and more if there is. So for 10k ranges, you can imagine that 
> it takes at least 1.5 hours just to compute ranges. 
> Underneath it calls 
> [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
>  which can get pretty inefficient ([~jbellis]'s 
> [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165])
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend 
> hours on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328674#comment-15328674
 ] 

Ariel Weisberg commented on CASSANDRA-7937:
---

Looking at this a little harder it looks like 3.0 might be better off than 
later versions because of changes that were part of CASSANDRA-6696 which appear 
to be what reduced the number of concurrent flushes that could occur.

[See this 
change|https://github.com/apache/cassandra/commit/e2c6341898fa43b0e262ef031f267587050b8d0f#diff-98f5acb96aa6d684781936c141132e2aR121]
 which was actually a surprise to me because I thought it worked the old way. 
[~krummas]

> Apply backpressure gently when overloaded with writes
> -
>
> Key: CASSANDRA-7937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Cassandra 2.0
>Reporter: Piotr Kołaczkowski
>  Labels: performance
>
> When writing huge amounts of data into C* cluster from analytic tools like 
> Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
> This is because analytic tools typically write data "as fast as they can" in 
> parallel, from many nodes and they are not artificially rate-limited, so C* 
> is the bottleneck here. Also, increasing the number of nodes doesn't really 
> help, because in a collocated setup this also increases number of 
> Hadoop/Spark nodes (writers) and although possible write performance is 
> higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. the available memory limit for memtables is reached and writes are no 
> longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in 
> vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate 
> limiting to writes - the more the buffers are filled-up, the less writes/s 
> are accepted, however writes still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the 
> memory gets used
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* 
> core, but then C* needs to expose at least the following information to the 
> driver, so we could calculate the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely 
> block
> 2. total amount of data queued to be flushed and flush progress (amount of 
> data to flush remaining for the memtable currently being flushed)
> 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CASSANDRA-10862) LCS repair: compact tables before making available in L0

2016-06-13 Thread Chen Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321703#comment-15321703
 ] 

Chen Shen edited comment on CASSANDRA-10862 at 6/13/16 11:38 PM:
-

[~pauloricardomg]
I've done some investigation and I find it might not so easy to schedule a 
compaction on L0 table on reception as the only straightforward way to trigger 
a compaction is by submitting a task to CompactionManager.submitBackground, and 
1) it's not guaranteed to be executed according to my knowledge 2) 
submitBackground need a `ColumnFamilyStore` as input, so we need either create 
a new CFS, or split the compaction strategy out of CompactionManager, each of 
which might need lots of work.

So instead I am doing a different tricky approach: Don't add tables to CFS 
until the number of L0 sstables is smaller than a threshold. -And subscribe to 
`SSTableListChangedNotification` so that the `OnCompletionRunnable` could sleep 
and wait on notification.- And sleep for a while to retry.

Is this a right direction? I have a commit here 
https://github.com/scv119/cassandra/commit/87d63254f1556d1a41649a4c3b9d6ea0bce3
 if you want to take a look. I'm also planing to apply this patch to our 
production tier to see if this helps.

Updates:
This patch has been working well on your cluster, there are still some extra 
sstables created during anti-compaction stage, but I think we could disable 
incremental repair for now.


was (Author: scv...@gmail.com):
[~pauloricardomg]
I've done some investigation and I find it might not so easy to schedule a 
compaction on L0 table on reception as the only straightforward way to trigger 
a compaction is by submitting a task to CompactionManager.submitBackground, and 
1) it's not guaranteed to be executed according to my knowledge 2) 
submitBackground need a `ColumnFamilyStore` as input, so we need either create 
a new CFS, or split the compaction strategy out of CompactionManager, each of 
which might need lots of work.

So instead I am doing a different tricky approach: Don't add tables to CFS 
until the number of L0 sstables is smaller than a threshold. -And subscribe to 
`SSTableListChangedNotification` so that the `OnCompletionRunnable` could sleep 
and wait on notification.- And sleep for a while to retry.

Is this a right direction? I have a commit here 
https://github.com/scv119/cassandra/commit/87d63254f1556d1a41649a4c3b9d6ea0bce3if
 you want to take a look. I'm also planing to apply this patch to our 
production tier to see if this helps.

Updates:
This patch has been working well on your cluster, there are still some extra 
sstables created during anti-compaction stage, but I think we could disable 
incremental repair for now.

> LCS repair: compact tables before making available in L0
> 
>
> Key: CASSANDRA-10862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Streaming and Messaging
>Reporter: Jeff Ferland
>Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of 
> tables in L0 goes up dramatically, as correspondingly goes the number of 
> tables referenced for a query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into 
> L1 (which may be a very large copy), finally reducing the number of SSTables 
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then 
> mark tables available rather than marking available when the file itself is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CASSANDRA-10862) LCS repair: compact tables before making available in L0

2016-06-13 Thread Chen Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321703#comment-15321703
 ] 

Chen Shen edited comment on CASSANDRA-10862 at 6/13/16 11:36 PM:
-

[~pauloricardomg]
I've done some investigation and I find it might not so easy to schedule a 
compaction on L0 table on reception as the only straightforward way to trigger 
a compaction is by submitting a task to CompactionManager.submitBackground, and 
1) it's not guaranteed to be executed according to my knowledge 2) 
submitBackground need a `ColumnFamilyStore` as input, so we need either create 
a new CFS, or split the compaction strategy out of CompactionManager, each of 
which might need lots of work.

So instead I am doing a different tricky approach: Don't add tables to CFS 
until the number of L0 sstables is smaller than a threshold. -And subscribe to 
`SSTableListChangedNotification` so that the `OnCompletionRunnable` could sleep 
and wait on notification.- And sleep for a while to retry.

Is this a right direction? I have a commit here 
https://github.com/scv119/cassandra/commit/87d63254f1556d1a41649a4c3b9d6ea0bce3if
 you want to take a look. I'm also planing to apply this patch to our 
production tier to see if this helps.

Updates:
This patch has been working well on your cluster, there are still some extra 
sstables created during anti-compaction stage, but I think we could disable 
incremental repair for now.


was (Author: scv...@gmail.com):
[~pauloricardomg]
I've done some investigation and I find it might not so easy to schedule a 
compaction on L0 table on reception as the only straightforward way to trigger 
a compaction is by submitting a task to CompactionManager.submitBackground, and 
1) it's not guaranteed to be executed according to my knowledge 2) 
submitBackground need a `ColumnFamilyStore` as input, so we need either create 
a new CFS, or split the compaction strategy out of CompactionManager, each of 
which might need lots of work.

So instead I am doing a different tricky approach: Don't add tables to CFS 
until the number of L0 sstables is smaller than a threshold. -And subscribe to 
`SSTableListChangedNotification` so that the `OnCompletionRunnable` could sleep 
and wait on notification.- And sleep for a while to retry.

Is this a right direction? -I have a commit here 
https://github.com/scv119/cassandra/commit/149d127c76f8f4e267524ed7f642d2ffdf6188e5
 if you want to take a look.- I'm also planing to apply this patch to our 
production tier to see if this helps.
 

> LCS repair: compact tables before making available in L0
> 
>
> Key: CASSANDRA-10862
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Streaming and Messaging
>Reporter: Jeff Ferland
>Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of 
> tables in L0 goes up dramatically, as correspondingly goes the number of 
> tables referenced for a query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into 
> L1 (which may be a very large copy), finally reducing the number of SSTables 
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then 
> mark tables available rather than marking available when the file itself is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-9507) range metrics are not updated for timeout and unavailable in StorageProxy


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-9507:
-
Status: Patch Available  (was: Open)

> range metrics are not updated for timeout and unavailable in StorageProxy
> -
>
> Key: CASSANDRA-9507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
>Reporter: sankalp kohli
>Priority: Minor
> Attachments: Cassandra-9507.diff
>
>
> Looking at the code, it looks like range metrics are not updated for timeouts 
> and unavailable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-9507) range metrics are not updated for timeout and unavailable in StorageProxy

2016-06-13 Thread Nachiket Patil (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nachiket Patil updated CASSANDRA-9507:
--
Attachment: Cassandra-9507.diff

> range metrics are not updated for timeout and unavailable in StorageProxy
> -
>
> Key: CASSANDRA-9507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability
>Reporter: sankalp kohli
>Priority: Minor
> Attachments: Cassandra-9507.diff
>
>
> Looking at the code, it looks like range metrics are not updated for timeouts 
> and unavailable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8928) Add downgradesstables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328411#comment-15328411
 ] 

Paulo Motta commented on CASSANDRA-8928:


bq. The primary use case for downgrading sstables is to abort an upgrade. So 
for example, if you're on 2.1 and want to go to 2.2 or 3.x, the upgrade plan 
often includes a contingency plan to abort and revert back to the older 
version, including preserving data written while in the upgraded version.

The same end result can be achieved, although with maybe a bit more of work, by 
backing up sstables of the upgraded node and wiping/recreating the node in a 
previous version, and then use sstableloader of the newer version to reload the 
data in the recreated node, which would be easily enabled by CASSANDRA-8110. 
Although standalone downgrading could be nice, I'm not sure it's really worth 
the effort and risk of carrying sstable writing code across multiple versions.

bq. Would it be possible to limit the implementation of this to the core 
sstable data file? We could then have an offline downgrade sstable process that 
would convert to the older sstable data file and those files could be bulk 
loaded through that version's sstable loader into the reverted cluster.

This is more or less the approach I'm proposing, but instead of doing an 
offline downgrade only of the data file  + sstableloading afterwards, you would 
only need to use sstableloader of the newer version to load newer version 
sstables in the reverted cluster automatically. Wouldn't this be sufficient?

> Add downgradesstables
> -
>
> Key: CASSANDRA-8928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8928
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jeremy Hanna
>Assignee: Kaide Mu
>Priority: Minor
>  Labels: gsoc2016, mentor
>
> As mentioned in other places such as CASSANDRA-8047 and in the wild, 
> sometimes you need to go back.  A downgrade sstables utility would be nice 
> for a lot of reasons and I don't know that supporting going back to the 
> previous major version format would be too much code since we already support 
> reading the previous version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (CASSANDRA-11327) Maintain a histogram of times when writes are blocked due to no available memory


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg reassigned CASSANDRA-11327:
--

Assignee: Ariel Weisberg

> Maintain a histogram of times when writes are blocked due to no available 
> memory
> 
>
> Key: CASSANDRA-11327
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11327
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
>
> I have a theory that part of the reason C* is so sensitive to timeouts during 
> saturating write load is that throughput is basically a sawtooth with valleys 
> at zero. This is something I have observed and it gets worse as you add 2i to 
> a table or do anything that decreases the throughput of flushing.
> I think the fix for this is to incrementally release memory pinned by 
> memtables and 2i during flushing instead of releasing it all at once. I know 
> that's not really possible, but we can fake it with memory accounting that 
> tracks how close to completion flushing is and releases permits for 
> additional memory. This will lead to a bit of a sawtooth in real memory 
> usage, but we can account for that so the peak footprint is the same.
> I think the end result of this change will be a sawtooth, but the valley of 
> the sawtooth will not be zero it will be the rate at which flushing 
> progresses. Optimizing the rate at which flushing progresses and it's 
> fairness with other work can then be tackled separately.
> Before we do this I think we should demonstrate that pinned memory due to 
> flushing is actually the issue by getting better visibility into the 
> distribution of instances of not having any memory by maintaining a histogram 
> of spans of time where no memory is available and a thread is blocked.
> [MemtableAllocatr$SubPool.allocate(long)|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/utils/memory/MemtableAllocator.java#L186]
>  should be a relatively straightforward entry point for this. The first 
> thread to block can mark the start of memory starvation and the last thread 
> out can mark the end. Have a periodic task that tracks the amount of time 
> spent blocked per interval of time and if it is greater than some threshold 
> log with more details, possibly at debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

[
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328373#comment-15328373
]

Ariel Weisberg edited comment on CASSANDRA-7937 at 6/13/16 9:51 PM:

I think we can make this situation better, and I mentioned some ideas at NGCC
and in CASSANDRA-11327.

There are two issues.

The first is that if flushing falls behind throughput falls to zero instead of
progressing at the rate at which flushing progresses which is usually not zero.
Right now it looks like it is zero because flushing doesn't release any memory
as it progresses and is all or nothing.

Aleksey mentioned we could do something like early opening for flushing so that
memory is made available sooner. Alternatively we could overcommit and then
gradually release memory as flushing progresses.

The second, and this isn't really related to backpressure, is that flushing
falls behind in several reasonable configurations. Ingest has gotten faster and
I don't think flushing has as much so it's easier for it to fall behind if it's
driven by a single thread against a busy device (even a fast SSD). I haven't
tested this yet, but I suspect that if you use multiple JBOD paths for a fast
device like an SSD and increase memtable_flush_writers you will get enough
additional flushing throughput to keep up with ingest. Right now flushing is
single threaded for a single path and only one flush can occur at any time.

Flushing falling behind is more noticeable when you let compaction have more
threads and a bigger rate limit because it can dirty enough memory in the
filesystem cache that when it flushes it causes a temporally localized slowdown
in flushing that is enough to cause timeouts when there is no more free memory
because flushing didn't finish soon enough.

I think the long term solution is that the further flushing falls behind the
more concurrent flush threads we start deploying kind of like compaction up to
the configured limit. [Right now there is a single thread scheduling them and
waiting on the
result.|https://github.com/apache/cassandra/blob/cassandra-3.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1130]
memtable_flush_writers doesn't help due to this code here that only [generates
more flush runnables for a memtable if there are multiple
directories|https://github.com/apache/cassandra/blob/cassandra-3.7/src/java/org/apache/cassandra/db/Memtable.java#L278].
C* is already divvying up the heap using memtable_cleanup_threshold which
would allow for concurrent flushing it's just not actually flushing
concurrently.

was (Author: aweisberg):
I think we can make this situation better, and I mentioned some ideas at NGCC
and in CASSANDRA-11327.

There are two issues.

The second is that flushing falls behind in several reasonable configurations.
Ingest has gotten faster and I don't think flushing has as much so it's easier
for it to fall behind if it's driven by a single thread against a busy device
(even a fast SSD). I haven't tested this yet, but I suspect that if you use
multiple JBOD paths for a fast device like an SSD and increase
memtable_flush_writers you will get enough additional flushing throughput to
keep up with ingest. Right now flushing is single threaded for a single path
and only one flush can occur at any time.

[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328373#comment-15328373
 ] 

Ariel Weisberg commented on CASSANDRA-7937:
---

I think we can make this situation better, and I mentioned some ideas at NGCC 
and in CASSANDRA-11327.

There are two issues.

The first is that if flushing falls behind throughput falls to zero instead of 
progressing at the rate at which flushing progresses which is usually not zero. 
Right now it looks like it is zero because flushing doesn't release any memory 
as it progresses and is all or nothing.

Aleksey mentioned we could do something like early opening for flushing so that 
memory is made available sooner. Alternatively we could overcommit and then 
gradually release memory as flushing progresses.

The second is that flushing falls behind in several reasonable configurations. 
Ingest has gotten faster and I don't think flushing has as much so it's easier 
for it to fall behind if it's driven by a single thread against a busy device 
(even a fast SSD). I haven't tested this yet, but I suspect that if you use 
multiple JBOD paths for a fast device like an SSD and increase 
memtable_flush_writers you will get enough additional flushing throughput to 
keep up with ingest. Right now flushing is single threaded for a single path 
and only one flush can occur at any time.

Flushing falling behind is more noticeable when you let compaction have more 
threads and a bigger rate limit because it can dirty enough memory in the 
filesystem cache that when it flushes it causes a temporally localized slowdown 
in flushing that is enough to cause timeouts when there is no more free memory 
because flushing didn't finish soon enough.

I think the long term solution is that the further flushing falls behind the 
more concurrent flush threads we start deploying kind of like compaction up to 
the configured limit. [Right now there is a single thread scheduling them and 
waiting on the 
result.|https://github.com/apache/cassandra/blob/cassandra-3.7/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1130]
 memtable_flush_writers doesn't help due to this code here that only [generates 
more flush runnables for a memtable if there are multiple 
directories|https://github.com/apache/cassandra/blob/cassandra-3.7/src/java/org/apache/cassandra/db/Memtable.java#L278].
 C* is already divvying up the heap using memtable_cleanup_threshold which 
would allow for concurrent flushing it's just not actually flushing 
concurrently.

> Apply backpressure gently when overloaded with writes
> -
>
> Key: CASSANDRA-7937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Cassandra 2.0
>Reporter: Piotr Kołaczkowski
>  Labels: performance
>
> When writing huge amounts of data into C* cluster from analytic tools like 
> Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
> This is because analytic tools typically write data "as fast as they can" in 
> parallel, from many nodes and they are not artificially rate-limited, so C* 
> is the bottleneck here. Also, increasing the number of nodes doesn't really 
> help, because in a collocated setup this also increases number of 
> Hadoop/Spark nodes (writers) and although possible write performance is 
> higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. the available memory limit for memtables is reached and writes are no 
> longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in 
> vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate 
> limiting to writes - the more the buffers are filled-up, the less writes/s 
> are accepted, however writes still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the 
> memory gets used
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* 
> core, but then C* needs to expose at least the following information to the 
> driver, so we could calculate the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely 
> block
> 2. total amount of data queued to be flushed and flush progress (amount of 
> data to flush remaining for the memtable currently being flushed)
> 3. average flush write speed



--
This message was

[jira] [Comment Edited] (CASSANDRA-8928) Add downgradesstables

2016-06-13 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328361#comment-15328361
 ] 

Jeremy Hanna edited comment on CASSANDRA-8928 at 6/13/16 9:50 PM:
--

The primary use case for downgrading sstables is to abort an upgrade.  So for 
example, if you're on 2.1 and want to go to 2.2 or 3.x, the upgrade plan often 
includes a contingency plan to abort and revert back to the older version, 
including preserving data written while in the upgraded version.  While 
backwards compatible streaming (CASSANDRA-8110) is helpful in other scenarios, 
it wouldn't help for that use case.

Would it be possible to limit the implementation of this to the core sstable 
data file?  We could then have an offline downgrade sstable process that would 
convert to the older sstable data file and those files could be bulk loaded 
through that version's sstable loader into the reverted cluster.


was (Author: jeromatron):
The primary use case for downgrading sstables is to abort an upgrade.  So for 
example, if you're on 2.1 and want to go to 2.2 or 3.x, the upgrade plan often 
includes a contingency plan to abort and revert back to the older version, 
including preserving data written while in the upgraded version.  While 
backwards compatible streaming (CASSANDRA-8110) is helpful in other scenarios, 
it wouldn't help for that use case.

> Add downgradesstables
> -
>
> Key: CASSANDRA-8928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8928
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jeremy Hanna
>Assignee: Kaide Mu
>Priority: Minor
>  Labels: gsoc2016, mentor
>
> As mentioned in other places such as CASSANDRA-8047 and in the wild, 
> sometimes you need to go back.  A downgrade sstables utility would be nice 
> for a lot of reasons and I don't know that supporting going back to the 
> previous major version format would be too much code since we already support 
> reading the previous version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8928) Add downgradesstables

2016-06-13 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328361#comment-15328361
 ] 

Jeremy Hanna commented on CASSANDRA-8928:
-

The primary use case for downgrading sstables is to abort an upgrade.  So for 
example, if you're on 2.1 and want to go to 2.2 or 3.x, the upgrade plan often 
includes a contingency plan to abort and revert back to the older version, 
including preserving data written while in the upgraded version.  While 
backwards compatible streaming (CASSANDRA-8110) is helpful in other scenarios, 
it wouldn't help for that use case.

> Add downgradesstables
> -
>
> Key: CASSANDRA-8928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8928
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jeremy Hanna
>Assignee: Kaide Mu
>Priority: Minor
>  Labels: gsoc2016, mentor
>
> As mentioned in other places such as CASSANDRA-8047 and in the wild, 
> sometimes you need to go back.  A downgrade sstables utility would be nice 
> for a lot of reasons and I don't know that supporting going back to the 
> previous major version format would be too much code since we already support 
> reading the previous version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11933) Improve Repair performance


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328324#comment-15328324
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

Sorry for the delay, I was away for a few days, I will setup this shortly and 
post back here.

> Improve Repair performance
> --
>
> Key: CASSANDRA-11933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Cyril Scetbon
>Assignee: Mahdi Mohammadi
>
> During  a full repair on a ~ 60 nodes cluster, I've been able to see that 
> this stage can be significant (up to 60 percent of the whole time) :
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's merely caused by the fact that 
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
>  calls {code}ss.getLocalRanges(keyspaceName){code} everytime and that it 
> takes more than 99% of the time. This call takes 600ms when there is no load 
> on the cluster and more if there is. So for 10k ranges, you can imagine that 
> it takes at least 1.5 hours just to compute ranges. 
> Underneath it calls 
> [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
>  which can get pretty inefficient ([~jbellis]'s 
> [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165])
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend 
> hours on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8928) Add downgradesstables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328321#comment-15328321
 ] 

Paulo Motta commented on CASSANDRA-8928:


After digging into this for a while we realized it may be too much effort to 
support writing full-fledged sstables in previous formats in a dependable way, 
specially after the large scale changes introduced by CASSANDRA-8099. This is 
due to the fact that we not only have to translate the data component, which is 
already a large effort per se, but also make sure other components such as 
indexes, samples, stats, etc, are downgraded correctly, which means a large 
bunch of legacy code that needs to be kept around until it becomes unsupported.

We initially thought this could easily enable CASSANDRA-8110 but it can 
actually be seen as the opposite: once we make streaming backward compatible, 
downgrading is just a matter of sstableloading the new-format sstables in a 
previous-version node, which will already make the sstable be rewritten in the 
old format correctly, since all the components are rewritten during streaming. 
Of course there would be some manual juggling needed to restore the node with 
the correct tokens and schema, but we can probably add some kind of 
recovery/downgrade mode that would allow an operator to reload the data with 
sstableloader before the node becomes available. Maybe this could be made 
easier after CASSANDRA-6038 and/or CASSANDRA-9587.

So my take is that we should pursue CASSANDRA-8110 first, since that seems much 
more attainable, and rethink this later, maybe via recovery mode + 
sstableloading combo as suggested previously. Since streaming basically 
transfer raw sstable partitions over the wire, downgraded stream support is 
just a matter of translating partitions on-the-fly to the previous data format, 
without the need to support downgrading other components extensively as 
required here. With this said, I will attach a design document to 
CASSANDRA-8110 shortly.

> Add downgradesstables
> -
>
> Key: CASSANDRA-8928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8928
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jeremy Hanna
>Assignee: Kaide Mu
>Priority: Minor
>  Labels: gsoc2016, mentor
>
> As mentioned in other places such as CASSANDRA-8047 and in the wild, 
> sometimes you need to go back.  A downgrade sstables utility would be nice 
> for a lot of reasons and I don't know that supporting going back to the 
> previous major version format would be too much code since we already support 
> reading the previous version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-13 Thread Mahdi Mohammadi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328318#comment-15328318
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

Can someone setup CI for this ticket? 

> Improve Repair performance
> --
>
> Key: CASSANDRA-11933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Cyril Scetbon
>Assignee: Mahdi Mohammadi
>
> During  a full repair on a ~ 60 nodes cluster, I've been able to see that 
> this stage can be significant (up to 60 percent of the whole time) :
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's merely caused by the fact that 
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
>  calls {code}ss.getLocalRanges(keyspaceName){code} everytime and that it 
> takes more than 99% of the time. This call takes 600ms when there is no load 
> on the cluster and more if there is. So for 10k ranges, you can imagine that 
> it takes at least 1.5 hours just to compute ranges. 
> Underneath it calls 
> [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
>  which can get pretty inefficient ([~jbellis]'s 
> [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165])
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend 
> hours on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328269#comment-15328269
 ] 

Chris Lohfink commented on CASSANDRA-11752:
---

for what its worth, if patching in ExponentiallyDecayingResvoir may want to 
consider https://bitbucket.org/marshallpierce/hdrhistogram-metrics-reservoir 
instead. the random sampling can lose 90th+ percentile pretty easily. If just 
want trending its fine though.

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Per Otterström
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2016-06-13 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328071#comment-15328071
 ] 

Jonathan Ellis commented on CASSANDRA-7937:
---

Here is where 9318 ended up:

bq. I found some ways to OOM the server (CASSANDRA-10971 and CASSANDRA-10972) 
and have patches out for those.

bq. The # of in flight requests already has bounds depending on the bottleneck 
that prevent the server from crashing so adding an explicit one isn't useful 
right now. When TPC is implemented we will have to implement a bound since 
there is no thread pool to exhaust, but that is later work.

I don't think adding more band-aids to SEDA is going to be useful.

> Apply backpressure gently when overloaded with writes
> -
>
> Key: CASSANDRA-7937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Cassandra 2.0
>Reporter: Piotr Kołaczkowski
>  Labels: performance
>
> When writing huge amounts of data into C* cluster from analytic tools like 
> Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
> This is because analytic tools typically write data "as fast as they can" in 
> parallel, from many nodes and they are not artificially rate-limited, so C* 
> is the bottleneck here. Also, increasing the number of nodes doesn't really 
> help, because in a collocated setup this also increases number of 
> Hadoop/Spark nodes (writers) and although possible write performance is 
> higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. the available memory limit for memtables is reached and writes are no 
> longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in 
> vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate 
> limiting to writes - the more the buffers are filled-up, the less writes/s 
> are accepted, however writes still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the 
> memory gets used
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* 
> core, but then C* needs to expose at least the following information to the 
> driver, so we could calculate the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely 
> block
> 2. total amount of data queued to be flushed and flush progress (amount of 
> data to flush remaining for the memtable currently being flushed)
> 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased

2016-06-13 Thread T Jake Luciani (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328050#comment-15328050
 ] 

T Jake Luciani commented on CASSANDRA-11752:


[~eperott] I assigned you to it thanks!

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Per Otterström
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11380) Client visible backpressure mechanism


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328048#comment-15328048
 ] 

Wei Deng commented on CASSANDRA-11380:
--

Added a link to CASSANDRA-7937 as there are quite a bit of discussions from the 
dev team on this issue (and many of them are worth reading to understand what 
people have considered). As long as this general problem is still on people's 
radar, I'm ok to close this one as duplicate (assuming 7937 can be re-opened).

> Client visible backpressure mechanism
> -
>
> Key: CASSANDRA-11380
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11380
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination
>Reporter: Wei Deng
>
> Cassandra currently lacks a sophisticated back pressure mechanism to prevent 
> clients ingesting data at too high throughput. One of the reasons why it 
> hasn't done so is because of its SEDA (Staged Event Driven Architecture) 
> design. With SEDA, an overloaded thread pool can drop those droppable 
> messages (in this case, MutationStage can drop mutation or counter mutation 
> messages) when they exceed the 2-second timeout. This can save the JVM from 
> running out of memory and crash. However, one downside from this kind of 
> load-shedding based backpressure approach is that increased number of dropped 
> mutations will increase the chance of inconsistency among replicas and will 
> likely require more repair (hints can help to some extent, but it's not 
> designed to cover all inconsistencies); another downside is that excessive 
> writes will also introduce much more pressure on compaction (especially LCS), 
>  and backlogged compaction will increase read latency and cause more frequent 
> GC pauses, and depending on the type of compaction, some backlog can take a 
> long time to clear up even after the write is removed. It seems that the 
> current load-shedding mechanism is not adequate to address a common bulk 
> loading scenario, where clients are trying to ingest data at highest 
> throughput possible. We need a more direct way to tell the client drivers to 
> slow down.
> It appears that HBase had suffered similar situation as discussed in 
> HBASE-5162, and they introduced some special exception type to tell the 
> client to slow down when a certain "overloaded" criteria is met. If we can 
> leverage a similar mechanism, our dropped mutation event can be used to 
> trigger such exceptions to push back on the client; at the same time, 
> backlogged compaction (when the number of pending compactions exceeds a 
> certain threshold) can also be used for the push back and this can prevent 
> vicious cycle mentioned in 
> https://issues.apache.org/jira/browse/CASSANDRA-11366?focusedCommentId=15198786=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15198786.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased

2016-06-13 Thread T Jake Luciani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11752:
---
Assignee: Per Otterström

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>Assignee: Per Otterström
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11380) Client visible backpressure mechanism


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Deng updated CASSANDRA-11380:
-
External issue URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 External issue ID:   (was: 7937)

> Client visible backpressure mechanism
> -
>
> Key: CASSANDRA-11380
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11380
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination
>Reporter: Wei Deng
>
> Cassandra currently lacks a sophisticated back pressure mechanism to prevent 
> clients ingesting data at too high throughput. One of the reasons why it 
> hasn't done so is because of its SEDA (Staged Event Driven Architecture) 
> design. With SEDA, an overloaded thread pool can drop those droppable 
> messages (in this case, MutationStage can drop mutation or counter mutation 
> messages) when they exceed the 2-second timeout. This can save the JVM from 
> running out of memory and crash. However, one downside from this kind of 
> load-shedding based backpressure approach is that increased number of dropped 
> mutations will increase the chance of inconsistency among replicas and will 
> likely require more repair (hints can help to some extent, but it's not 
> designed to cover all inconsistencies); another downside is that excessive 
> writes will also introduce much more pressure on compaction (especially LCS), 
>  and backlogged compaction will increase read latency and cause more frequent 
> GC pauses, and depending on the type of compaction, some backlog can take a 
> long time to clear up even after the write is removed. It seems that the 
> current load-shedding mechanism is not adequate to address a common bulk 
> loading scenario, where clients are trying to ingest data at highest 
> throughput possible. We need a more direct way to tell the client drivers to 
> slow down.
> It appears that HBase had suffered similar situation as discussed in 
> HBASE-5162, and they introduced some special exception type to tell the 
> client to slow down when a certain "overloaded" criteria is met. If we can 
> leverage a similar mechanism, our dropped mutation event can be used to 
> trigger such exceptions to push back on the client; at the same time, 
> backlogged compaction (when the number of pending compactions exceeds a 
> certain threshold) can also be used for the push back and this can prevent 
> vicious cycle mentioned in 
> https://issues.apache.org/jira/browse/CASSANDRA-11366?focusedCommentId=15198786=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15198786.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11380) Client visible backpressure mechanism


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Deng updated CASSANDRA-11380:
-
External issue ID: 7937

> Client visible backpressure mechanism
> -
>
> Key: CASSANDRA-11380
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11380
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination
>Reporter: Wei Deng
>
> Cassandra currently lacks a sophisticated back pressure mechanism to prevent 
> clients ingesting data at too high throughput. One of the reasons why it 
> hasn't done so is because of its SEDA (Staged Event Driven Architecture) 
> design. With SEDA, an overloaded thread pool can drop those droppable 
> messages (in this case, MutationStage can drop mutation or counter mutation 
> messages) when they exceed the 2-second timeout. This can save the JVM from 
> running out of memory and crash. However, one downside from this kind of 
> load-shedding based backpressure approach is that increased number of dropped 
> mutations will increase the chance of inconsistency among replicas and will 
> likely require more repair (hints can help to some extent, but it's not 
> designed to cover all inconsistencies); another downside is that excessive 
> writes will also introduce much more pressure on compaction (especially LCS), 
>  and backlogged compaction will increase read latency and cause more frequent 
> GC pauses, and depending on the type of compaction, some backlog can take a 
> long time to clear up even after the write is removed. It seems that the 
> current load-shedding mechanism is not adequate to address a common bulk 
> loading scenario, where clients are trying to ingest data at highest 
> throughput possible. We need a more direct way to tell the client drivers to 
> slow down.
> It appears that HBase had suffered similar situation as discussed in 
> HBASE-5162, and they introduced some special exception type to tell the 
> client to slow down when a certain "overloaded" criteria is met. If we can 
> leverage a similar mechanism, our dropped mutation event can be used to 
> trigger such exceptions to push back on the client; at the same time, 
> backlogged compaction (when the number of pending compactions exceeds a 
> certain threshold) can also be used for the push back and this can prevent 
> vicious cycle mentioned in 
> https://issues.apache.org/jira/browse/CASSANDRA-11366?focusedCommentId=15198786=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15198786.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-12001) nodetool stopdaemon doesn't stop cassandra gracefully

2016-06-13 Thread Anshu Vajpayee (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshu Vajpayee updated CASSANDRA-12001:
---
Issue Type: Bug  (was: Improvement)

> nodetool stopdaemon  doesn't  stop cassandra gracefully 
> 
>
> Key: CASSANDRA-12001
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12001
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Ubuntu: Linux  3.11.0-15-generic #25~precise1-Ubuntu SMP 
> Thu Jan 30 17:39:31 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> Cassandra Version : 
> cassandra -v
> 2.1.2
>Reporter: Anshu Vajpayee
>Priority: Minor
>
> As per general opinion, nodetool stopdaemon should perform graceful shutdown 
> rater than crash killing of cassandra daemon .
> It  doesn't flush the memtables and also it doesn't stop the thrift and CQL 
> connection interfaces before crashing/stopping  the node.  It directly calls 
> SIGTERM on process as simple as kill -15/ctrl + c. 
>  
> 1. created a table  like as below:
> cqlsh:test_ks> create table t2(id1 int, id2 text, primary key(id1));
> cqlsh:test_ks> 
> cqlsh:test_ks> insert into t2(id1,id2) values (1,'a');
> cqlsh:test_ks> insert into t2(id1,id2) values (2,'a');
> cqlsh:test_ks> insert into t2(id1,id2) values (3,'a');
> cqlsh:test_ks> select * from t2;
>  id1 | id2
> -+-
>1 |   a
>2 |   a
>3 |   a
> 2.Flush  the memtable manually using nodetool flush
> student@cascor:~/node1/apache-cassandra-2.1.2/bin$ nodetool flush
> student@cascor:~/node1/apache-cassandra-2.1.2/bin$ cd 
> ../data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d/
> student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
>  ls -ltr 
> total 36
> -rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
> -rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
> -rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
> -rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
> -rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
> -rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
> -rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
> -rw-rw-r-- 1 student student   43 Jun 13 12:14 
> test_ks-t2-ka-1-CompressionInfo.db
> 3. Make few more changes on table t2
> cqlsh:test_ks> insert into t2(id1,id2) values (5,'a');
> cqlsh:test_ks> insert into t2(id1,id2) values (6,'a');
> cqlsh:test_ks> insert into t2(id1,id2) values (7,'a');
> cqlsh:test_ks> insert into t2(id1,id2) values (8,'a');
> cqlsh:test_ks> select * from t2;
>  id1 | id2
> -+-
>5 |   a
>1 |   a
>8 |   a
>2 |   a
>7 |   a
>6 |   a
>3 |   a
> 4. Stopping the node using nodetool stopdaemon 
> student@cascor:~$ nodetool stopdaemon
> Cassandra has shutdown.
> error: Connection refused
> -- StackTrace --
> java.net.ConnectException: Connection refused
> 5. No new version of SStables . Reason stopdaemon doesn't run nodetool 
> flush/drain before actually stopping daemon.
> student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
>  ls -ltr
> total 36
> -rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
> -rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
> -rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
> -rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
> -rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
> -rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
> -rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
> -rw-rw-r-- 1 student student   43 Jun 13 12:14 
> test_ks-t2-ka-1-CompressionInfo.db
> student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased

2016-06-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328026#comment-15328026
 ] 

Per Otterström commented on CASSANDRA-11752:


Yes, I would like to give it a try.

Had a talk with [~cnlwsu] about it on NGCC. The idea is to create a separate 
DecayingEH implementation based on forward decay.

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-12001) nodetool stopdaemon doesn't stop cassandra gracefully

2016-06-13 Thread Anshu Vajpayee (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshu Vajpayee updated CASSANDRA-12001:
---
Description: 
As per general opinion, nodetool stopdaemon should perform graceful shutdown 
rater than crash killing of cassandra daemon .
It  doesn't flush the memtables and also it doesn't stop the thrift and CQL 
connection interfaces before crashing/stopping  the node.  It directly calls 
SIGTERM on process as simple as kill -15/ctrl + c. 

 

1. created a table  like as below:

cqlsh:test_ks> create table t2(id1 int, id2 text, primary key(id1));
cqlsh:test_ks> 
cqlsh:test_ks> insert into t2(id1,id2) values (1,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (2,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (3,'a');
cqlsh:test_ks> select * from t2;

 id1 | id2
-+-
   1 |   a
   2 |   a
   3 |   a

2.Flush  the memtable manually using nodetool flush

student@cascor:~/node1/apache-cassandra-2.1.2/bin$ nodetool flush
student@cascor:~/node1/apache-cassandra-2.1.2/bin$ cd 
../data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d/
student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 ls -ltr 
total 36
-rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
-rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
-rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
-rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
-rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
-rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
-rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
-rw-rw-r-- 1 student student   43 Jun 13 12:14 
test_ks-t2-ka-1-CompressionInfo.db

3. Make few more changes on table t2

cqlsh:test_ks> insert into t2(id1,id2) values (5,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (6,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (7,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (8,'a');
cqlsh:test_ks> select * from t2;

 id1 | id2
-+-
   5 |   a
   1 |   a
   8 |   a
   2 |   a
   7 |   a
   6 |   a
   3 |   a

4. Stopping the node using nodetool stopdaemon 
student@cascor:~$ nodetool stopdaemon
Cassandra has shutdown.
error: Connection refused
-- StackTrace --
java.net.ConnectException: Connection refused

5. No new version of SStables . Reason stopdaemon doesn't run nodetool 
flush/drain before actually stopping daemon.

student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 ls -ltr
total 36
-rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
-rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
-rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
-rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
-rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
-rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
-rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
-rw-rw-r-- 1 student student   43 Jun 13 12:14 
test_ks-t2-ka-1-CompressionInfo.db
student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 









  was:
As per general opinion, nodetool stopdaemon should perform graceful shutdown 
rater than crash killing of cassandra daemon (like iwth kill -9).
But It  doesn't flush the memtables and also it doesn't stop the thrift and CQL 
connection before crashing/stopping  the node.  It directly calls SIGTERM on 
process as simple as kill -15. 

Testing : 

created a table  like as below:

cqlsh:test_ks> create table t2(id1 int, id2 text, primary key(id1));
cqlsh:test_ks> 
cqlsh:test_ks> insert into t2(id1,id2) values (1,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (2,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (3,'a');
cqlsh:test_ks> select * from t2;

 id1 | id2
-+-
   1 |   a
   2 |   a
   3 |   a

Flush  the memtable manually using nodetool flush

student@cascor:~/node1/apache-cassandra-2.1.2/bin$ nodetool flush
student@cascor:~/node1/apache-cassandra-2.1.2/bin$ cd 
../data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d/
student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 ls -ltr 
total 36
-rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
-rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
-rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
-rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
-rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
-rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
-rw-rw-r-- 1 student student   10 Jun 13 12:14

[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328023#comment-15328023
 ] 

Wei Deng commented on CASSANDRA-7937:
-

[~jbellis] Since CASSANDRA-9318 turned out to be an ineffective approach (and 
closed as WONT-FIX), should we reopen this JIRA?

> Apply backpressure gently when overloaded with writes
> -
>
> Key: CASSANDRA-7937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Cassandra 2.0
>Reporter: Piotr Kołaczkowski
>  Labels: performance
>
> When writing huge amounts of data into C* cluster from analytic tools like 
> Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
> This is because analytic tools typically write data "as fast as they can" in 
> parallel, from many nodes and they are not artificially rate-limited, so C* 
> is the bottleneck here. Also, increasing the number of nodes doesn't really 
> help, because in a collocated setup this also increases number of 
> Hadoop/Spark nodes (writers) and although possible write performance is 
> higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. the available memory limit for memtables is reached and writes are no 
> longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in 
> vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue 
> fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate 
> limiting to writes - the more the buffers are filled-up, the less writes/s 
> are accepted, however writes still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the 
> memory gets used
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* 
> core, but then C* needs to expose at least the following information to the 
> driver, so we could calculate the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely 
> block
> 2. total amount of data queued to be flushed and flush progress (amount of 
> data to flush remaining for the memtable currently being flushed)
> 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Issue Comment Deleted] (CASSANDRA-10404) Node to Node encryption transitional mode

2016-06-13 Thread Cyril Scetbon (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cyril Scetbon updated CASSANDRA-10404:
--
Comment: was deleted

(was: [~jasobrown] ok, would be great to know if others have the bandwidth (I 
can't check it) or not to be able to plan it. )

> Node to Node encryption transitional mode
> -
>
> Key: CASSANDRA-10404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10404
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Tom Lewis
>Assignee: Jason Brown
>
> Create a transitional mode for encryption that allows encrypted and 
> unencrypted traffic node-to-node during a change over to encryption from 
> unencrypted. This alleviates downtime during the switch.
>  This is similar to https://issues.apache.org/jira/browse/CASSANDRA-8803 
> which is intended for client-to-node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-12001) nodetool stopdaemon doesn't stop cassandra gracefully

2016-06-13 Thread Anshu Vajpayee (JIRA)

Anshu Vajpayee created CASSANDRA-12001:
--

 Summary: nodetool stopdaemon  doesn't  stop cassandra gracefully 
 Key: CASSANDRA-12001
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12001
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Ubuntu: Linux  3.11.0-15-generic #25~precise1-Ubuntu SMP 
Thu Jan 30 17:39:31 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Cassandra Version : 
cassandra -v
2.1.2



Reporter: Anshu Vajpayee
Priority: Minor


As per general opinion, nodetool stopdaemon should perform graceful shutdown 
rater than crash killing of cassandra daemon (like iwth kill -9).
But It  doesn't flush the memtables and also it doesn't stop the thrift and CQL 
connection before crashing/stopping  the node.  It directly calls SIGTERM on 
process as simple as kill -15. 

Testing : 

created a table  like as below:

cqlsh:test_ks> create table t2(id1 int, id2 text, primary key(id1));
cqlsh:test_ks> 
cqlsh:test_ks> insert into t2(id1,id2) values (1,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (2,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (3,'a');
cqlsh:test_ks> select * from t2;

 id1 | id2
-+-
   1 |   a
   2 |   a
   3 |   a

Flush  the memtable manually using nodetool flush

student@cascor:~/node1/apache-cassandra-2.1.2/bin$ nodetool flush
student@cascor:~/node1/apache-cassandra-2.1.2/bin$ cd 
../data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d/
student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 ls -ltr 
total 36
-rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
-rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
-rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
-rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
-rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
-rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
-rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
-rw-rw-r-- 1 student student   43 Jun 13 12:14 
test_ks-t2-ka-1-CompressionInfo.db

Made few more changes on table t2

cqlsh:test_ks> insert into t2(id1,id2) values (5,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (6,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (7,'a');
cqlsh:test_ks> insert into t2(id1,id2) values (8,'a');
cqlsh:test_ks> select * from t2;

 id1 | id2
-+-
   5 |   a
   1 |   a
   8 |   a
   2 |   a
   7 |   a
   6 |   a
   3 |   a

Stopping the node using nodetool stopdaemon 
student@cascor:~$ nodetool stopdaemon
Cassandra has shutdown.
error: Connection refused
-- StackTrace --
java.net.ConnectException: Connection refused

No new version of SStables . Reason stopdaemon doesn't run nodetool flush/drain 
before actually stopping daemon.

student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 ls -ltr
total 36
-rw-rw-r-- 1 student student   16 Jun 13 12:14 test_ks-t2-ka-1-Filter.db
-rw-rw-r-- 1 student student   54 Jun 13 12:14 test_ks-t2-ka-1-Index.db
-rw-rw-r-- 1 student student   93 Jun 13 12:14 test_ks-t2-ka-1-Data.db
-rw-rw-r-- 1 student student   91 Jun 13 12:14 test_ks-t2-ka-1-TOC.txt
-rw-rw-r-- 1 student student   80 Jun 13 12:14 test_ks-t2-ka-1-Summary.db
-rw-rw-r-- 1 student student 4442 Jun 13 12:14 test_ks-t2-ka-1-Statistics.db
-rw-rw-r-- 1 student student   10 Jun 13 12:14 test_ks-t2-ka-1-Digest.sha1
-rw-rw-r-- 1 student student   43 Jun 13 12:14 
test_ks-t2-ka-1-CompressionInfo.db
student@cascor:~/node1/apache-cassandra-2.1.2/data/data/test_ks/t2-a671f6b0319a11e6a91ae3263299699d$
 











--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (CASSANDRA-7544) Allow storage port to be configurable per node


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli reassigned CASSANDRA-7544:


Assignee: sankalp kohli  (was: Sam Overton)

> Allow storage port to be configurable per node
> --
>
> Key: CASSANDRA-7544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7544
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Overton
>Assignee: sankalp kohli
> Fix For: 3.x
>
>
> Currently storage_port must be configured identically on all nodes in a 
> cluster and it is assumed that this is the case when connecting to a remote 
> node.
> This prevents running in any environment that requires multiple nodes to be 
> able to bind to the same network interface, such as with many automatic 
> provisioning/deployment frameworks.
> The current solutions seems to be
> * use a separate network interface for each node deployed to the same box. 
> This puts a big requirement on IP allocation at large scale.
> * allow multiple clusters to be provisioned from the same resource pool, but 
> restrict allocation to a maximum of one node per host from each cluster, 
> assuming each cluster is running on a different storage port.
> It would make operations much simpler in these kind of environments if the 
> environment provisioning the resources could assign the ports to be used when 
> bringing up a new node on shared hardware.
> The changes required would be at least the following:
> 1. configure seeds as IP:port instead of just IP
> 2. gossip the storage port as part of a node's ApplicationState
> 3. refer internally to nodes by hostID instead of IP, since there will be 
> multiple nodes with the same IP
> (1) & (2) are mostly trivial and I already have a patch for these. The bulk 
> of the work to enable this is (3), and I would structure this as a separate 
> pre-requisite patch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7544) Allow storage port to be configurable per node


[ 
https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327996#comment-15327996
 ] 

sankalp kohli commented on CASSANDRA-7544:
--

cc [~pmcfadin]

> Allow storage port to be configurable per node
> --
>
> Key: CASSANDRA-7544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7544
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Overton
>Assignee: Sam Overton
> Fix For: 3.x
>
>
> Currently storage_port must be configured identically on all nodes in a 
> cluster and it is assumed that this is the case when connecting to a remote 
> node.
> This prevents running in any environment that requires multiple nodes to be 
> able to bind to the same network interface, such as with many automatic 
> provisioning/deployment frameworks.
> The current solutions seems to be
> * use a separate network interface for each node deployed to the same box. 
> This puts a big requirement on IP allocation at large scale.
> * allow multiple clusters to be provisioned from the same resource pool, but 
> restrict allocation to a maximum of one node per host from each cluster, 
> assuming each cluster is running on a different storage port.
> It would make operations much simpler in these kind of environments if the 
> environment provisioning the resources could assign the ports to be used when 
> bringing up a new node on shared hardware.
> The changes required would be at least the following:
> 1. configure seeds as IP:port instead of just IP
> 2. gossip the storage port as part of a node's ApplicationState
> 3. refer internally to nodes by hostID instead of IP, since there will be 
> multiple nodes with the same IP
> (1) & (2) are mostly trivial and I already have a patch for these. The bulk 
> of the work to enable this is (3), and I would structure this as a separate 
> pre-requisite patch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-7544) Allow storage port to be configurable per node


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-7544:
-
Assignee: (was: sankalp kohli)

> Allow storage port to be configurable per node
> --
>
> Key: CASSANDRA-7544
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7544
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Overton
> Fix For: 3.x
>
>
> Currently storage_port must be configured identically on all nodes in a 
> cluster and it is assumed that this is the case when connecting to a remote 
> node.
> This prevents running in any environment that requires multiple nodes to be 
> able to bind to the same network interface, such as with many automatic 
> provisioning/deployment frameworks.
> The current solutions seems to be
> * use a separate network interface for each node deployed to the same box. 
> This puts a big requirement on IP allocation at large scale.
> * allow multiple clusters to be provisioned from the same resource pool, but 
> restrict allocation to a maximum of one node per host from each cluster, 
> assuming each cluster is running on a different storage port.
> It would make operations much simpler in these kind of environments if the 
> environment provisioning the resources could assign the ports to be used when 
> bringing up a new node on shared hardware.
> The changes required would be at least the following:
> 1. configure seeds as IP:port instead of just IP
> 2. gossip the storage port as part of a node's ApplicationState
> 3. refer internally to nodes by hostID instead of IP, since there will be 
> multiple nodes with the same IP
> (1) & (2) are mostly trivial and I already have a patch for these. The bulk 
> of the work to enable this is (3), and I would structure this as a separate 
> pre-requisite patch. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased

2016-06-13 Thread T Jake Luciani (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327901#comment-15327901
 ] 

T Jake Luciani commented on CASSANDRA-11752:


[~eperott] are you working on this? 

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased

2016-06-13 Thread Eric Evans (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327891#comment-15327891
 ] 

Eric Evans commented on CASSANDRA-11752:


Just adding a Me Too to the list of those looking for a solution to this; For 
those limited to something like Grafana/Graphite to plot these percentiles, the 
status quo is pretty unhelpful, (I was forced to patch my production systems to 
use {{com.codahale.metrics.ExponentiallyDecayingResvoir}} in the interim).

> histograms/metrics in 2.2 do not appear recency biased
> --
>
> Key: CASSANDRA-11752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Chris Burroughs
>  Labels: metrics
> Attachments: boost-metrics.png, c-jconsole-comparison.png, 
> c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using  a 
> custom histogram implementation.  After upgrading to Cassandra 2.2 
> histograms/timer metrics are not suspiciously flat.  To be useful for 
> graphing and alerting metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) 
> cluster shoring very flat lines and a client (using metrics 2.2.0, ms) 
> showing server performance problems.  We can't rule out with total certainty 
> that something else isn't the cause (that's why we measure from both the 
> client & server) but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 
> cluster over several minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

svn commit: r13983 - in /release/cassandra: 3.0.7/ 3.5/ 3.7/ debian/dists/30x/ debian/dists/30x/main/binary-amd64/ debian/dists/30x/main/binary-i386/ debian/dists/30x/main/source/ debian/dists/37x/ de

Author: jake
Date: Mon Jun 13 18:30:39 2016
New Revision: 13983

Log:
3.7 and 3.0.7

Added:
release/cassandra/3.0.7/
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz   (with props)
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc.md5
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc.sha1
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.md5
release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.sha1
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz   (with props)
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz.asc
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz.asc.md5
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz.asc.sha1
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz.md5
release/cassandra/3.0.7/apache-cassandra-3.0.7-src.tar.gz.sha1
release/cassandra/3.7/
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz   (with props)
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.asc
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.asc.md5
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.asc.sha1
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.md5
release/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.sha1
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz   (with props)
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz.asc
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz.asc.md5
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz.asc.sha1
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz.md5
release/cassandra/3.7/apache-cassandra-3.7-src.tar.gz.sha1
release/cassandra/debian/dists/37x/
release/cassandra/debian/dists/37x/InRelease
release/cassandra/debian/dists/37x/Release
release/cassandra/debian/dists/37x/Release.gpg
release/cassandra/debian/dists/37x/main/
release/cassandra/debian/dists/37x/main/binary-amd64/
release/cassandra/debian/dists/37x/main/binary-amd64/Packages
release/cassandra/debian/dists/37x/main/binary-amd64/Packages.gz   (with 
props)
release/cassandra/debian/dists/37x/main/binary-amd64/Release
release/cassandra/debian/dists/37x/main/binary-i386/
release/cassandra/debian/dists/37x/main/binary-i386/Packages
release/cassandra/debian/dists/37x/main/binary-i386/Packages.gz   (with 
props)
release/cassandra/debian/dists/37x/main/binary-i386/Release
release/cassandra/debian/dists/37x/main/source/
release/cassandra/debian/dists/37x/main/source/Release
release/cassandra/debian/dists/37x/main/source/Sources.gz   (with props)

release/cassandra/debian/pool/main/c/cassandra/cassandra-tools_3.0.7_all.deb   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra-tools_3.7_all.deb  
 (with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.0.7.diff.gz   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.0.7.dsc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.0.7.orig.tar.gz  
 (with props)

release/cassandra/debian/pool/main/c/cassandra/cassandra_3.0.7.orig.tar.gz.asc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.0.7_all.deb   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.7.diff.gz   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.7.dsc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.7.orig.tar.gz   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.7.orig.tar.gz.asc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.7_all.deb   
(with props)
Removed:
release/cassandra/3.5/
Modified:
release/cassandra/debian/dists/30x/InRelease
release/cassandra/debian/dists/30x/Release
release/cassandra/debian/dists/30x/Release.gpg
release/cassandra/debian/dists/30x/main/binary-amd64/Packages
release/cassandra/debian/dists/30x/main/binary-amd64/Packages.gz
release/cassandra/debian/dists/30x/main/binary-i386/Packages
release/cassandra/debian/dists/30x/main/binary-i386/Packages.gz
release/cassandra/debian/dists/30x/main/source/Sources.gz

Added: release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz
==
Binary file - no diff available.

Propchange: release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz
--
svn:mime-type = application/octet-stream

Added: release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc
==
--- release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc (added)
+++ release/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz.asc Mon Jun 13 
18:30:39 2016
@@ -0,0

[cassandra] Git Push Summary

Repository: cassandra
Updated Tags:  refs/tags/3.7-tentative [deleted] 6815dc970

[cassandra] Git Push Summary

Repository: cassandra
Updated Tags:  refs/tags/cassandra-3.7 [created] 12660c4b9

[cassandra] Git Push Summary

Repository: cassandra
Updated Tags:  refs/tags/3.0.7-tentative [deleted] 040ac666a

[cassandra] Git Push Summary

Repository: cassandra
Updated Tags:  refs/tags/cassandra-3.0.7 [created] 49b561ae6

[jira] [Commented] (CASSANDRA-11963) Paged queries limited to Integer.MAX_VALUE total rows

2016-06-13 Thread Alex Baeza (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327712#comment-15327712
 ] 

Alex Baeza commented on CASSANDRA-11963:


Thank you for looking into this. Since we often use unlimited queries to do 
custom ETL in Cassandra 2, this would be a great help.

> Paged queries limited to Integer.MAX_VALUE total rows
> -
>
> Key: CASSANDRA-11963
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11963
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Holmberg
>Priority: Minor
> Fix For: 2.1.x, 2.2.x
>
>
> Paged queries are artificially limited to Integer.MAX_INT rows in total. This 
> appears to be related to PagingState.remaining, which decrements 
> monotonically as pages are consumed. 
> I don't think this is intentional behavior, and haven't found any mention of 
> it in the docs.
> Issue observed in latest 2.1 and 2.2 releases. Does not occur in 3.x



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-12000) dtest failure in thrift_tests.TestMutations.test_get_range_slice_after_deletion

2016-06-13 Thread Craig Kodman (JIRA)

Craig Kodman created CASSANDRA-12000:


 Summary: dtest failure in 
thrift_tests.TestMutations.test_get_range_slice_after_deletion
 Key: CASSANDRA-12000
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12000
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-3.0_dtest_win32/253/testReport/thrift_tests/TestMutations/test_get_range_slice_after_deletion

Failed on CassCI build cassandra-3.0_dtest_win32 #253



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-11999) dtest failure in cqlsh_tests.cqlsh_tests.TestCqlsh.test_refresh_schema_on_timeout_error

2016-06-13 Thread Craig Kodman (JIRA)

Craig Kodman created CASSANDRA-11999:


 Summary: dtest failure in 
cqlsh_tests.cqlsh_tests.TestCqlsh.test_refresh_schema_on_timeout_error
 Key: CASSANDRA-11999
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11999
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-3.0_dtest/745/testReport/cqlsh_tests.cqlsh_tests/TestCqlsh/test_refresh_schema_on_timeout_error

Failed on CassCI build cassandra-3.0_dtest #745



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (CASSANDRA-11960) Hints are not seekable

2016-06-13 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski reassigned CASSANDRA-11960:
--

Assignee: Stefan Podkowinski

> Hints are not seekable
> --
>
> Key: CASSANDRA-11960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11960
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Stefan Podkowinski
>
> Got the following error message on trunk. No idea how to reproduce. But the 
> only thing the (not overridden) seek method does is throwing this exception.
> {code}
> ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - 
> Exception in thread Thread[HintsDispatcher:2,1,main]
> java.lang.UnsupportedOperationException: Hints are not seekable.
>   at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-11998) dtest failure in offline_tools_test.TestOfflineTools.sstableofflinerelevel_test

2016-06-13 Thread Craig Kodman (JIRA)

Craig Kodman created CASSANDRA-11998:


 Summary: dtest failure in 
offline_tools_test.TestOfflineTools.sstableofflinerelevel_test
 Key: CASSANDRA-11998
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11998
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-2.2_dtest/635/testReport/offline_tools_test/TestOfflineTools/sstableofflinerelevel_test

Failed on CassCI build cassandra-2.2_dtest #635



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11997) Add a STCS compaction subproperty for DESC order bucketing

2016-06-13 Thread Vassil Lunchev (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vassil Lunchev updated CASSANDRA-11997:
---
Description: 
Looking at SizeTieredCompactionStrategy.java -> getBuckets().

This method is the only one using 3 of the 10 subproperties of STCS. It buckets 
the files by sorting them ASC and then grouping them using bucket_high and 
min_sstable_size.

getBuckets() practically doesn't use bucket_low at all. As long as it is 
between 0 and 1, the result doesn't depend on bucket_low. For example:

{code:java}
  public static void main(String[] args) {
List> files = new ArrayList<>();
files.add(new Pair<>("10.1G", 10944793422l));
files.add(new Pair<>("9.4G", 10056333820l));
files.add(new Pair<>("8.7G", 9266612562l));
files.add(new Pair<>("4.0G", 4254518390l));
files.add(new Pair<>("3.5G", 3729627496l));
files.add(new Pair<>("2.5G", 2587912419l));
files.add(new Pair<>("2.2G", 2304124647l));
files.add(new Pair<>("1.4G", 1485000127l));
files.add(new Pair<>("1.3G", 1340382610l));
files.add(new Pair<>("456M", 477906537l));
files.add(new Pair<>("451M", 472012692l));
files.add(new Pair<>("53M", 54968524l));
files.add(new Pair<>("18M", 18447540l));
List buckets = getBuckets(files, 1.5, 0.5, 50l*1024*1024);
System.out.println(buckets);
  }
{code}

The result is:
{code}
[[451M, 456M], [8.7G, 9.4G, 10.1G], [53M], [1.3G, 1.4G], [18M], [3.5G, 4.0G], 
[2.2G, 2.5G]]
{code}

You can test it with any value for bucketLow between 0 and 1, the result will 
be the same. And it contains no buckets that can be compacted.

However, if you reverse the initial sorting order to DESC (look at the files 
from largest to smallest) you get a completely different bucketing:

{code:java}
  return p2.right.compareTo(p1.right);
{code} 

{code:txt}
  [[456M, 451M], [4.0G, 3.5G, 2.5G, 2.2G], [10.1G, 9.4G, 8.7G], [53M], [1.4G, 
1.3G], [18M]]
{code}

Now there is a bucket that can be compacted: [4.0G, 3.5G, 2.5G, 2.2G]
After that compaction, there will be one more bucket that can be compacted: 
[10.1G, 9.4G, 8.7G, GB]

The sizes given here are real values, from a production load Cassandra 
deployment. We would like to have an aggressive STCS compaction that compacts 
as soon as reasonably possible. (I know about LCS, let's not include it in this 
ticket). However since the ordering in getBuckets is ASC, we cannot do much 
with configuration parameters. Specifically, using min_threshold = 3 is not 
helping - it all boils down to the ordering.

Probably bucket_high = 2 is an option, but then why does Cassandra offer a 
property that doesn't change anything (with a fixed ASC ordering, bucket_low is 
literally useless)

I would like to have the ability to configure DESC ordering. My suggestion is 
to add a new compaction subproperty for STCS, for example named 
bucket_iteration_order, which has ASC by default for backward compatibility, 
but it can be switched to DESC if an aggressive ordering is required.

  was:
Looking at SizeTieredCompactionStrategy.java -> getBuckets().

This method is the only one using 3 of the 10 subproperties of STCS. It buckets 
the files by sorting them ASC and then grouping them using bucket_high and 
min_sstable_size.

getBuckets() practically doesn't use bucket_low at all. As long as it is 
between 0 and 1, the result doesn't depend on bucket_low. For example:

{code:java}
  public static void main(String[] args) {
List> files = new ArrayList<>();
files.add(new Pair<>("10.1G", 10944793422l));
files.add(new Pair<>("9.4G", 10056333820l));
files.add(new Pair<>("8.7G", 9266612562l));
files.add(new Pair<>("4.0G", 4254518390l));
files.add(new Pair<>("3.5G", 3729627496l));
files.add(new Pair<>("2.5G", 2587912419l));
files.add(new Pair<>("2.2G", 2304124647l));
files.add(new Pair<>("1.4G", 1485000127l));
files.add(new Pair<>("1.3G", 1340382610l));
files.add(new Pair<>("456M", 477906537l));
files.add(new Pair<>("451M", 472012692l));
files.add(new Pair<>("53M", 54968524l));
files.add(new Pair<>("18M", 18447540l));
List buckets = getBuckets(files, 1.5, 0.5, 50l*1024*1024);
System.out.println(buckets);
  }
{code}

The result is:
{code}
[[451M, 456M], [8.7G, 9.4G, 10.1G], [53M], [1.3G, 1.4G], [18M], [3.5G, 4.0G], 
[2.2G, 2.5G]]
{code}

You can test it with any value for bucketLow between 0 and 1, the result will 
be the same. And it contains no buckets that can be compacted.

However, if you reverse the initial sorting order to DESC (look at the files 
from largest to smallest) you get a completely different bucketing:

{code:java}
  return p2.right.compareTo(p1.right);
{code} 

{code:txt}
  [[456M, 451M], [4.0G, 3.5G, 2.5G, 2.2G], [10.1G, 9.4G, 8.7G], [53M], [1.4G, 
1.3G], [18M]]
{code}

Now there is a bucket that can be compacted: [4.0G, 3.5G,

[jira] [Created] (CASSANDRA-11997) Add a STCS compaction subproperty for DESC order bucketing

2016-06-13 Thread Vassil Lunchev (JIRA)

Vassil Lunchev created CASSANDRA-11997:
--

 Summary: Add a STCS compaction subproperty for DESC order bucketing
 Key: CASSANDRA-11997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11997
 Project: Cassandra
  Issue Type: Improvement
  Components: Compaction
Reporter: Vassil Lunchev


Looking at SizeTieredCompactionStrategy.java -> getBuckets().

This method is the only one using 3 of the 10 subproperties of STCS. It buckets 
the files by sorting them ASC and then grouping them using bucket_high and 
min_sstable_size.

getBuckets() practically doesn't use bucket_low at all. As long as it is 
between 0 and 1, the result doesn't depend on bucket_low. For example:

{code:java}
  public static void main(String[] args) {
List> files = new ArrayList<>();
files.add(new Pair<>("10.1G", 10944793422l));
files.add(new Pair<>("9.4G", 10056333820l));
files.add(new Pair<>("8.7G", 9266612562l));
files.add(new Pair<>("4.0G", 4254518390l));
files.add(new Pair<>("3.5G", 3729627496l));
files.add(new Pair<>("2.5G", 2587912419l));
files.add(new Pair<>("2.2G", 2304124647l));
files.add(new Pair<>("1.4G", 1485000127l));
files.add(new Pair<>("1.3G", 1340382610l));
files.add(new Pair<>("456M", 477906537l));
files.add(new Pair<>("451M", 472012692l));
files.add(new Pair<>("53M", 54968524l));
files.add(new Pair<>("18M", 18447540l));
List buckets = getBuckets(files, 1.5, 0.5, 50l*1024*1024);
System.out.println(buckets);
  }
{code}

The result is:
{code}
[[451M, 456M], [8.7G, 9.4G, 10.1G], [53M], [1.3G, 1.4G], [18M], [3.5G, 4.0G], 
[2.2G, 2.5G]]
{code}

You can test it with any value for bucketLow between 0 and 1, the result will 
be the same. And it contains no buckets that can be compacted.

However, if you reverse the initial sorting order to DESC (look at the files 
from largest to smallest) you get a completely different bucketing:

{code:java}
  return p2.right.compareTo(p1.right);
{code} 

{code:txt}
  [[456M, 451M], [4.0G, 3.5G, 2.5G, 2.2G], [10.1G, 9.4G, 8.7G], [53M], [1.4G, 
1.3G], [18M]]
{code}

Now there is a bucket that can be compacted: [4.0G, 3.5G, 2.5G, 2.2G]
After that compaction, there will be one more bucket that can be compacted: 
[10.1G, 9.4G, 8.7G, GB]

The sizes given here are real values, from a production load Cassandra 
deployment. We would like to have an aggressive STCS compaction that compacts 
as soon as reasonably possible. (I know about LCS, let's not include it in this 
ticket). However since the ordering in getBuckets is ASC, we cannot do much 
with configuration parameters. Specifically, using min_threshold = 3 is not 
helping - it all boils down to the ordering.

Probably bucket_high = 2 is an option, but then why does Cassandra offer a 
property that doesn't change anything (with a fixed ASC ordering, bucket_low is 
literally useless)

I would like to have the ability to configure DESC ordering. My suggestion is 
to add a new compaction subproperty for STCS, for example named 
bucket_iteration_order, which has ASC by default for backward compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo

2016-06-13 Thread Sylvain Lebresne (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327521#comment-15327521
 ] 

Sylvain Lebresne commented on CASSANDRA-8700:
-

I had a look at what could be good options here. I'm sure I've missed some but 
some reasonable (as in, good without being too complex) options seems to be:
* [textile|http://redcloth.org/textile]: I mention it _only_  because it's 
currently used for the CQL doc, but having written a fair amount of said doc, I 
hate it and am against continuing to use it (the top reason for my hatred being 
that new lines in the input are mirorred in the output which is just extremely 
annoying).
* [sphinx|http://www.sphinx-doc.org/]: Was mentioned by a few people already 
and it is a great option. It's reasonably simple while being fairly complete 
and maintained. The main downside seems to be that reStructuredText is not 
always as pleasant to work with than say markdown (probably debatable but 
internet seems to generally agree on that). That said, reStructuredText is not 
that bad either and it's arguably more capable than markdown for some things. 
Sphinx also has a bunch of extension, and while I'm not sure we'll need that 
much, some might come in handy someday.
* [MkDocs|http://www.mkdocs.org/]: A nice option in that it's simple but still 
produce decently looking docs (with navigation and search), and it uses 
markdown, which has the advantage of being more well known. It is however 
arguably less flexible than sphinx: in particular it doesn't seem to be able to 
easily produce pdfs, which would be nice to have.
* [asciidoc|http://asciidoc.org/]: Haven't looked at it _that_ close but it's 
definitively capable (it's use by hbase for [their 
book|https://hbase.apache.org/book.html] in particular, and it doesn't look 
bad) but sphinx seems to have strictly more advantages.
* [mdBook|http://azerupi.github.io/mdBook/]: uses mardown so not too disimilar 
to MkDocs, but slightly less capable (doesn't probide search for instance). 
Also made for the rust book so requires to install rust which is inconvenient 
(since we have no dependency on rust currently).
* There is also a bunch of tools to convert from one markup languages to html, 
amongst which the more general is probably [pandoc|http://pandoc.org/], but 
those are probably a bit too crude.

Anyway, I'm happy to check other options and debate which is best but it seems 
that sphinx (which has had some vote already) is a pretty good choice and so 
I've started setting it up (I'll try to commit a first WIP version of the 
outline plus the sections attached above tomorrow). If we decide that we prefer 
another option in the coming days, that's fine, I'll just convert the files 
(which is pretty simple with pandoc, so if you've already written something in 
markdown, feel free to attach it that way, I'll convert).


> replace the wiki with docs in the git repo
> --
>
> Key: CASSANDRA-8700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8700
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>Assignee: Sylvain Lebresne
>Priority: Minor
> Attachments: TombstonesAndGcGrace.md, bloom_filters.md, 
> compression.md, drivers_list.md, hardware.md, installation.md
>
>
> The wiki as it stands is pretty terrible.  It takes several minutes to apply 
> a single update, and as a result, it's almost never updated.  The information 
> there has very little context as to what version it applies to.  Most people 
> I've talked to that try to use the information they find there find it is 
> more confusing than helpful.
> I'd like to propose that instead of using the wiki, the doc directory in the 
> cassandra repo be used for docs (already used for CQL3 spec) in a format that 
> can be built to a variety of output formats like HTML / epub / etc.  I won't 
> start the bikeshedding on which markup format is preferable - but there are 
> several options that can work perfectly fine.  I've personally use sphinx w/ 
> restructured text, and markdown.  Both can build easily and as an added bonus 
> be pushed to readthedocs (or something similar) automatically.  For an 
> example, see cqlengine's documentation, which I think is already 
> significantly better than the wiki: 
> http://cqlengine.readthedocs.org/en/latest/
> In addition to being overall easier to maintain, putting the documentation in 
> the git repo adds context, since it evolves with the versions of Cassandra.
> If the wiki were kept even remotely up to date, I wouldn't bother with this, 
> but not having at least some basic documentation in the repo, or anywhere 
> associated with the project, is frustrating.
> For reference, the last 3 updates were:
> 1/15/15 -

[jira] [Commented] (CASSANDRA-11996) SSTableSet.CANONICAL can miss sstables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327505#comment-15327505
 ] 

Marcus Eriksson commented on CASSANDRA-11996:
-

note that the unit test will not fail every run, run it in a loop to reproduce

> SSTableSet.CANONICAL can miss sstables
> --
>
> Key: CASSANDRA-11996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11996
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> There is a race where we might miss sstables in SSTableSet.CANONICAL when we 
> finish up a compaction.
> Reproducing unit test pushed 
> [here|https://github.com/krummas/cassandra/commit/1292aaa61b89730cff0c022ed1262f45afd493e5]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11944) sstablesInBounds might not actually give all sstables within the bounds due to having start positions moved in sstables


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327442#comment-15327442
 ] 

Marcus Eriksson commented on CASSANDRA-11944:
-

pushed a new commit with method renames to the branch above, and triggered new 
cassci builds

> sstablesInBounds might not actually give all sstables within the bounds due 
> to having start positions moved in sstables
> ---
>
> Key: CASSANDRA-11944
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11944
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.x
>
>
> Same problem as with CASSANDRA-11886 - if we try to fetch sstablesInBounds 
> for CANONICAL_SSTABLES, we can miss some actually overlapping sstables. In 
> 3.0+ we state which SSTableSet we want when calling the method.
> Looks like the only issue this could cause is that we include a few too many 
> sstables in compactions that we think contain only droppable tombstones



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (CASSANDRA-11886) Streaming will miss sections for early opened sstables during compaction


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-11886:

   Resolution: Fixed
Fix Version/s: (was: 3.0.x)
   (was: 2.2.x)
   (was: 2.1.x)
   (was: 3.x)
   3.0.8
   3.8
   2.2.7
   2.1.15
   Status: Resolved  (was: Patch Available)

committed, thanks

found CASSANDRA-11996 while running the tests though, so this is not really 
fixed for 3.0+ yet

> Streaming will miss sections for early opened sstables during compaction
> 
>
> Key: CASSANDRA-11886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11886
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Marcus Eriksson
>Priority: Critical
>  Labels: correctness, repair, streaming
> Fix For: 2.1.15, 2.2.7, 3.8, 3.0.8
>
> Attachments: 9700-test-2_1.patch
>
>
> Once validation compaction has been finished, all mismatching sstable 
> sections for a token range will be used for streaming as return by 
> {{StreamSession.getSSTableSectionsForRanges}}. Currently 2.1 will try to 
> restrict the sstable candidates by checking if they can be found in 
> {{CANONICAL_SSTABLES}} and will ignore them otherwise. At the same time 
> {{IntervalTree}} in the {{DataTracker}} will be build based on replaced 
> non-canonical sstables as well. In case of early opened sstables this becomes 
> a problem, as the tree will be update with {{OpenReason.EARLY}} replacements 
> that cannot be found in canonical. But whenever 
> {{getSSTableSectionsForRanges}} will get a early instance from the view, it 
> will fail to retrieve the corresponding canonical version from the map, as 
> the different generation will cause a hashcode mismatch. Please find a test 
> attached.
> As a consequence not all sections for a range are streamed. In our case this 
> has caused deleted data to reappear, as sections holding tombstones were left 
> out due to this behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-11886) Streaming will miss sections for early opened sstables during compaction

2016-06-13 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327384#comment-15327384
 ] 

Stefan Podkowinski commented on CASSANDRA-11886:


Thanks both of you for responding so quickly and fixing the issue!

> Streaming will miss sections for early opened sstables during compaction
> 
>
> Key: CASSANDRA-11886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11886
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Marcus Eriksson
>Priority: Critical
>  Labels: correctness, repair, streaming
> Fix For: 2.1.15, 2.2.7, 3.8, 3.0.8
>
> Attachments: 9700-test-2_1.patch
>
>
> Once validation compaction has been finished, all mismatching sstable 
> sections for a token range will be used for streaming as return by 
> {{StreamSession.getSSTableSectionsForRanges}}. Currently 2.1 will try to 
> restrict the sstable candidates by checking if they can be found in 
> {{CANONICAL_SSTABLES}} and will ignore them otherwise. At the same time 
> {{IntervalTree}} in the {{DataTracker}} will be build based on replaced 
> non-canonical sstables as well. In case of early opened sstables this becomes 
> a problem, as the tree will be update with {{OpenReason.EARLY}} replacements 
> that cannot be found in canonical. But whenever 
> {{getSSTableSectionsForRanges}} will get a early instance from the view, it 
> will fail to retrieve the corresponding canonical version from the map, as 
> the different generation will cause a hashcode mismatch. Please find a test 
> attached.
> As a consequence not all sections for a range are streamed. In our case this 
> has caused deleted data to reappear, as sections holding tombstones were left 
> out due to this behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (CASSANDRA-11996) SSTableSet.CANONICAL can miss sstables

Marcus Eriksson created CASSANDRA-11996:
---

 Summary: SSTableSet.CANONICAL can miss sstables
 Key: CASSANDRA-11996
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11996
 Project: Cassandra
  Issue Type: Bug
Reporter: Marcus Eriksson
Priority: Critical
 Fix For: 3.0.x, 3.x


There is a race where we might miss sstables in SSTableSet.CANONICAL when we 
finish up a compaction.

Reproducing unit test pushed 
[here|https://github.com/krummas/cassandra/commit/1292aaa61b89730cff0c022ed1262f45afd493e5]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[03/10] cassandra git commit: Create interval tree over canonical sstables to avoid missing sstables during streaming

Create interval tree over canonical sstables to avoid missing sstables during 
streaming

patch by marcuse; reviewed by benedict for CASSANDRA-11886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72acbcd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72acbcd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72acbcd0

Branch: refs/heads/cassandra-3.0
Commit: 72acbcd00fe7c46e54cd267f42868531e99e39df
Parents: 68319f7
Author: Marcus Eriksson 
Authored: Wed May 25 08:38:14 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 14:31:47 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/DataTracker.java|  8 ++-
 .../cassandra/streaming/StreamSession.java  | 21 +++---
 .../io/sstable/SSTableRewriterTest.java | 72 
 5 files changed, 93 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index af641e1..ebcc90c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.15
+ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
  * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid corrupting 
SSL connections (CASSANDRA-11749)
  * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
  * cqlsh: apply current keyspace to source command (CASSANDRA-11152)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 166ce7e..559ba0b 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1531,6 +1531,10 @@ public class DatabaseDescriptor
 {
 return conf.sstable_preemptive_open_interval_in_mb;
 }
+public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+{
+conf.sstable_preemptive_open_interval_in_mb = mb;
+}
 
 public static boolean getTrickleFsync()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java 
b/src/java/org/apache/cassandra/db/DataTracker.java
index c731a35..927e717 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -32,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.dht.AbstractBounds;
+import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.io.sstable.IndexSummary;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.io.util.FileUtils;
@@ -810,9 +811,14 @@ public class DataTracker
 
 public List 
sstablesInBounds(AbstractBounds rowBounds)
 {
+return sstablesInBounds(rowBounds, intervalTree, 
liveMemtables.get(0).cfs.partitioner);
+}
+
+public static List 
sstablesInBounds(AbstractBounds rowBounds, SSTableIntervalTree 
intervalTree, IPartitioner partitioner)
+{
 if (intervalTree.isEmpty())
 return Collections.emptyList();
-RowPosition stopInTree = 
rowBounds.right.isMinimum(liveMemtables.get(0).cfs.partitioner) ? 
intervalTree.max() : rowBounds.right;
+RowPosition stopInTree = rowBounds.right.isMinimum(partitioner) ? 
intervalTree.max() : rowBounds.right;
 return intervalTree.search(Interval.create(rowBounds.left, stopInTree));
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/streaming/StreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index 4eb8557..273631c 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -27,6 +27,7 @@ import java.util.concurrent.atomic.AtomicBoolean;
 
 import

[01/10] cassandra git commit: Create interval tree over canonical sstables to avoid missing sstables during streaming

Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 68319f7c3 -> 72acbcd00
  refs/heads/cassandra-2.2 593bbf57d -> 05f8a008f
  refs/heads/cassandra-3.0 3d211e9fb -> 73a8341fe
  refs/heads/trunk db8df9153 -> 719e7d662


Create interval tree over canonical sstables to avoid missing sstables during 
streaming

patch by marcuse; reviewed by benedict for CASSANDRA-11886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72acbcd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72acbcd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72acbcd0

Branch: refs/heads/cassandra-2.1
Commit: 72acbcd00fe7c46e54cd267f42868531e99e39df
Parents: 68319f7
Author: Marcus Eriksson 
Authored: Wed May 25 08:38:14 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 14:31:47 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/DataTracker.java|  8 ++-
 .../cassandra/streaming/StreamSession.java  | 21 +++---
 .../io/sstable/SSTableRewriterTest.java | 72 
 5 files changed, 93 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index af641e1..ebcc90c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.15
+ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
  * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid corrupting 
SSL connections (CASSANDRA-11749)
  * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
  * cqlsh: apply current keyspace to source command (CASSANDRA-11152)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 166ce7e..559ba0b 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1531,6 +1531,10 @@ public class DatabaseDescriptor
 {
 return conf.sstable_preemptive_open_interval_in_mb;
 }
+public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+{
+conf.sstable_preemptive_open_interval_in_mb = mb;
+}
 
 public static boolean getTrickleFsync()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java 
b/src/java/org/apache/cassandra/db/DataTracker.java
index c731a35..927e717 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -32,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.dht.AbstractBounds;
+import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.io.sstable.IndexSummary;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.io.util.FileUtils;
@@ -810,9 +811,14 @@ public class DataTracker
 
 public List 
sstablesInBounds(AbstractBounds rowBounds)
 {
+return sstablesInBounds(rowBounds, intervalTree, 
liveMemtables.get(0).cfs.partitioner);
+}
+
+public static List 
sstablesInBounds(AbstractBounds rowBounds, SSTableIntervalTree 
intervalTree, IPartitioner partitioner)
+{
 if (intervalTree.isEmpty())
 return Collections.emptyList();
-RowPosition stopInTree = 
rowBounds.right.isMinimum(liveMemtables.get(0).cfs.partitioner) ? 
intervalTree.max() : rowBounds.right;
+RowPosition stopInTree = rowBounds.right.isMinimum(partitioner) ? 
intervalTree.max() : rowBounds.right;
 return intervalTree.search(Interval.create(rowBounds.left, stopInTree));
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/streaming/StreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index

[02/10] cassandra git commit: Create interval tree over canonical sstables to avoid missing sstables during streaming

Create interval tree over canonical sstables to avoid missing sstables during 
streaming

patch by marcuse; reviewed by benedict for CASSANDRA-11886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72acbcd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72acbcd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72acbcd0

Branch: refs/heads/cassandra-2.2
Commit: 72acbcd00fe7c46e54cd267f42868531e99e39df
Parents: 68319f7
Author: Marcus Eriksson 
Authored: Wed May 25 08:38:14 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 14:31:47 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/DataTracker.java|  8 ++-
 .../cassandra/streaming/StreamSession.java  | 21 +++---
 .../io/sstable/SSTableRewriterTest.java | 72 
 5 files changed, 93 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index af641e1..ebcc90c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.15
+ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
  * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid corrupting 
SSL connections (CASSANDRA-11749)
  * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
  * cqlsh: apply current keyspace to source command (CASSANDRA-11152)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 166ce7e..559ba0b 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1531,6 +1531,10 @@ public class DatabaseDescriptor
 {
 return conf.sstable_preemptive_open_interval_in_mb;
 }
+public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+{
+conf.sstable_preemptive_open_interval_in_mb = mb;
+}
 
 public static boolean getTrickleFsync()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java 
b/src/java/org/apache/cassandra/db/DataTracker.java
index c731a35..927e717 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -32,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.dht.AbstractBounds;
+import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.io.sstable.IndexSummary;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.io.util.FileUtils;
@@ -810,9 +811,14 @@ public class DataTracker
 
 public List 
sstablesInBounds(AbstractBounds rowBounds)
 {
+return sstablesInBounds(rowBounds, intervalTree, 
liveMemtables.get(0).cfs.partitioner);
+}
+
+public static List 
sstablesInBounds(AbstractBounds rowBounds, SSTableIntervalTree 
intervalTree, IPartitioner partitioner)
+{
 if (intervalTree.isEmpty())
 return Collections.emptyList();
-RowPosition stopInTree = 
rowBounds.right.isMinimum(liveMemtables.get(0).cfs.partitioner) ? 
intervalTree.max() : rowBounds.right;
+RowPosition stopInTree = rowBounds.right.isMinimum(partitioner) ? 
intervalTree.max() : rowBounds.right;
 return intervalTree.search(Interval.create(rowBounds.left, stopInTree));
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/streaming/StreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index 4eb8557..273631c 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -27,6 +27,7 @@ import java.util.concurrent.atomic.AtomicBoolean;
 
 import

[04/10] cassandra git commit: Create interval tree over canonical sstables to avoid missing sstables during streaming

Create interval tree over canonical sstables to avoid missing sstables during 
streaming

patch by marcuse; reviewed by benedict for CASSANDRA-11886


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/72acbcd0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/72acbcd0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/72acbcd0

Branch: refs/heads/trunk
Commit: 72acbcd00fe7c46e54cd267f42868531e99e39df
Parents: 68319f7
Author: Marcus Eriksson 
Authored: Wed May 25 08:38:14 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 14:31:47 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/DataTracker.java|  8 ++-
 .../cassandra/streaming/StreamSession.java  | 21 +++---
 .../io/sstable/SSTableRewriterTest.java | 72 
 5 files changed, 93 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index af641e1..ebcc90c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.15
+ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
  * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid corrupting 
SSL connections (CASSANDRA-11749)
  * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
  * cqlsh: apply current keyspace to source command (CASSANDRA-11152)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 166ce7e..559ba0b 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1531,6 +1531,10 @@ public class DatabaseDescriptor
 {
 return conf.sstable_preemptive_open_interval_in_mb;
 }
+public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+{
+conf.sstable_preemptive_open_interval_in_mb = mb;
+}
 
 public static boolean getTrickleFsync()
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/db/DataTracker.java
--
diff --git a/src/java/org/apache/cassandra/db/DataTracker.java 
b/src/java/org/apache/cassandra/db/DataTracker.java
index c731a35..927e717 100644
--- a/src/java/org/apache/cassandra/db/DataTracker.java
+++ b/src/java/org/apache/cassandra/db/DataTracker.java
@@ -32,6 +32,7 @@ import org.slf4j.LoggerFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.dht.AbstractBounds;
+import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.io.sstable.IndexSummary;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.io.util.FileUtils;
@@ -810,9 +811,14 @@ public class DataTracker
 
 public List 
sstablesInBounds(AbstractBounds rowBounds)
 {
+return sstablesInBounds(rowBounds, intervalTree, 
liveMemtables.get(0).cfs.partitioner);
+}
+
+public static List 
sstablesInBounds(AbstractBounds rowBounds, SSTableIntervalTree 
intervalTree, IPartitioner partitioner)
+{
 if (intervalTree.isEmpty())
 return Collections.emptyList();
-RowPosition stopInTree = 
rowBounds.right.isMinimum(liveMemtables.get(0).cfs.partitioner) ? 
intervalTree.max() : rowBounds.right;
+RowPosition stopInTree = rowBounds.right.isMinimum(partitioner) ? 
intervalTree.max() : rowBounds.right;
 return intervalTree.search(Interval.create(rowBounds.left, stopInTree));
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/72acbcd0/src/java/org/apache/cassandra/streaming/StreamSession.java
--
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index 4eb8557..273631c 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -27,6 +27,7 @@ import java.util.concurrent.atomic.AtomicBoolean;
 
 import

[05/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05f8a008
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05f8a008
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05f8a008

Branch: refs/heads/cassandra-3.0
Commit: 05f8a008f696d9624ec85176fa0e2a1ce06a1ad5
Parents: 593bbf5 72acbcd
Author: Marcus Eriksson 
Authored: Mon Jun 13 14:34:01 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 15:00:08 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/lifecycle/View.java |  5 ++
 .../cassandra/streaming/StreamSession.java  | 22 +++
 .../io/sstable/SSTableRewriterTest.java | 66 
 5 files changed, 86 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/CHANGES.txt
--
diff --cc CHANGES.txt
index d639d43,ebcc90c..491f72a
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,31 -1,7 +1,32 @@@
 -2.1.15
 +2.2.7
 + * StorageService shutdown hook should use a volatile variable 
(CASSANDRA-11984)
 + * Persist local metadata earlier in startup sequence (CASSANDRA-11742)
 + * Run CommitLog tests with different compression settings (CASSANDRA-9039)
 + * cqlsh: fix tab completion for case-sensitive identifiers (CASSANDRA-11664)
 + * Avoid showing estimated key as -1 in tablestats (CASSANDRA-11587)
 + * Fix possible race condition in CommitLog.recover (CASSANDRA-11743)
 + * Enable client encryption in sstableloader with cli options 
(CASSANDRA-11708)
 + * Possible memory leak in NIODataInputStream (CASSANDRA-11867)
 + * Fix commit log replay after out-of-order flush completion (CASSANDRA-9669)
 + * Add seconds to cqlsh tracing session duration (CASSANDRA-11753)
 + * Prohibit Reverse Counter type as part of the PK (CASSANDRA-9395)
 + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626)
 + * Exit JVM if JMX server fails to startup (CASSANDRA-11540)
 + * Produce a heap dump when exiting on OOM (CASSANDRA-9861)
 + * Avoid read repairing purgeable tombstones on range slices (CASSANDRA-11427)
 + * Restore ability to filter on clustering columns when using a 2i 
(CASSANDRA-11510)
 + * JSON datetime formatting needs timezone (CASSANDRA-11137)
 + * Fix is_dense recalculation for Thrift-updated tables (CASSANDRA-11502)
 + * Remove unnescessary file existence check during anticompaction 
(CASSANDRA-11660)
 + * Add missing files to debian packages (CASSANDRA-11642)
 + * Avoid calling Iterables::concat in loops during 
ModificationStatement::getFunctions (CASSANDRA-11621)
 + * cqlsh: COPY FROM should use regular inserts for single statement batches 
and
 +   report errors correctly if workers processes crash on initialization 
(CASSANDRA-11474)
 + * Always close cluster with connection in CqlRecordWriter (CASSANDRA-11553)
 + * Fix slice queries on ordered COMPACT tables (CASSANDRA-10988)
 +Merged from 2.1:
+  * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
   * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid 
corrupting SSL connections (CASSANDRA-11749)
 - * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
   * cqlsh: apply current keyspace to source command (CASSANDRA-11152)
   * Backport CASSANDRA-11578 (CASSANDRA-11750)
   * Clear out parent repair session if repair coordinator dies 
(CASSANDRA-11824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --cc src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index e24917c,559ba0b..d3a5028
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@@ -1588,8 -1529,12 +1588,12 @@@ public class DatabaseDescripto
  
  public static int getSSTablePreempiveOpenIntervalInMB()
  {
 -return conf.sstable_preemptive_open_interval_in_mb;
 +return FBUtilities.isWindows() ? -1 : 
conf.sstable_preemptive_open_interval_in_mb;
  }
+ public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+ {
+ conf.sstable_preemptive_open_interval_in_mb = mb;
+ }
  
  public static boolean getTrickleFsync()
  {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/src/java/org/apache/cassandra/db/lifecycle/View.java
--
diff --cc

[08/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/73a8341f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/73a8341f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/73a8341f

Branch: refs/heads/cassandra-3.0
Commit: 73a8341fef25de7236bc591e84cddc637c0b7b2f
Parents: 3d211e9 05f8a00
Author: Marcus Eriksson 
Authored: Mon Jun 13 15:14:28 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 15:14:28 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/lifecycle/View.java | 11 +++
 .../cassandra/streaming/StreamSession.java  | 18 +++--
 .../io/sstable/SSTableRewriterTest.java | 75 +---
 5 files changed, 88 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/CHANGES.txt
--
diff --cc CHANGES.txt
index 47aef7e,491f72a..8a04077
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,28 -1,5 +1,29 @@@
 -2.2.7
 +3.0.8
 + * Add TimeWindowCompactionStrategy (CASSANDRA-9666)
 +Merged from 2.2:
   * StorageService shutdown hook should use a volatile variable 
(CASSANDRA-11984)
 +Merged from 2.1:
++ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
 + * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid 
corrupting SSL connections (CASSANDRA-11749)
 +
 +
 +3.0.7
 + * Fix legacy serialization of Thrift-generated non-compound range tombstones
 +   when communicating with 2.x nodes (CASSANDRA-11930)
 + * Fix Directories instantiations where CFS.initialDirectories should be used 
(CASSANDRA-11849)
 + * Avoid referencing DatabaseDescriptor in AbstractType (CASSANDRA-11912)
 + * Fix sstables not being protected from removal during index build 
(CASSANDRA-11905)
 + * cqlsh: Suppress stack trace from Read/WriteFailures (CASSANDRA-11032)
 + * Remove unneeded code to repair index summaries that have
 +   been improperly down-sampled (CASSANDRA-11127)
 + * Avoid WriteTimeoutExceptions during commit log replay due to materialized
 +   view lock contention (CASSANDRA-11891)
 + * Prevent OOM failures on SSTable corruption, improve tests for corruption 
detection (CASSANDRA-9530)
 + * Use CFS.initialDirectories when clearing snapshots (CASSANDRA-11705)
 + * Allow compaction strategies to disable early open (CASSANDRA-11754)
 + * Refactor Materialized View code (CASSANDRA-11475)
 + * Update Java Driver (CASSANDRA-11615)
 +Merged from 2.2:
   * Persist local metadata earlier in startup sequence (CASSANDRA-11742)
   * Run CommitLog tests with different compression settings (CASSANDRA-9039)
   * cqlsh: fix tab completion for case-sensitive identifiers (CASSANDRA-11664)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/src/java/org/apache/cassandra/db/lifecycle/View.java
--
diff --cc src/java/org/apache/cassandra/db/lifecycle/View.java
index 17062b4,e303801..99903fc
--- a/src/java/org/apache/cassandra/db/lifecycle/View.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/View.java
@@@ -179,52 -131,23 +179,63 @@@ public class Vie
  }
  
  /**
 -  * Returns the sstables that have any partition between {@code left} and 
{@code right}, when both bounds are taken inclusively.
 -  * The interval formed by {@code left} and {@code right} shouldn't wrap.
 -  */
 -public List sstablesInBounds(RowPosition left, RowPosition 
right)
 + * Returns the sstables that have any partition between {@code left} and 
{@code right}, when both bounds are taken inclusively.
 + * The interval formed by {@code left} and {@code right} shouldn't wrap.
 + */
 +public Iterable sstablesInBounds(SSTableSet sstableSet, 
PartitionPosition left, PartitionPosition right)
  {
 -return sstablesInBounds(left, right, intervalTree);
 +assert !AbstractBounds.strictlyWrapsAround(left, right);
 +
 +if (intervalTree.isEmpty())
 +return Collections.emptyList();
 +
 +PartitionPosition stopInTree = right.isMinimum() ? intervalTree.max() 
: right;
 +return select(sstableSet, intervalTree.search(Interval.create(left, 
stopInTree)));
  }
  
 -public static List sstablesInBounds(RowPosition left, 
RowPosition right, SSTableIntervalTree intervalTree)
++public static List

[10/10] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/719e7d66
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/719e7d66
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/719e7d66

Branch: refs/heads/trunk
Commit: 719e7d6624c2755295dd89670cff1aebf3c73795
Parents: db8df91 73a8341
Author: Marcus Eriksson 
Authored: Mon Jun 13 15:16:33 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 15:16:33 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/lifecycle/View.java | 11 +++
 .../cassandra/streaming/StreamSession.java  | 18 +++--
 .../io/sstable/SSTableRewriterTest.java | 75 +---
 5 files changed, 88 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/719e7d66/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/719e7d66/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/719e7d66/src/java/org/apache/cassandra/db/lifecycle/View.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/719e7d66/src/java/org/apache/cassandra/streaming/StreamSession.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/719e7d66/test/unit/org/apache/cassandra/io/sstable/SSTableRewriterTest.java
--

[07/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05f8a008
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05f8a008
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05f8a008

Branch: refs/heads/trunk
Commit: 05f8a008f696d9624ec85176fa0e2a1ce06a1ad5
Parents: 593bbf5 72acbcd
Author: Marcus Eriksson 
Authored: Mon Jun 13 14:34:01 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 15:00:08 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/lifecycle/View.java |  5 ++
 .../cassandra/streaming/StreamSession.java  | 22 +++
 .../io/sstable/SSTableRewriterTest.java | 66 
 5 files changed, 86 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/CHANGES.txt
--
diff --cc CHANGES.txt
index d639d43,ebcc90c..491f72a
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,31 -1,7 +1,32 @@@
 -2.1.15
 +2.2.7
 + * StorageService shutdown hook should use a volatile variable 
(CASSANDRA-11984)
 + * Persist local metadata earlier in startup sequence (CASSANDRA-11742)
 + * Run CommitLog tests with different compression settings (CASSANDRA-9039)
 + * cqlsh: fix tab completion for case-sensitive identifiers (CASSANDRA-11664)
 + * Avoid showing estimated key as -1 in tablestats (CASSANDRA-11587)
 + * Fix possible race condition in CommitLog.recover (CASSANDRA-11743)
 + * Enable client encryption in sstableloader with cli options 
(CASSANDRA-11708)
 + * Possible memory leak in NIODataInputStream (CASSANDRA-11867)
 + * Fix commit log replay after out-of-order flush completion (CASSANDRA-9669)
 + * Add seconds to cqlsh tracing session duration (CASSANDRA-11753)
 + * Prohibit Reverse Counter type as part of the PK (CASSANDRA-9395)
 + * cqlsh: correctly handle non-ascii chars in error messages (CASSANDRA-11626)
 + * Exit JVM if JMX server fails to startup (CASSANDRA-11540)
 + * Produce a heap dump when exiting on OOM (CASSANDRA-9861)
 + * Avoid read repairing purgeable tombstones on range slices (CASSANDRA-11427)
 + * Restore ability to filter on clustering columns when using a 2i 
(CASSANDRA-11510)
 + * JSON datetime formatting needs timezone (CASSANDRA-11137)
 + * Fix is_dense recalculation for Thrift-updated tables (CASSANDRA-11502)
 + * Remove unnescessary file existence check during anticompaction 
(CASSANDRA-11660)
 + * Add missing files to debian packages (CASSANDRA-11642)
 + * Avoid calling Iterables::concat in loops during 
ModificationStatement::getFunctions (CASSANDRA-11621)
 + * cqlsh: COPY FROM should use regular inserts for single statement batches 
and
 +   report errors correctly if workers processes crash on initialization 
(CASSANDRA-11474)
 + * Always close cluster with connection in CqlRecordWriter (CASSANDRA-11553)
 + * Fix slice queries on ordered COMPACT tables (CASSANDRA-10988)
 +Merged from 2.1:
+  * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
   * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid 
corrupting SSL connections (CASSANDRA-11749)
 - * Updated cqlsh Python driver to fix DESCRIBE problem for legacy tables 
(CASSANDRA-11055)
   * cqlsh: apply current keyspace to source command (CASSANDRA-11152)
   * Backport CASSANDRA-11578 (CASSANDRA-11750)
   * Clear out parent repair session if repair coordinator dies 
(CASSANDRA-11824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --cc src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index e24917c,559ba0b..d3a5028
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@@ -1588,8 -1529,12 +1588,12 @@@ public class DatabaseDescripto
  
  public static int getSSTablePreempiveOpenIntervalInMB()
  {
 -return conf.sstable_preemptive_open_interval_in_mb;
 +return FBUtilities.isWindows() ? -1 : 
conf.sstable_preemptive_open_interval_in_mb;
  }
+ public static void setSSTablePreempiveOpenIntervalInMB(int mb)
+ {
+ conf.sstable_preemptive_open_interval_in_mb = mb;
+ }
  
  public static boolean getTrickleFsync()
  {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/05f8a008/src/java/org/apache/cassandra/db/lifecycle/View.java
--
diff --cc

[09/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/73a8341f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/73a8341f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/73a8341f

Branch: refs/heads/trunk
Commit: 73a8341fef25de7236bc591e84cddc637c0b7b2f
Parents: 3d211e9 05f8a00
Author: Marcus Eriksson 
Authored: Mon Jun 13 15:14:28 2016 +0200
Committer: Marcus Eriksson 
Committed: Mon Jun 13 15:14:28 2016 +0200

--
 CHANGES.txt |  1 +
 .../cassandra/config/DatabaseDescriptor.java|  4 ++
 .../org/apache/cassandra/db/lifecycle/View.java | 11 +++
 .../cassandra/streaming/StreamSession.java  | 18 +++--
 .../io/sstable/SSTableRewriterTest.java | 75 +---
 5 files changed, 88 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/CHANGES.txt
--
diff --cc CHANGES.txt
index 47aef7e,491f72a..8a04077
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,28 -1,5 +1,29 @@@
 -2.2.7
 +3.0.8
 + * Add TimeWindowCompactionStrategy (CASSANDRA-9666)
 +Merged from 2.2:
   * StorageService shutdown hook should use a volatile variable 
(CASSANDRA-11984)
 +Merged from 2.1:
++ * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
 + * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid 
corrupting SSL connections (CASSANDRA-11749)
 +
 +
 +3.0.7
 + * Fix legacy serialization of Thrift-generated non-compound range tombstones
 +   when communicating with 2.x nodes (CASSANDRA-11930)
 + * Fix Directories instantiations where CFS.initialDirectories should be used 
(CASSANDRA-11849)
 + * Avoid referencing DatabaseDescriptor in AbstractType (CASSANDRA-11912)
 + * Fix sstables not being protected from removal during index build 
(CASSANDRA-11905)
 + * cqlsh: Suppress stack trace from Read/WriteFailures (CASSANDRA-11032)
 + * Remove unneeded code to repair index summaries that have
 +   been improperly down-sampled (CASSANDRA-11127)
 + * Avoid WriteTimeoutExceptions during commit log replay due to materialized
 +   view lock contention (CASSANDRA-11891)
 + * Prevent OOM failures on SSTable corruption, improve tests for corruption 
detection (CASSANDRA-9530)
 + * Use CFS.initialDirectories when clearing snapshots (CASSANDRA-11705)
 + * Allow compaction strategies to disable early open (CASSANDRA-11754)
 + * Refactor Materialized View code (CASSANDRA-11475)
 + * Update Java Driver (CASSANDRA-11615)
 +Merged from 2.2:
   * Persist local metadata earlier in startup sequence (CASSANDRA-11742)
   * Run CommitLog tests with different compression settings (CASSANDRA-9039)
   * cqlsh: fix tab completion for case-sensitive identifiers (CASSANDRA-11664)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/73a8341f/src/java/org/apache/cassandra/db/lifecycle/View.java
--
diff --cc src/java/org/apache/cassandra/db/lifecycle/View.java
index 17062b4,e303801..99903fc
--- a/src/java/org/apache/cassandra/db/lifecycle/View.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/View.java
@@@ -179,52 -131,23 +179,63 @@@ public class Vie
  }
  
  /**
 -  * Returns the sstables that have any partition between {@code left} and 
{@code right}, when both bounds are taken inclusively.
 -  * The interval formed by {@code left} and {@code right} shouldn't wrap.
 -  */
 -public List sstablesInBounds(RowPosition left, RowPosition 
right)
 + * Returns the sstables that have any partition between {@code left} and 
{@code right}, when both bounds are taken inclusively.
 + * The interval formed by {@code left} and {@code right} shouldn't wrap.
 + */
 +public Iterable sstablesInBounds(SSTableSet sstableSet, 
PartitionPosition left, PartitionPosition right)
  {
 -return sstablesInBounds(left, right, intervalTree);
 +assert !AbstractBounds.strictlyWrapsAround(left, right);
 +
 +if (intervalTree.isEmpty())
 +return Collections.emptyList();
 +
 +PartitionPosition stopInTree = right.isMinimum() ? intervalTree.max() 
: right;
 +return select(sstableSet, intervalTree.search(Interval.create(left, 
stopInTree)));
  }
  
 -public static List sstablesInBounds(RowPosition left, 
RowPosition right, SSTableIntervalTree intervalTree)
++public static List

[06/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2