[jira] [Created] (CASSANDRA-13754) FastThreadLocal leaks memory

2017-08-10 Thread Eric Evans (JIRA)
Eric Evans created CASSANDRA-13754:
--

 Summary: FastThreadLocal leaks memory
 Key: CASSANDRA-13754
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13754
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: OpenJDK 8u141-b15
Reporter: Eric Evans
 Fix For: 3.11.1


After a chronic bout of {{OutOfMemoryError}} in our development environment, a 
heap analysis is showing that more than 10G of our 12G heaps are consumed by 
the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) of 
various {{io.netty.util.concurrent.FastThreadLocalThread}} instances.  
Reverting 
[cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54]
 fixes the issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13630) support large internode messages with netty

2017-08-10 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121770#comment-16121770
 ] 

Jason Brown commented on CASSANDRA-13630:
-

The core idea here is that if the outgoing message is large/huge, we don't want 
to naively allocate a huge buffer just for serialization. For example, if it's 
a large mutation (say 16MB), we don't want to allocate 16MB * n number of 
replica buffers on the coordinator. A safer approach is to allocate standard 
sized buffers (currently 64k), serialize into them via {{DataOutputPlus}} 
interface, write each buffer to the netty channel when the buffer is full, and 
allocate another buffer for further serialization.

The outbound side which splits up serialization into multiple buffers is 
implemented in {{MessageOutHandler.ByteBufDataOutputStreamPlus}}. At the same 
time, I've made it so that all messages are written into a shared buffer (via 
{{MessageOutHandler.ByteBufDataOutputStreamPlus}}), whether it's a large 
message being chunked across multiple buffers, or multiple small messages being 
aggregated into one buffer (think mutation ACKs). The upside here is that we
don't need to go to the netty allocator for each individual small message, and
can instead send the single 'aggregation' buffer downstream in the channel when
we need to flush.
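
The buffer-splitting idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code on the branch: {{ChunkedOutputSketch}} is a hypothetical name, a plain {{java.nio.ByteBuffer}} stands in for netty's {{ByteBuf}}, and a {{Consumer}} stands in for writing to the channel.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical stand-in for the buffer-splitting output stream described above;
// it is NOT the actual MessageOutHandler.ByteBufDataOutputStreamPlus.
class ChunkedOutputSketch extends OutputStream
{
    private final int chunkSize;                 // 64k in the comment above; configurable here
    private final Consumer<ByteBuffer> channel;  // assumption: stands in for the netty channel
    private ByteBuffer current;

    ChunkedOutputSketch(int chunkSize, Consumer<ByteBuffer> channel)
    {
        if (chunkSize <= 0)
            throw new IllegalArgumentException("chunkSize must be positive");
        this.chunkSize = chunkSize;
        this.channel = channel;
        this.current = ByteBuffer.allocate(chunkSize);
    }

    @Override
    public void write(int b)
    {
        current.put((byte) b);
        if (!current.hasRemaining())
            emit();                              // buffer is full: hand it downstream
    }

    @Override
    public void write(byte[] src, int off, int len)
    {
        while (len > 0)
        {
            int n = Math.min(len, current.remaining());
            current.put(src, off, n);
            off += n;
            len -= n;
            if (!current.hasRemaining())
                emit();                          // a large message spills into the next buffer
        }
    }

    // Sends whatever has accumulated so far, e.g. several small messages
    // aggregated into one partially filled buffer.
    @Override
    public void flush()
    {
        if (current.position() > 0)
            emit();
    }

    private void emit()
    {
        current.flip();
        channel.accept(current);
        current = ByteBuffer.allocate(chunkSize);
    }
}
```

Wrapping such a stream in a {{java.io.DataOutputStream}} gives serializers a familiar interface, while no single allocation ever exceeds the chunk size regardless of the message size.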

As I implemented this behavior, I discovered that the 'aggregating buffer' 
could be a problem wrt {{MessageOutHandler#channelWritabilityChanged}} as that 
method, when it gets the signal the channel is writable, attempts to drain any 
backlog from {{OutboundMessagingConnection}} (via the 
{{MessageOutHandler#backlogSupplier}}). If I had retained the current code, it
is quite likely that I would start to serialize a backlogged message while in
the middle of a message already being serialized (from 
{{MessageOutHandler#write}}), which happened to fill the buffer and write it to 
the channel.

Further, I noticed I needed to forward-port more of CASSANDRA-13265 in order to 
handle expiring messages from the backlog. (FTR, 
{{MessageOutHandler#userEventTriggered}} handles closing the channel when we 
make no progress, but there's no other purging or removing items from the 
backlog queue. Closing the channel will fail any messages in the channel, but 
not from the backlog). Thus, I added the backlog-expiring behavior to 
{{OutboundMessagingConnection#expireMessages}}, and now drain messages from the 
backlog in {{MessageOutHandler#write}}. By trying to send the backlogged 
messages before the incoming message on the channel, it gives us a better shot 
at ordering the sending of the messages wrt the order in which they came into 
the {{OutboundMessagingConnection}}.

I updated jctools to 2.0.2. Instead of using a {{LinkedBlockingQueue}} in 
{{OutboundMessagingConnection}} for the backlog, I decided to use something 
without locks from jctools. Even though the queue still needs to be an 
unbounded multi-producer/multi-consumer (at least, to replicate existing 
behaviors), the jctools queue should be a bit more efficient than an LBQ.

Fixing the outbound side is only half of the problem, as we don't want to
naively allocate a huge buffer on the receiving node, either. This is a bit 
trickier due to the blocking IO style of our deserializers. Thus, similar to 
what I've done in CASSANDRA-12229, I need to add incoming {{ByteBuf}}s to a 
{{RebufferingByteBufDataInputPlus}} and spin up a background thread for 
performing the deserialization. Since we only need to spin up the thread
when we have large message payloads, this will only happen in a minority of use 
cases:

- we are actually transmitting a message larger than 
{{OutboundMessagingPool#LARGE_MESSAGE_THRESHOLD}}, which defaults to 64k. At 
that point we're sending all of those over the outbound large message queue 
anyway, so all messages on that channel/socket will be over the threshold and 
require the background deserialization. So this won't apply to the small 
messages channel, where we can still handle all those messages in-line on the 
inbound netty event loop.
- If you are operating a huge sized cluster (I'm guessing at least 500 nodes in 
size, haven't done the math, tbh), large gossip messages might trigger the 
receiving gossip channel to switch to the background deserialization mode, 
especially ACK/ACK2 messages after a bounce as they will contain all the 
{{ApplicationState}}s for all the peers in the cluster. I do not think this 
will be a problem in practice.
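
The inbound approach described above (buffers appended as they arrive, blocking deserialization moved to a background thread) can be pictured with the sketch below. It rests on several assumptions: {{QueuedInputSketch}} is a hypothetical name and not the actual {{RebufferingByteBufDataInputPlus}}, byte arrays stand in for {{ByteBuf}}s, and the queue is unbounded, so a real implementation would still need some form of backpressure.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the rebuffering input described above.
class QueuedInputSketch extends InputStream
{
    private static final byte[] END = new byte[0];   // end-of-stream marker (compared by identity)

    private final BlockingQueue<byte[]> chunks = new LinkedBlockingQueue<>();
    private byte[] current = new byte[0];
    private int pos = 0;
    private boolean finished = false;

    // Called from the event loop as each network buffer arrives; never blocks.
    void append(byte[] chunk)
    {
        if (chunk.length > 0)
            chunks.add(chunk);
    }

    // Called from the event loop once the whole message has been received.
    void finish()
    {
        chunks.add(END);
    }

    // Called from the deserialization thread; blocks until more data arrives.
    @Override
    public int read() throws IOException
    {
        while (pos == current.length)
        {
            if (finished)
                return -1;
            try
            {
                current = chunks.take();
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for data", e);
            }
            pos = 0;
            if (current == END)
                finished = true;
        }
        return current[pos++] & 0xFF;
    }
}
```

A blocking-style deserializer can then run on its own thread over a {{java.io.DataInputStream}} wrapped around this stream, while the event loop only ever calls {{append}} and {{finish}}.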

I want to add more comments/documentation before committing, but that should 
not hold up a review. Also, this code is based on the current CASSANDRA-12229. 
Currently failing tests for this branch seem to be race conditions only in the 
streaming code, so I'll fix on the CASSANDRA-12229 branch.

> support large internode messages with netty
> ---
>
>  

cassandra-dtest git commit: Update regex for expected digest mismatch log message

2017-08-10 Thread spod
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 959208749 -> 459943a35


Update regex for expected digest mismatch log message

patch by Zhao Yang; reviewed by Stefan Podkowinski for CASSANDRA-13723


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/459943a3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/459943a3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/459943a3

Branch: refs/heads/master
Commit: 459943a35e7ea9ef49791b47bebaacc0b5af6e04
Parents: 9592087
Author: Zhao Yang 
Authored: Mon Aug 7 15:49:04 2017 +0800
Committer: Stefan Podkowinski 
Committed: Thu Aug 10 08:30:39 2017 +0200

--
 materialized_views_test.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/459943a3/materialized_views_test.py
--
diff --git a/materialized_views_test.py b/materialized_views_test.py
index 77b20e6..79679ca 100644
--- a/materialized_views_test.py
+++ b/materialized_views_test.py
@@ -228,7 +228,6 @@ class TestMaterializedViews(Tester):
 
 debug("wait that all batchlogs are replayed")
 self._replay_batchlogs()
-
 for i in xrange(5):
 for j in xrange(1):
 assert_one(session, "SELECT * FROM t_by_v WHERE id = {} AND v = {}".format(i, j), [j, i])
@@ -1064,8 +1063,8 @@ class TestMaterializedViews(Tester):
 # execution happening
 
 # Look for messages like:
-# Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey
-regex = r"Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey"
+# Digest mismatch: Mismatch for key DecoratedKey
+regex = r"Digest mismatch: Mismatch for key DecoratedKey"
 for event in trace.events:
 desc = event.description
 match = re.match(regex, desc)





cassandra-dtest git commit: Restore <4.0 compatibility for digest mismatch log message matching

2017-08-10 Thread spod
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 459943a35 -> 61cbd5cdc


Restore <4.0 compatibility for digest mismatch log message matching


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/61cbd5cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/61cbd5cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/61cbd5cd

Branch: refs/heads/master
Commit: 61cbd5cdcb435503bcb828249cce60ca779995e0
Parents: 459943a
Author: Stefan Podkowinski 
Authored: Thu Aug 10 09:02:24 2017 +0200
Committer: Stefan Podkowinski 
Committed: Thu Aug 10 09:02:24 2017 +0200

--
 materialized_views_test.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/61cbd5cd/materialized_views_test.py
--
diff --git a/materialized_views_test.py b/materialized_views_test.py
index 79679ca..574d90f 100644
--- a/materialized_views_test.py
+++ b/materialized_views_test.py
@@ -1063,8 +1063,9 @@ class TestMaterializedViews(Tester):
 # execution happening
 
 # Look for messages like:
-# Digest mismatch: Mismatch for key DecoratedKey
-regex = r"Digest mismatch: Mismatch for key DecoratedKey"
+#  4.0+Digest mismatch: Mismatch for key DecoratedKey
+# <4.0 Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey
+regex = r"Digest mismatch: ([a-zA-Z.]+:\s)?Mismatch for key DecoratedKey"
 for event in trace.events:
 desc = event.description
 match = re.match(regex, desc)

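The effect of the optional group in the updated pattern can be checked in isolation. The sketch below restates the same expression with {{java.util.regex}}; the class and method names are hypothetical, and {{lookingAt()}} is used because, like Python's {{re.match}}, it anchors only at the start of the input.

```java
import java.util.regex.Pattern;

// Hypothetical helper; it only restates the regex from the dtest change above.
class DigestMismatchPatternSketch
{
    // The optional group absorbs the exception class name that pre-4.0
    // nodes prepend to the message; 4.0+ nodes omit it.
    static final Pattern DIGEST_MISMATCH =
        Pattern.compile("Digest mismatch: ([a-zA-Z.]+:\\s)?Mismatch for key DecoratedKey");

    // lookingAt() matches a prefix of the input, as Python's re.match does.
    static boolean matches(String traceDescription)
    {
        return DIGEST_MISMATCH.matcher(traceDescription).lookingAt();
    }
}
```

Both message shapes named in the diff comments are accepted, while the unformatted {{Digest mismatch: {} on 127.0.0.1}} form reported in CASSANDRA-13723 is not.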




[jira] [Updated] (CASSANDRA-13723) fix exception logging that should be consumed by placeholder to 'getMessage()' for new slf4j version

2017-08-10 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13723:
---
Resolution: Fixed
  Reviewer: Stefan Podkowinski
Status: Resolved  (was: Patch Available)

Merged as ba87ab4e954ad2
Thanks!

> fix exception logging that should be consumed by placeholder to 
> 'getMessage()' for new slf4j version
> 
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}






[jira] [Commented] (CASSANDRA-13615) Include 'ppc64le' library for sigar-1.6.4.jar

2017-08-10 Thread Amitkumar Ghatwal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121107#comment-16121107
 ] 

Amitkumar Ghatwal commented on CASSANDRA-13615:
---

thanks a the merge ...[~mshuler] , [~jjirsa] 

> Include 'ppc64le' library for sigar-1.6.4.jar
> -
>
> Key: CASSANDRA-13615
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13615
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
> Environment: # arch
> ppc64le
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Shuler
>  Labels: easyfix
> Fix For: 4.0
>
> Attachments: libsigar-ppc64le-linux.so
>
>
> Hi All,
> sigar-1.6.4.jar does not include a ppc64le library, so we had to install 
> libsigar-ppc64le-linux.so. As the upstream community has been inactive for a 
> long time (https://github.com/hyperic/sigar), requesting the community to 
> include the ppc64le library directly here.
> Attaching the ppc64le library ( *.so) file to be included under 
> "/lib/sigar-bin". Let me know of issues/dependency if any.
> FYI - [~ReiOdaira],[~jjirsa], [~mshuler]
> Regards,
> Amit






[jira] [Comment Edited] (CASSANDRA-13615) Include 'ppc64le' library for sigar-1.6.4.jar

2017-08-10 Thread Amitkumar Ghatwal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121107#comment-16121107
 ] 

Amitkumar Ghatwal edited comment on CASSANDRA-13615 at 8/10/17 5:58 AM:


thanks for the merge ...[~mshuler] , [~jjirsa] 


was (Author: amitkumar_ghatwal):
thanks a the merge ...[~mshuler] , [~jjirsa] 




cassandra git commit: Explicitly use e.getMessage() for log message formatting

2017-08-10 Thread spod
Repository: cassandra
Updated Branches:
  refs/heads/trunk bcdbee5cd -> ba87ab4e9


Explicitly use e.getMessage() for log message formatting

patch by Zhao Yang; reviewed by Stefan Podkowinski for CASSANDRA-13723


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ba87ab4e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ba87ab4e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ba87ab4e

Branch: refs/heads/trunk
Commit: ba87ab4e954ad2e537f6690953bd7ebaa069f5cd
Parents: bcdbee5
Author: Zhao Yang 
Authored: Mon Jul 24 18:13:14 2017 +0800
Committer: Stefan Podkowinski 
Committed: Thu Aug 10 08:13:45 2017 +0200

--
 src/java/org/apache/cassandra/auth/CassandraAuthorizer.java| 6 --
 src/java/org/apache/cassandra/batchlog/BatchlogManager.java| 2 +-
 .../concurrent/AbstractLocalAwareExecutorService.java  | 2 +-
 src/java/org/apache/cassandra/concurrent/SEPWorker.java| 4 ++--
 src/java/org/apache/cassandra/db/Directories.java  | 2 +-
 src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java | 4 +++-
 src/java/org/apache/cassandra/io/util/FileUtils.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageProxy.java| 2 +-
 .../apache/cassandra/streaming/DefaultConnectionFactory.java   | 2 +-
 src/java/org/apache/cassandra/utils/NativeLibrary.java | 2 +-
 10 files changed, 16 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ba87ab4e/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
--
diff --git a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
index e95a1fd..d760dce 100644
--- a/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
+++ b/src/java/org/apache/cassandra/auth/CassandraAuthorizer.java
@@ -129,7 +129,9 @@ public class CassandraAuthorizer implements IAuthorizer
 }
 catch (RequestExecutionException | RequestValidationException e)
 {
-logger.warn("CassandraAuthorizer failed to revoke all permissions of {}: {}",  revokee.getRoleName(), e);
+logger.warn("CassandraAuthorizer failed to revoke all permissions of {}: {}",
+revokee.getRoleName(),
+e.getMessage());
 }
 }
 
@@ -166,7 +168,7 @@ public class CassandraAuthorizer implements IAuthorizer
 }
 catch (RequestExecutionException | RequestValidationException e)
 {
-logger.warn("CassandraAuthorizer failed to revoke all permissions on {}: {}", droppedResource, e);
+logger.warn("CassandraAuthorizer failed to revoke all permissions on {}: {}", droppedResource, e.getMessage());
 return;
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ba87ab4e/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
index 9ca7acf..9d2867f 100644
--- a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
@@ -272,7 +272,7 @@ public class BatchlogManager implements BatchlogManagerMBean
 }
 catch (IOException e)
 {
-logger.warn("Skipped batch replay of {} due to {}", id, e);
+logger.warn("Skipped batch replay of {} due to {}", id, e.getMessage());
 remove(id);
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ba87ab4e/src/java/org/apache/cassandra/concurrent/AbstractLocalAwareExecutorService.java
--
diff --git a/src/java/org/apache/cassandra/concurrent/AbstractLocalAwareExecutorService.java b/src/java/org/apache/cassandra/concurrent/AbstractLocalAwareExecutorService.java
index 530f46e..97dbe86 100644
--- a/src/java/org/apache/cassandra/concurrent/AbstractLocalAwareExecutorService.java
+++ b/src/java/org/apache/cassandra/concurrent/AbstractLocalAwareExecutorService.java
@@ -164,7 +164,7 @@ public abstract class AbstractLocalAwareExecutorService implements LocalAwareExe
 catch (Throwable t)
 {
 JVMStabilityInspector.inspectThrowable(t);
-logger.warn("Uncaught exception on thread {}: {}", Thread.currentThread(), t);
+logger.warn("Uncaught exception on thread {}: {}", Thread.currentThread(), t.getMessage());

[jira] [Commented] (CASSANDRA-11483) Enhance sstablemetadata

2017-08-10 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121238#comment-16121238
 ] 

Marcus Eriksson commented on CASSANDRA-11483:
-

I ran the dtests but it seems they have been rotated out now - I must have 
missed this failure

> Enhance sstablemetadata
> ---
>
> Key: CASSANDRA-11483
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11483
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.0
>
> Attachments: CASSANDRA-11483.txt, CASSANDRA-11483v2.txt, 
> CASSANDRA-11483v3.txt, CASSANDRA-11483v4.txt, CASSANDRA-11483v5.txt, Screen 
> Shot 2016-04-03 at 11.40.32 PM.png
>
>
> sstablemetadata provides quite a bit of useful information, but there are a 
> few hiccups I would like to see addressed:
> * Does not use client mode
> * Units are not provided (or anything for that matter). There is data in 
> micros, millis, seconds as durations and timestamps from epoch. But there is 
> no way to tell what one is without a non-trivial code dive
> * In general pretty frustrating to parse






[jira] [Updated] (CASSANDRA-13630) support large internode messages with netty

2017-08-10 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13630:

Status: Patch Available  (was: Open)

> support large internode messages with netty
> ---
>
> Key: CASSANDRA-13630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.0
>
>
> As part of CASSANDRA-8457, we decided to punt on large messages to reduce the 
> scope of that ticket. However, we still need that functionality to ship a 
> correctly operating internode messaging subsystem.






[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1614#comment-1614
 ] 

Jeff Jirsa commented on CASSANDRA-12884:


[~iamaleksey] will have a more comprehensive review, I'm sure, but a few notes 
from a very cursory glance:

1) I don't see the purpose of stubbing out {{BatchlogManager::shuffle}} as a 
helper function here.

2) In the case where {{validated.keySet().size() == 1}} , shuffling all of the 
IPs in a given rack may not be all that efficient - may be quicker to just pick 
2 random ints, and grab the IPs at those offsets (like we do for the case where 
we have more than 2 racks, 
{{result.add(rackMembers.get(getRandomInt(rackMembers.size())));}} )



> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
> if (validated.size() - validated.get(localRack).size() >= 2)
> {
>     // we have enough endpoints in other racks
>     validated.removeAll(localRack);
> }
> if (validated.keySet().size() == 1)
> {
>     // we have only 1 `other` rack
>     Collection otherRack = Iterables.getOnlyElement(validated.asMap().values());
>
>     return Lists.newArrayList(Iterables.limit(otherRack, 2));
> }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.






[jira] [Comment Edited] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1614#comment-1614
 ] 

Jeff Jirsa edited comment on CASSANDRA-12884 at 8/10/17 8:34 PM:
-

[~iamaleksey] will have a more comprehensive review, I'm sure, but a few notes 
from a very cursory glance:

-1) I don't see the purpose of stubbing out {{BatchlogManager::shuffle}} as a 
helper function here.- (You're overriding it for deterministic testing)

2) In the case where {{validated.keySet().size() == 1}} , shuffling all of the 
IPs in a given rack may not be all that efficient - may be quicker to just pick 
2 random ints, and grab the IPs at those offsets (like we do for the case where 
we have more than 2 racks, 
{{result.add(rackMembers.get(getRandomInt(rackMembers.size())));}} )




was (Author: jjirsa):
[~iamaleksey] will have a more comprehensive review, I'm sure, but a few notes 
from a very cursory glance:

1) I don't see the purpose of stubbing out {{BatchlogManager::shuffle}} as a 
helper function here.

2) In the case where {{validated.keySet().size() == 1}} , shuffling all of the 
IPs in a given rack may not be all that efficient - may be quicker to just pick 
2 random ints, and grab the IPs at those offsets (like we do for the case where 
we have more than 2 racks, 
{{result.add(rackMembers.get(getRandomInt(rackMembers.size())));}} )






[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-10 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122288#comment-16122288
 ] 

Daniel Cranford commented on CASSANDRA-12884:
-

1) BatchlogManager::shuffle is stubbed out so the unit test can provide a 
deterministic override. The unit test has been expanded to provide a test which 
catches this regression. (the existing code used the same pattern for 
getRandomInt which is overridden to be non-random in the unit test)

2) getRandomInt could return the same value twice (sampling with replacement), 
resulting in the same replica being chosen twice. The existing code uses the 
shuffle+take-head pattern, e.g. in BatchlogManager.java line 545 
{{shuffle((List) racks);}} and line 550 {{for (String rack : 
Iterables.limit(racks, 2))}}, to perform sampling without replacement.
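
The difference between the two approaches in point 2 can be shown with a small, self-contained sketch. The class and method names are hypothetical and {{java.util.Random}} is used directly; none of this is the {{BatchlogManager}} code itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of the two sampling strategies discussed above.
class ReplicaSamplingSketch
{
    // Sampling WITH replacement: two independent draws can land on the
    // same index, so the same member may be returned twice.
    static <T> List<T> pickTwoIndependently(List<T> members, Random random)
    {
        List<T> result = new ArrayList<>(2);
        result.add(members.get(random.nextInt(members.size())));
        result.add(members.get(random.nextInt(members.size())));
        return result;
    }

    // Sampling WITHOUT replacement: shuffle a copy, then take the head.
    // Every returned member sits at a distinct position in the input.
    static <T> List<T> shuffleAndTakeTwo(List<T> members, Random random)
    {
        List<T> copy = new ArrayList<>(members);
        Collections.shuffle(copy, random);
        return new ArrayList<>(copy.subList(0, Math.min(2, copy.size())));
    }
}
```

Passing the {{Random}} in as a parameter serves the same purpose as the overridable {{shuffle}} helper mentioned in point 1: a test can supply a seeded instance and get deterministic results.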





[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-10 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122335#comment-16122335
 ] 

Daniel Cranford commented on CASSANDRA-12884:
-

Technically, if efficiency is key, we could implement something like a 
Durstenfeld/Knuth shuffle, e.g. https://stackoverflow.com/a/35278327
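
A partial Durstenfeld shuffle of the kind suggested above could look like the following sketch. It is an illustration only, with hypothetical names; it performs just the first k swap steps of the shuffle rather than permuting the whole list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of a partial Durstenfeld (Fisher-Yates) shuffle.
class PartialShuffleSketch
{
    // Returns up to k elements from distinct positions, chosen uniformly at
    // random, using only the first k steps of the shuffle. Works on a copy so
    // the caller's list is left untouched.
    static <T> List<T> sample(List<T> items, int k, Random random)
    {
        List<T> copy = new ArrayList<>(items);
        int n = copy.size();
        int take = Math.max(0, Math.min(k, n));
        for (int i = 0; i < take; i++)
        {
            // Pick uniformly from the not-yet-fixed tail [i, n) and swap into slot i.
            int j = i + random.nextInt(n - i);
            Collections.swap(copy, i, j);
        }
        return new ArrayList<>(copy.subList(0, take));
    }
}
```

Note that the defensive copy is still O(n), so the saving here is limited to the number of random draws and swaps; an in-place variant on a list the caller already owns would avoid the copy as well.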




[jira] [Commented] (CASSANDRA-13754) FastThreadLocal leaks memory

2017-08-10 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122390#comment-16122390
 ] 

Eric Evans commented on CASSANDRA-13754:


Cassandra 3.11.0, Netty 4.0.44.Final




[jira] [Updated] (CASSANDRA-13754) FastThreadLocal leaks memory

2017-08-10 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-13754:
---
Environment: Cassandra 3.11.0, Netty 4.0.44.Final, OpenJDK 8u141-b15  (was: 
OpenJDK 8u141-b15)







[jira] [Commented] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-10 Thread Hannu Kröger (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122413#comment-16122413
 ] 

Hannu Kröger commented on CASSANDRA-13752:
--

Background information:
- Incremental repairs are run regularly
- The same cluster also suffers from https://issues.apache.org/jira/browse/CASSANDRA-13718
- To mitigate that bug, we have run full repairs across the whole cluster on the problematic tables
- The Lucene index plugin is installed but not in use in the keyspace in question
- Cassandra was upgraded from 2.2.8 to 3.11.0
- 4 nodes in DC1 (DC2 not connected at the moment), RF=3
- The upgrade to 3.11 was done about 1.5 weeks ago
- The cluster has been running since May '17

> Corrupted SSTables created in 3.11
> --
>
> Key: CASSANDRA-13752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Priority: Blocker
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra     3899251 Aug  7 08:37 mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra          10 Aug  7 08:37 mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra     2930904 Aug  7 08:37 mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra       75880 Aug  7 08:37 mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra       13762 Aug  7 08:37 mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra      882008 Aug  7 08:37 mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra          92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}

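The trace in the quoted description fails inside ByteBufferUtil.readWithShortLength: the reader apparently expects a 21093-byte field, but the stream ends after 1898 bytes. As a rough illustration of how a length-prefixed read produces this kind of error on a truncated or corrupted file, here is a minimal Python sketch. It is not Cassandra's code: the function names only echo the stack frames, and the 2-byte big-endian unsigned length prefix is an assumption inferred from the method name.

```python
import io
import struct


def read_fully(stream, length):
    """Read exactly `length` bytes, or raise EOFError if the stream ends early."""
    chunks = []
    copied = 0
    while copied < length:
        chunk = stream.read(length - copied)
        if not chunk:
            # Same shape of message as the exception in the report.
            raise EOFError("EOF after %d bytes out of %d" % (copied, length))
        chunks.append(chunk)
        copied += len(chunk)
    return b"".join(chunks)


def read_with_short_length(stream):
    """Read a 2-byte unsigned length (assumed big-endian), then that many bytes."""
    (length,) = struct.unpack(">H", read_fully(stream, 2))
    return read_fully(stream, length)


# A well-formed field: the prefix matches the payload.
good = struct.pack(">H", 5) + b"hello"

# A truncated field: the prefix promises 21093 bytes, only 1898 follow.
truncated = struct.pack(">H", 21093) + b"\x00" * 1898
```

Reading `good` through `read_with_short_length(io.BytesIO(good))` returns the payload, while reading `truncated` raises an EOFError. The sketch says nothing about why the Statistics.db component ended up short or misaligned; it only shows that the reader detects the problem at the first field whose declared length exceeds the remaining bytes.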





[jira] [Created] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-13755:
---

 Summary: dtest failure: 
repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test
 Key: CASSANDRA-13755
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13755
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston









[jira] [Updated] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13755:

Status: Patch Available  (was: Open)

A change in the sstablemetadata output format broke the test. This branch fixes it:

https://github.com/bdeggleston/cassandra-dtest/tree/13755








[jira] [Issue Comment Deleted] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13755:

Comment: was deleted

(was: A change in sstablemeta output format broke the test. This branch fixes 
it:

https://github.com/bdeggleston/cassandra-dtest/tree/13755)








[jira] [Commented] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122445#comment-16122445
 ] 

Blake Eggleston commented on CASSANDRA-13755:
-

patch by [~jkni] here: 
https://github.com/jkni/cassandra-dtest/commit/f55f78b093fc668dc5cc9d1fc72f66dc5a9bf3a6








[jira] [Assigned] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-13755:
---

Assignee: Joel Knighton  (was: Blake Eggleston)
Reviewer: Blake Eggleston








cassandra-dtest git commit: Handle difference in sstablemetadata output for pending repairs following CASSANDRA-11483

2017-08-10 Thread bdeggleston
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 61cbd5cdc -> 013efa11f


Handle difference in sstablemetadata output for pending repairs following 
CASSANDRA-11483

Patch by Joel Knighton; reviewed by Blake Eggleston for CASSANDRA-13755


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/013efa11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/013efa11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/013efa11

Branch: refs/heads/master
Commit: 013efa11f3d7bd2e3f64a4a5a865ff5dad565552
Parents: 61cbd5c
Author: Joel Knighton 
Authored: Wed Aug 9 13:03:21 2017 -0500
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:34:00 2017 -0700

--
 repair_tests/incremental_repair_test.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/013efa11/repair_tests/incremental_repair_test.py
--
diff --git a/repair_tests/incremental_repair_test.py b/repair_tests/incremental_repair_test.py
index a447d56..b081d44 100644
--- a/repair_tests/incremental_repair_test.py
+++ b/repair_tests/incremental_repair_test.py
@@ -34,7 +34,7 @@ class TestIncRepair(Tester):
     def _get_repaired_data(cls, node, keyspace):
         _sstable_name = compile('SSTable: (.+)')
         _repaired_at = compile('Repaired at: (\d+)')
-        _pending_repair = compile('Pending repair: (null|[a-f0-9\-]+)')
+        _pending_repair = compile('Pending repair: (\-\-|null|[a-f0-9\-]+)')
         _sstable_data = namedtuple('_sstabledata', ('name', 'repaired', 'pending_id'))
 
         out = node.run_sstablemetadata(keyspace=keyspace).stdout
@@ -45,7 +45,7 @@ class TestIncRepair(Tester):
         repaired_times = [int(m.group(1)) for m in matches(_repaired_at)]
 
         def uuid_or_none(s):
-            return None if s == 'null' else UUID(s)
+            return None if s == 'null' or s == '--' else UUID(s)
         pending_repairs = [uuid_or_none(m.group(1)) for m in matches(_pending_repair)]
         assert names
         assert repaired_times

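The parsing change in this commit can be tried in isolation. The following is a small standalone sketch, not the dtest code itself: the helper name pending_repairs and the sample output lines are made up for illustration, and only the "Pending repair:" pattern and the null / -- handling come from the diff.

```python
import re
from uuid import UUID

# Same alternatives as the patched pattern: "--", "null", or a UUID-like token.
PENDING_REPAIR = re.compile(r'Pending repair: (--|null|[a-f0-9\-]+)')


def uuid_or_none(s):
    """Treat both 'null' and '--' as "no pending repair"."""
    return None if s in ('null', '--') else UUID(s)


def pending_repairs(output):
    """Hypothetical helper: extract one pending-repair id per matching line."""
    return [uuid_or_none(m.group(1)) for m in PENDING_REPAIR.finditer(output)]


# Hypothetical sstablemetadata fragments covering the three cases.
sample = (
    "Pending repair: --\n"
    "Pending repair: null\n"
    "Pending repair: 7a499280-0d56-11e7-b782-cb90016f2d17\n"
)
```

With this sample, pending_repairs(sample) yields None for the first two lines and a UUID for the third. Accepting both spellings keeps the test usable against builds from before and after the output format change.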




[jira] [Updated] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13755:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed as {{013efa11f3d7bd2e3f64a4a5a865ff5dad565552}}, thanks!








cassandra git commit: Fix digest calculation for counter cells

2017-08-10 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 1a70dede3 -> eb6f03c89


Fix digest calculation for counter cells

Patch by Blake Eggleston; reviewed by Aleksey Yeschenko for CASSANDRA-13750


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb6f03c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb6f03c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb6f03c8

Branch: refs/heads/cassandra-3.0
Commit: eb6f03c8928e913cb6f9eaa7c9ea9f4501039112
Parents: 1a70ded
Author: Blake Eggleston 
Authored: Tue Aug 8 13:45:41 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:42:31 2017 -0700

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/db/rows/AbstractCell.java | 10 +-
 test/unit/org/apache/cassandra/db/CounterCellTest.java  |  4 ++--
 3 files changed, 12 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb6f03c8/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 1f42c70..0b92a7e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Fix digest calculation for counter cells (CASSANDRA-13750)
  * Fix ColumnDefinition.cellValueType() for non-frozen collection and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  * Skip materialized view addition if the base table doesn't exist (CASSANDRA-13737)
  * Drop table should remove corresponding entries in dropped_columns table (CASSANDRA-13730)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb6f03c8/src/java/org/apache/cassandra/db/rows/AbstractCell.java
--
diff --git a/src/java/org/apache/cassandra/db/rows/AbstractCell.java b/src/java/org/apache/cassandra/db/rows/AbstractCell.java
index 7e93c2e..576351e 100644
--- a/src/java/org/apache/cassandra/db/rows/AbstractCell.java
+++ b/src/java/org/apache/cassandra/db/rows/AbstractCell.java
@@ -42,7 +42,15 @@ public abstract class AbstractCell extends Cell
 
     public void digest(MessageDigest digest)
     {
-        digest.update(value().duplicate());
+        if (isCounterCell())
+        {
+            CounterContext.instance().updateDigest(digest, value());
+        }
+        else
+        {
+            digest.update(value().duplicate());
+        }
+
         FBUtilities.updateWithLong(digest, timestamp());
         FBUtilities.updateWithInt(digest, ttl());
         FBUtilities.updateWithBoolean(digest, isCounterCell());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb6f03c8/test/unit/org/apache/cassandra/db/CounterCellTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/CounterCellTest.java b/test/unit/org/apache/cassandra/db/CounterCellTest.java
index 08e0b25..a8ddfcc 100644
--- a/test/unit/org/apache/cassandra/db/CounterCellTest.java
+++ b/test/unit/org/apache/cassandra/db/CounterCellTest.java
@@ -276,8 +276,8 @@ public class CounterCellTest
         ColumnDefinition cDef = cfs.metadata.getColumnDefinition(col);
         Cell cleared = BufferCell.live(cfs.metadata, cDef, 5, CounterContext.instance().clearAllLocal(state.context));
 
-        CounterContext.instance().updateDigest(digest1, original.value());
-        CounterContext.instance().updateDigest(digest2, cleared.value());
+        original.digest(digest1);
+        cleared.digest(digest2);
 
         assert Arrays.equals(digest1.digest(), digest2.digest());
     }

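The updated test asserts that a counter cell and its copy with all local shards cleared produce the same digest when both go through Cell.digest(). The sketch below is a simplified Python model of why a context-aware digest can satisfy that where hashing the raw value cannot. Everything about the byte layout here is an assumption made for illustration (a 2-byte entry count followed by 2-byte header entries, then the shard body); the commit does not show the body of CounterContext.updateDigest, and the hash function is chosen only for convenience.

```python
import hashlib
import struct

HEADER_SIZE_LENGTH = 2  # assumed: 2-byte signed count of header entries
HEADER_ELT_LENGTH = 2   # assumed: 2 bytes per header entry


def header_length(context):
    """Length of the assumed header: the count field plus its entries."""
    (count,) = struct.unpack_from(">h", context, 0)
    return HEADER_SIZE_LENGTH + abs(count) * HEADER_ELT_LENGTH


def counter_digest(context):
    """Context-aware digest: hash only the shard body, skipping the header."""
    return hashlib.sha256(context[header_length(context):]).digest()


def raw_digest(context):
    """Naive digest over the whole value, header included."""
    return hashlib.sha256(context).digest()


body = b"shard-bytes"  # placeholder for the serialized shards

# "original": one header entry marking shard 0 as local (illustrative).
original = struct.pack(">hh", 1, 0) + body

# "cleared": same shards, but the local markers have been removed.
cleared = struct.pack(">h", 0) + body
```

In this model counter_digest(original) equals counter_digest(cleared), while raw_digest differs between the two because the headers differ. That mirrors the shape of the fix: digest() takes the counter-specific path for counter cells instead of hashing the raw value.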




[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-10 Thread bdeggleston
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e018bec8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e018bec8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e018bec8

Branch: refs/heads/cassandra-3.11
Commit: e018bec8ad482a1892b97b5f829ff5fa5801190a
Parents: 303dba6 eb6f03c
Author: Blake Eggleston 
Authored: Thu Aug 10 15:43:45 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:47:22 2017 -0700

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/db/rows/AbstractCell.java | 10 +-
 test/unit/org/apache/cassandra/db/CounterCellTest.java  |  4 ++--
 3 files changed, 12 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e018bec8/CHANGES.txt
--
diff --cc CHANGES.txt
index 145a746,0b92a7e..3308287
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,9 -1,5 +1,10 @@@
 -3.0.15
 +3.11.1
 + * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
 + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
 + * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
 + * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 +Merged from 3.0:
+  * Fix digest calculation for counter cells (CASSANDRA-13750)
   * Fix ColumnDefinition.cellValueType() for non-frozen collection and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
   * Skip materialized view addition if the base table doesn't exist (CASSANDRA-13737)
   * Drop table should remove corresponding entries in dropped_columns table (CASSANDRA-13730)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e018bec8/src/java/org/apache/cassandra/db/rows/AbstractCell.java
--
diff --cc src/java/org/apache/cassandra/db/rows/AbstractCell.java
index 54c8f24,576351e..744d113
--- a/src/java/org/apache/cassandra/db/rows/AbstractCell.java
+++ b/src/java/org/apache/cassandra/db/rows/AbstractCell.java
@@@ -44,84 -40,17 +44,92 @@@ public abstract class AbstractCell exte
          super(column);
      }
  
 +    public boolean isCounterCell()
 +    {
 +        return !isTombstone() && column.isCounterColumn();
 +    }
 +
 +    public boolean isLive(int nowInSec)
 +    {
 +        return localDeletionTime() == NO_DELETION_TIME || (ttl() != NO_TTL && nowInSec < localDeletionTime());
 +    }
 +
 +    public boolean isTombstone()
 +    {
 +        return localDeletionTime() != NO_DELETION_TIME && ttl() == NO_TTL;
 +    }
 +
 +    public boolean isExpiring()
 +    {
 +        return ttl() != NO_TTL;
 +    }
 +
 +    public Cell markCounterLocalToBeCleared()
 +    {
 +        if (!isCounterCell())
 +            return this;
 +
 +        ByteBuffer value = value();
 +        ByteBuffer marked = CounterContext.instance().markLocalToBeCleared(value);
 +        return marked == value ? this : new BufferCell(column, timestamp(), ttl(), localDeletionTime(), marked, path());
 +    }
 +
 +    public Cell purge(DeletionPurger purger, int nowInSec)
 +    {
 +        if (!isLive(nowInSec))
 +        {
 +            if (purger.shouldPurge(timestamp(), localDeletionTime()))
 +                return null;
 +
 +            // We slightly hijack purging to convert expired but not purgeable columns to tombstones. The reason we do that is
 +            // that once a column has expired it is equivalent to a tombstone but actually using a tombstone is more compact since
 +            // we don't keep the column value. The reason we do it here is that 1) it's somewhat related to dealing with tombstones
 +            // so hopefully not too surprising and 2) we want to this and purging at the same places, so it's simpler/more efficient
 +            // to do both here.
 +            if (isExpiring())
 +            {
 +                // Note that as long as the expiring column and the tombstone put together live longer than GC grace seconds,
 +                // we'll fulfil our responsibility to repair. See discussion at
 +                // http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
 +                return BufferCell.tombstone(column, timestamp(), localDeletionTime() - ttl(), path()).purge(purger, nowInSec);
 +            }
 +        }
 +        return this;
 +    }
 +
 +    public Cell copy(AbstractAllocator allocator)
 +    {
 +        CellPath path = path();
 +        return new BufferCell(column, timestamp(), ttl(), localDeletionTime(), allocator.clone(value()), path == null 

[1/2] cassandra git commit: Fix digest calculation for counter cells

2017-08-10 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 303dba650 -> e018bec8a


Fix digest calculation for counter cells

Patch by Blake Eggleston; reviewed by Aleksey Yeschenko for CASSANDRA-13750


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb6f03c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb6f03c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb6f03c8

Branch: refs/heads/cassandra-3.11
Commit: eb6f03c8928e913cb6f9eaa7c9ea9f4501039112
Parents: 1a70ded
Author: Blake Eggleston 
Authored: Tue Aug 8 13:45:41 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:42:31 2017 -0700






[2/3] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-10 Thread bdeggleston
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e018bec8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e018bec8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e018bec8

Branch: refs/heads/trunk
Commit: e018bec8ad482a1892b97b5f829ff5fa5801190a
Parents: 303dba6 eb6f03c
Author: Blake Eggleston 
Authored: Thu Aug 10 15:43:45 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:47:22 2017 -0700


[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-08-10 Thread bdeggleston
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4b736366
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4b736366
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4b736366

Branch: refs/heads/trunk
Commit: 4b736366c2a958e67dffa12ad776d850ba370752
Parents: 9c3354e e018bec
Author: Blake Eggleston 
Authored: Thu Aug 10 15:48:09 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:49:20 2017 -0700

--
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/db/rows/AbstractCell.java | 10 +-
 test/unit/org/apache/cassandra/db/CounterCellTest.java  |  4 ++--
 3 files changed, 12 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b736366/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b736366/src/java/org/apache/cassandra/db/rows/AbstractCell.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b736366/test/unit/org/apache/cassandra/db/CounterCellTest.java
--
diff --cc test/unit/org/apache/cassandra/db/CounterCellTest.java
index 8c1347d,74599c3..b10a9c7
--- a/test/unit/org/apache/cassandra/db/CounterCellTest.java
+++ b/test/unit/org/apache/cassandra/db/CounterCellTest.java
@@@ -272,11 -272,11 +272,11 @@@ public class CounterCellTes
  
          Cell original = createCounterCellFromContext(cfs, col, state, 5);
  
 -        ColumnDefinition cDef = cfs.metadata.getColumnDefinition(col);
 +        ColumnMetadata cDef = cfs.metadata().getColumn(col);
          Cell cleared = BufferCell.live(cDef, 5, CounterContext.instance().clearAllLocal(state.context));
  
-         CounterContext.instance().updateDigest(digest1, original.value());
-         CounterContext.instance().updateDigest(digest2, cleared.value());
+         original.digest(digest1);
+         cleared.digest(digest2);
  
          assert Arrays.equals(digest1.digest(), digest2.digest());
      }





[1/3] cassandra git commit: Fix digest calculation for counter cells

2017-08-10 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/trunk 9c3354e32 -> 4b736366c


Fix digest calculation for counter cells

Patch by Blake Eggleston; reviewed by Aleksey Yeschenko for CASSANDRA-13750


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb6f03c8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb6f03c8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb6f03c8

Branch: refs/heads/trunk
Commit: eb6f03c8928e913cb6f9eaa7c9ea9f4501039112
Parents: 1a70ded
Author: Blake Eggleston 
Authored: Tue Aug 8 13:45:41 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 15:42:31 2017 -0700






[jira] [Updated] (CASSANDRA-13750) Counter digests include local data

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13750:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed as {{eb6f03c8928e913cb6f9eaa7c9ea9f4501039112}}

Opened/reviewed/committed CASSANDRA-13755 to fix the only non-flaky test failure.

> Counter digests include local data
> --
>
> Key: CASSANDRA-13750
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13750
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Minor
> Fix For: 4.0, 3.0.x, 3.11.x
>
>
> In 3.x+, the raw counter value bytes are used when hashing counters for reads 
> and repair, including local shard data, which is removed when streamed. This 
> leads to constant digest mismatches and repair overstreaming.
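
The idea in the description can be illustrated with a small sketch. It is not Cassandra's {{CounterContext}} code: it assumes a simplified, hypothetical value layout in which the first {{headerLength}} bytes hold node-local bookkeeping and the rest is replicated content. The class and method names are invented for illustration. Hashing only the replicated part means two replicas that differ only in local state produce the same digest.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Simplified, hypothetical model: the first headerLength bytes of a value hold
// node-local bookkeeping; everything after them is replicated content.
final class LocalAgnosticDigestSketch
{
    // Returns an MD5 digest of the value with the node-local prefix skipped.
    static byte[] digestSkippingHeader(ByteBuffer value, int headerLength)
    {
        MessageDigest md;
        try
        {
            md = MessageDigest.getInstance("MD5");
        }
        catch (NoSuchAlgorithmException e)
        {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }

        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer dup = value.duplicate();
        dup.position(dup.position() + headerLength);
        md.update(dup);
        return md.digest();
    }
}
```

With this shape, a value whose local prefix has been cleared hashes the same as the original, which is the property the updated {{CounterCellTest}} asserts for the real implementation.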






[jira] [Commented] (CASSANDRA-13755) dtest failure: repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test

2017-08-10 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122535#comment-16122535
 ] 

Joel Knighton commented on CASSANDRA-13755:
---

Thanks!

> dtest failure: 
> repair_tests.incremental_repair_test:TestIncRepair.consistent_repair_test
> 
>
> Key: CASSANDRA-13755
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13755
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Joel Knighton
>







[jira] [Commented] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121857#comment-16121857
 ] 

Aleksey Yeschenko commented on CASSANDRA-11748:
---

bq. But we should also not forget to look at the receiver side for incoming 
pull requests. Joining the cluster with a schema mismatch should not cause a 
node to answer each of those in parallel.

Good observation, though maybe there is a better solution. I think we shouldn't 
pull schema immediately from a node that just went up (and is potentially 
missing updates).

Schedule that pull with a delay instead, give the new node a chance to pull the 
new schema from one of the nodes in the cluster. It'll most likely converge by 
the time the delay has passed, so we'd just abort the request if schema 
versions now match.
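
A rough sketch of that proposal, for discussion only. It is not taken from {{MigrationManager}}; the class name, the {{schedulePull}} method, and the version suppliers are all hypothetical. The pull is scheduled with a delay, and the versions are compared again when the delay expires so the request is skipped if the schemas have converged in the meantime.

```java
import java.util.UUID;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: delays a schema pull and aborts it if versions converge first.
final class DelayedSchemaPullSketch
{
    private final ScheduledExecutorService executor;
    private final long delayMillis;

    DelayedSchemaPullSketch(ScheduledExecutorService executor, long delayMillis)
    {
        this.executor = executor;
        this.delayMillis = delayMillis;
    }

    // The future yields true if the pull ran, false if it was skipped.
    ScheduledFuture<Boolean> schedulePull(Supplier<UUID> localVersion,
                                          Supplier<UUID> remoteVersion,
                                          Runnable pull)
    {
        return executor.schedule(() -> {
            // Re-read both versions after the delay, not at scheduling time.
            if (localVersion.get().equals(remoteVersion.get()))
                return false; // schemas converged while we waited; nothing to pull

            pull.run();
            return true;
        }, delayMillis, TimeUnit.MILLISECONDS);
    }
}
```

The suppliers matter here: capturing the versions when the task is scheduled would defeat the purpose, since the point is to give the restarted node time to catch up.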

> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17. 
> Here is an outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement (checked via nodetool describecluster).
> 2. Restart a Cassandra node.
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and any 
> node could run into OOM. 
> The following system.log excerpts are from one of our 2-node test clusters:
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 then submits the migration task more than 100 times to the other 
> node.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node 
> /192.168.88.33 has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) 
> Updating topology for /192.168.88.33
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 
> 102) Submitting migration task for /192.168.88.33
> ... ( over 100+ times)
> --
> On the other hand, Node 1 keeps updating its gossip information, and then 
> receives and submits migration tasks: 
> INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 
> 978) InetAddress /192.168.88.34 is now UP
> ...
> DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 
> MigrationRequestVerbHandler.java (line 41) Received migration request from 
> /192.168.88.34.
> …… ( over 100+ times)
> DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 MigrationManager.java (line 
> 127) submitting migration task for /192.168.88.34
> .  (over 50+ times)
> As a side note, we have more than 200 column families defined in the Cassandra 
> database, which may be related to this amount of RPC traffic.
> P.S. 2: The excess schema migration requests eventually have 
> InternalResponseStage performing schema merge operations. Because each merge 
> requires a compaction, merges are consumed much more slowly than they arrive. 
> Thus, the backlog of incoming schema migration content objects consumes all of 
> the heap space and ultimately ends in OOM.






[jira] [Commented] (CASSANDRA-13754) FastThreadLocal leaks memory

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122012#comment-16122012
 ] 

Jeff Jirsa commented on CASSANDRA-13754:


What version are you on [~urandom] (or really, which version of netty is in the 
classpath) ? 


> FastThreadLocal leaks memory
> 
>
> Key: CASSANDRA-13754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13754
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: OpenJDK 8u141-b15
>Reporter: Eric Evans
> Fix For: 3.11.1
>
>
> After a chronic bout of {{OutOfMemoryError}} in our development environment, 
> a heap analysis is showing that more than 10G of our 12G heaps are consumed 
> by the {{threadLocals}} members (instances of {{java.lang.ThreadLocalMap}}) 
> of various {{io.netty.util.concurrent.FastThreadLocalThread}} instances.  
> Reverting 
> [cecbe17|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=cecbe17e3eafc052acc13950494f7dddf026aa54]
>  fixes the issue.






[jira] [Assigned] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-10 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli reassigned CASSANDRA-10726:
-

Assignee: (was: Marcus Eriksson)

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
> Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
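
The second option in the description (return success for the read even if the repair write times out) could look roughly like the sketch below. It is an illustration under assumptions, not the coordinator's actual read path: {{ReadRepairSketch}}, {{completeRead}}, and the use of a {{CompletableFuture}} for the repair acknowledgements are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: wait briefly for repair acks, but never fail the read because of them.
final class ReadRepairSketch
{
    static <T> T completeRead(T resolved, CompletableFuture<Void> repairAcks, long timeoutMillis)
    {
        try
        {
            // Still give the repair writes a chance to land, as the quoted code comment intends.
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException | ExecutionException e)
        {
            // The repair write was slow or failed. The read already has resolved data,
            // so fall through and return it instead of surfacing a timeout.
        }
        catch (InterruptedException e)
        {
            // Preserve the interrupt for callers further up the stack.
            Thread.currentThread().interrupt();
        }
        return resolved;
    }
}
```

This keeps the short wait that limits repeated repairs of the same row, while a replica that is dropping writes can no longer turn every read against that replica set into a timeout.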






[jira] [Commented] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122020#comment-16122020
 ] 

Jeff Jirsa commented on CASSANDRA-13752:


Any additional context you can provide - how old is the cluster? Have you 
changed anything recently? When did you upgrade to 3.11.0? How long before you 
saw those errors? Do you run repairs? Incremental repairs or full? Anything 
else in the logs that looks atypical? 

> Corrupted SSTables created in 3.11
> --
>
> Key: CASSANDRA-13752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Priority: Blocker
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - 
> Cannot read sstable 
> /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db,
>  Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, 
> Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_111]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra     3899251 Aug  7 08:37 mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra          10 Aug  7 08:37 mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra     2930904 Aug  7 08:37 mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra       75880 Aug  7 08:37 mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra       13762 Aug  7 08:37 mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra      882008 Aug  7 08:37 mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra          92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}






cassandra git commit: Fix race / ref leak in PendingRepairManager

2017-08-10 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/trunk ba87ab4e9 -> 9c3354e32


Fix race / ref leak in PendingRepairManager

Patch by Blake Eggleston; Reviewed by Marcus Eriksson for CASSANDRA-13751


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9c3354e3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9c3354e3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9c3354e3

Branch: refs/heads/trunk
Commit: 9c3354e3211c6a3f3982e87477e156c29cd9b7ea
Parents: ba87ab4
Author: Blake Eggleston 
Authored: Tue Aug 8 10:32:35 2017 -0700
Committer: Blake Eggleston 
Committed: Thu Aug 10 12:01:00 2017 -0700

--
 CHANGES.txt |  1 +
 .../compaction/AbstractCompactionStrategy.java  | 29 ++---
 .../compaction/CompactionStrategyManager.java   | 25 +++---
 .../compaction/LeveledCompactionStrategy.java   | 10 +-
 .../db/compaction/PendingRepairManager.java | 34 +---
 .../cassandra/io/sstable/ISSTableScanner.java   | 34 
 .../db/compaction/PendingRepairManagerTest.java | 24 ++
 7 files changed, 113 insertions(+), 44 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c3354e3/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 849848f..e997b50 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
  * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)
  * Improve sstablemetadata output (CASSANDRA-11483)
  * Support for migrating legacy users to roles has been dropped 
(CASSANDRA-13371)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c3354e3/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
index 5333683..f1f42a7 100644
--- 
a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
+++ 
b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionStrategy.java
@@ -293,15 +293,7 @@ public abstract class AbstractCompactionStrategy
 }
 catch (Throwable t)
 {
-try
-{
-new ScannerList(scanners).close();
-}
-catch (Throwable t2)
-{
-t.addSuppressed(t2);
-}
-throw t;
+ISSTableScanner.closeAllAndPropagate(scanners, t);
 }
 return new ScannerList(scanners);
 }
@@ -385,24 +377,7 @@ public abstract class AbstractCompactionStrategy
 
 public void close()
 {
-Throwable t = null;
-for (ISSTableScanner scanner : scanners)
-{
-try
-{
-scanner.close();
-}
-catch (Throwable t2)
-{
-JVMStabilityInspector.inspectThrowable(t2);
-if (t == null)
-t = t2;
-else
-t.addSuppressed(t2);
-}
-}
-if (t != null)
-throw Throwables.propagate(t);
+ISSTableScanner.closeAllAndPropagate(scanners, null);
 }
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9c3354e3/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java
index e58ccc2..6342a1b 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java
@@ -21,7 +21,6 @@ package org.apache.cassandra.db.compaction;
 import java.util.*;
 import java.util.concurrent.Callable;
 import java.util.concurrent.locks.ReentrantReadWriteLock;
-import java.util.function.Predicate;
 import java.util.stream.Collectors;
 import java.util.stream.Stream;
 import java.util.function.Supplier;
@@ -735,7 +734,7 @@ public class CompactionStrategyManager implements 
INotificationConsumer
  * @return
  */
 @SuppressWarnings("resource")
-public AbstractCompactionStrategy.ScannerList 
getScanners(Collection sstables,  Collection 
ranges)
+public 

[jira] [Assigned] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-10 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli reassigned CASSANDRA-10726:
-

Assignee: Marcus Eriksson  (was: Xiaolong Jiang)

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Marcus Eriksson
> Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13363) java.lang.ArrayIndexOutOfBoundsException: null

2017-08-10 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13363:
--
Reviewer: Aleksey Yeschenko

> java.lang.ArrayIndexOutOfBoundsException: null
> --
>
> Key: CASSANDRA-13363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13363
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 6, Cassandra 3.10
>Reporter: Artem Rokhin
>Assignee: zhaoyan
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Constantly see this error in the log without any additional information or a 
> stack trace.
> {code}
> Exception in thread Thread[MessagingService-Incoming-/10.0.1.26,5,main]
> {code}
> {code}
> java.lang.ArrayIndexOutOfBoundsException: null
> {code}
> Logger: org.apache.cassandra.service.CassandraDaemon
> Thread: MessagingService-Incoming-/10.0.1.12
> Method: uncaughtException
> File: CassandraDaemon.java
> Line: 229






[jira] [Updated] (CASSANDRA-13751) Race / ref leak in PendingRepairManager

2017-08-10 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13751:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Got the utest passing. The dtest failures were flaky and succeeded locally. Committed 
as {{9c3354e3211c6a3f3982e87477e156c29cd9b7ea}}

> Race / ref leak in PendingRepairManager
> ---
>
> Key: CASSANDRA-13751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13751
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Minor
> Fix For: 4.0
>
>
> PendingRepairManager#getScanners has an assertion that confirms an sstable 
> is, in fact, marked as pending repair. Since validation compactions don't use 
> the same concurrency controls as proper compactions, they can race with 
> promotion/demotion compactions and end up getting assertion errors when the 
> pending repair id is changed while the scanners are being acquired. Also, 
> error handling in PendingRepairManager and CompactionStrategyManager leaks 
> refs when this happens.
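
The commit for this ticket replaces hand-rolled cleanup loops with calls to {{ISSTableScanner.closeAllAndPropagate(scanners, t)}}. The body of that helper is not shown in the commit message, so the following is only a sketch of the general pattern such a helper could follow, written against plain {{AutoCloseable}} with an invented class name: close every resource even if some closes fail, attach later failures as suppressed exceptions, and rethrow at the end.

```java
import java.util.Collection;

// Hypothetical sketch of a "close everything, then propagate" helper.
final class CloseAllSketch
{
    // original may be null (plain close) or the error that triggered the cleanup.
    static void closeAllAndPropagate(Collection<? extends AutoCloseable> resources, Throwable original)
    {
        Throwable failure = original;
        for (AutoCloseable resource : resources)
        {
            try
            {
                resource.close();
            }
            catch (Throwable t)
            {
                // Keep going so one bad close cannot leak the remaining resources.
                if (failure == null)
                    failure = t;
                else
                    failure.addSuppressed(t);
            }
        }

        if (failure == null)
            return;
        if (failure instanceof RuntimeException)
            throw (RuntimeException) failure;
        if (failure instanceof Error)
            throw (Error) failure;
        throw new RuntimeException(failure);
    }
}
```

Passing the triggering error as {{original}} keeps it as the primary exception, with any close failures recorded as suppressed, which matches how the removed {{catch (Throwable t)}} block in {{AbstractCompactionStrategy}} used {{addSuppressed}}.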






[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-10 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-10726:
--
Reviewer: Marcus Eriksson  (was: Blake Eggleston)

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






[jira] [Assigned] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-10 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli reassigned CASSANDRA-10726:
-

Assignee: Xiaolong Jiang

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






[jira] [Created] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-10 Thread JIRA
Hannu Kröger created CASSANDRA-13752:


 Summary: Corrupted SSTables created in 3.11
 Key: CASSANDRA-13752
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
 Project: Cassandra
  Issue Type: Bug
Reporter: Hannu Kröger
Priority: Blocker


We have discovered issues with corrupted SSTables. 

{code}
ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
java.io.EOFException: EOF after 1898 bytes out of 21093
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
{code}






[jira] [Commented] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121388#comment-16121388
 ] 

Hannu Kröger commented on CASSANDRA-13752:
--

This has happened on 2 servers, for a total of at least 3 sstables so far, and I 
can read those files with unix tools like cat, so it doesn't seem to be a FS 
or HW issue.

> Corrupted SSTables created in 3.11
> --
>
> Key: CASSANDRA-13752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13752
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Hannu Kröger
>Priority: Blocker
>
> We have discovered issues with corrupted SSTables. 
> {code}
> ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
> java.io.EOFException: EOF after 1898 bytes out of 21093
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
> {code}
> Files look like this:
> {code}
> -rw-r--r--. 1 cassandra cassandra     3899251 Aug  7 08:37 mc-6166-big-CompressionInfo.db
> -rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
> -rw-r--r--. 1 cassandra cassandra          10 Aug  7 08:37 mc-6166-big-Digest.crc32
> -rw-r--r--. 1 cassandra cassandra     2930904 Aug  7 08:37 mc-6166-big-Filter.db
> -rw-r--r--. 1 cassandra cassandra       75880 Aug  7 08:37 mc-6166-big-Index.db
> -rw-r--r--. 1 cassandra cassandra       13762 Aug  7 08:37 mc-6166-big-Statistics.db
> -rw-r--r--. 1 cassandra cassandra      882008 Aug  7 08:37 mc-6166-big-Summary.db
> -rw-r--r--. 1 cassandra cassandra          92 Aug  7 08:37 mc-6166-big-TOC.txt
> {code}






[jira] [Created] (CASSANDRA-13753) The documentation website can be fitted well on device width.

2017-08-10 Thread Ashish Tomer (JIRA)
Ashish Tomer created CASSANDRA-13753:


 Summary: The documentation website can be fitted well on device 
width.
 Key: CASSANDRA-13753
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13753
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation and Website
 Environment: *Operating System : *Ubuntu
*Browsers: *
* Firefox
* Google Chrome
Reporter: Ashish Tomer
 Fix For: 4.x


The following shortcomings are noticed on the pages of the Cassandra 
documentation website ([http://cassandra.apache.org/doc/latest/]):
*1.* On a laptop screen with resolution 1366 x 768, the webpage is wider than 
the screen. The content runs off the screen and the user has to scroll 
horizontally to read the lines. The horizontal scrollbar at the bottom needs to 
be removed.
*2.* When some pages are scrolled down, the whole page flickers and jumps back 
to the top of the page. {color:red}Example link - 
{color}[http://cassandra.apache.org/doc/latest/architecture/overview.html]
*3.* The website is not mobile friendly and can be made responsive.






[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files

2017-08-10 Thread Hansey Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121904#comment-16121904
 ] 

Hansey Chen commented on CASSANDRA-12728:
-

I was looking at this issue and could not understand one of the effects of this 
bug.

Garvit Juniwal mentioned in 
[one|https://issues.apache.org/jira/browse/CASSANDRA-12728?focusedCommentId=15576548=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576548]
 of his comments that this bug will "put cassandra in a crash loop". Also 
Harikrishnan said in [a related 
issue|https://issues.apache.org/jira/browse/CASSANDRA-12844] that this bug 
crashed many nodes.

But I cannot figure out how an EOFException during hinted handoff can crash a 
Cassandra node. Is it only crashing the hints dispatching thread? And how can it 
affect other nodes?

Could anyone please explain a little bit more? Many thanks in advance.

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sharvanath Pathak
>Assignee: Garvit Juniwal
>  Labels: lhf
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.0.6.jar:3.0.6]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard 
> reboot 

[jira] [Commented] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Matt Byrd (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122194#comment-16122194
 ] 

Matt Byrd commented on CASSANDRA-11748:
---

{quote}
But we should at least take the schema Ids and/or endpoints into account as 
well. It just doesn't make sense to queue 50 requests for the same schema Id 
and potentially drop requests for a different schema afterwards.
{quote}
Yes, I did also have a patch with an expiring map of schema-version to counter 
and was limiting it per schema version, but decided to keep it simple, since 
the single limit sufficed for a particular scenario. Less relevant, but it also 
provides some protection in the rather strange case that there are actually 
lots of different schema versions in the cluster. I could resurrect the schema 
version patch, but it sounds like we're considering a slightly different 
approach.
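
For readers who want to picture the per-schema-version limit mentioned above, here is a minimal sketch. It is not the patch itself, which is not shown in this thread: the class name {{PerVersionPullLimiter}} and its methods are hypothetical, and the expiry of old map entries that the original patch reportedly had is left out to keep the example short.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Cassandra: caps the number of in-flight
// schema pulls per schema version instead of using one global limit.
final class PerVersionPullLimiter
{
    private final int maxPerVersion;
    private final ConcurrentHashMap<UUID, AtomicInteger> inflight = new ConcurrentHashMap<>();

    PerVersionPullLimiter(int maxPerVersion)
    {
        this.maxPerVersion = maxPerVersion;
    }

    // Returns true if a pull for this schema version may be submitted.
    // The caller is expected to call release() once the pull completes.
    boolean tryAcquire(UUID schemaVersion)
    {
        AtomicInteger counter = inflight.computeIfAbsent(schemaVersion, v -> new AtomicInteger());
        while (true)
        {
            int current = counter.get();
            if (current >= maxPerVersion)
                return false;
            if (counter.compareAndSet(current, current + 1))
                return true;
        }
    }

    // Frees one slot for this schema version; never drops below zero.
    void release(UUID schemaVersion)
    {
        AtomicInteger counter = inflight.get(schemaVersion);
        if (counter != null)
            counter.updateAndGet(c -> c > 0 ? c - 1 : 0);
    }
}
```

With a limit of this shape, many requests for one schema version cannot crowd out a request for a different version, which is the concern raised in the quoted text.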

{quote}
Schedule that pull with a delay instead, give the new node a chance to pull the 
new schema from one of the nodes in the cluster. It'll most likely converge by 
the time the delay has passed, so we'd just abort the request if schema 
versions now match.
{quote}
Once a node has been up for MIGRATION_DELAY_IN_MS and doesn't have an empty 
schema, it will always schedule the task to pull schema with a delay of 
MIGRATION_DELAY_IN_MS and then do a further check within the task itself to see 
if the schema versions still differ before asking for schema.
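
The delayed pull with a re-check described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in {{MigrationManager}}: the class {{DelayedSchemaPull}}, the two version suppliers and the {{pullSchema}} callback are hypothetical names, and {{delayMs}} stands in for MIGRATION_DELAY_IN_MS.

```java
import java.util.UUID;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: schedule a schema pull with a delay and skip it
// if the versions have already converged by the time the task runs.
final class DelayedSchemaPull
{
    // The returned future yields true if the pull was actually issued,
    // false if the versions matched when the delay elapsed.
    static ScheduledFuture<Boolean> schedule(ScheduledExecutorService executor,
                                             long delayMs,
                                             Supplier<UUID> localVersion,
                                             Supplier<UUID> remoteVersion,
                                             Runnable pullSchema)
    {
        return executor.schedule(() -> {
            // Re-read both versions inside the task, not at scheduling time.
            if (localVersion.get().equals(remoteVersion.get()))
                return false;
            pullSchema.run();
            return true;
        }, delayMs, TimeUnit.MILLISECONDS);
    }
}
```

The important detail is that the comparison happens inside the task, so a node that converged during the delay never sends the request.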

Though admittedly this problem does still exist if two nodes start up at the 
same time, since they may pull from each other.
I suppose we're going to schedule a pull from a newer node too, so assuming 
we successively merge the schemas together, we hopefully end up at the final 
desired state. In the interim it's possible that a node might come into play 
with a slightly older schema, but that can happen whenever a DOWN node comes 
up with an out-of-date schema.

It's also possible that if the node is overwhelmed by the reverse problem, 
it won't have made it to the correct schema version within MIGRATION_DELAY_IN_MS 
and hence will start sending its old schema back to all the other nodes in the 
cluster. Fortunately the sending happens on the migration stage, so it is single 
threaded and less likely to cause OOMs. 



> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17. 
> Here is an outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement (checked via nodetool describecluster).
> 2. Restart a Cassandra node.
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and 
> any node could run into OOM. 
> The following is the system.log output from one of our 2-node test clusters:
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 keeps submitting the migration task to the other node, more than 100 
> times.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node 
> /192.168.88.33 has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) 
> Updating topology for /192.168.88.33
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 
> 102) Submitting migration task for /192.168.88.33
> ... ( over 100+ times)
> --
> On the other hand, Node 1 keeps updating its gossip information, followed by 
> receiving and submitting migrationTask afterwards: 
> INFO [RequestResponseStage:3] 2016-04-19 

[jira] [Updated] (CASSANDRA-13655) Range deletes in a CAS batch are ignored

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13655:
---
Priority: Blocker  (was: Critical)

> Range deletes in a CAS batch are ignored
> 
>
> Key: CASSANDRA-13655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13655
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Range deletes in a CAS batch are ignored 






[jira] [Comment Edited] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Michael Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122718#comment-16122718
 ] 

Michael Fong edited comment on CASSANDRA-11748 at 8/11/17 2:34 AM:
---

Hi, guys, 

Thanks for putting some time into this issue; this is an awesome discussion 
thread. 

When we reported this issue a year ago, we ended up patching C* (v2.0) with an 
approach similar to CASSANDRA-13569, but we later found that it did not address 
the root problem and only led to stacking more patches on top of one another as 
time went by. In my humble opinion, I am not sure we want many more types of 
soft/hard caps to reduce the risk of running into OOM. Instead, we could probably 
look deeper into the causes behind the current working model, such as: 
1. Migration checks and requests are fired asynchronously, and the messages 
finally stack up at the receiver end, which merges the schema one by one at 
{code:java}Schema.instance.mergeAndAnnounceVersion(){code}
2. The sender sends the receiver a complete copy of the schema, instead of a 
delta computed from the diff between the two nodes.
3. Last but not least, the most mysterious problem that leads to OOM, which we 
could not explain back then, is that hundreds of migration tasks are all fired 
nearly simultaneously, within 2 s. The number of RPCs does not match the number 
of nodes in the cluster, but is close to the number of seconds taken for the 
node to reboot. 

Maybe there are other tickets working to address these items already, which I 
may not know of. 

Thanks.

Michael Fong


was (Author: mcfongtw):
Hi, guys, 

Thanks for putting some time on this issue, and this is an awesome discussion 
thread. 

When we reported this issue a year ago, we ended up patching the C* (v2.0) with 
similar approach to CASSANDRA-13569, but later we found it was not addressing 
the root problem but putting more patches on top of one another as time goes 
by. In my humble opinion, I am not sure if we want to have many more types of 
soft/hard caps to reduce risks of running into OOM. Instead, we could probably 
look deeper into causes behind the current working model, such as 
1. Have migration checks and requests fired asynchronously and finally stack up 
the all message at the receiver end merge the schema one-by-one at 
{code:java}
Schema.instance.mergeAndAnnounceVersion()
{code}
2. Send the receiver the complete copy of schema, instead of delta copy of 
schema out of diff between two nodes.
3. Last but not least, the most mysterious problem that leads to OOM and  we 
could not figure out why back then, is that there are hundreds of migration 
task all fired nearly simultaneously,  within 2 s. The number of rpcs does not 
match with the nodes in cluster, but is close to number of second taken for the 
node to reboot. 

Maybe there are other tickets working to address these items already, which I 
may not know. 

Thanks.

Michael Fong

> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17. 
> Here is an outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement (checked via nodetool describecluster).
> 2. Restart a Cassandra node.
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and 
> any node could run into OOM. 
> The following is the system.log output from one of our 2-node test clusters:
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> The node2  keeps submitting the migration task over 100+ times to the 

[jira] [Commented] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Michael Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122718#comment-16122718
 ] 

Michael Fong commented on CASSANDRA-11748:
--

Hi, guys, 

Thanks for putting some time into this issue; this is an awesome discussion 
thread. 

When we reported this issue a year ago, we ended up patching C* (v2.0) with an 
approach similar to CASSANDRA-13569, but we later found that it did not address 
the root problem and only led to stacking more patches on top of one another as 
time went by. In my humble opinion, I am not sure we want many more types of 
soft/hard caps to reduce the risk of running into OOM. Instead, we could probably 
look deeper into the causes behind the current working model, such as: 
1. Migration checks and requests are fired asynchronously, and the messages 
finally stack up at the receiver end, which merges the schema one by one at 
{code:java}
Schema.instance.mergeAndAnnounceVersion()
{code}
2. The sender sends the receiver a complete copy of the schema, instead of a 
delta computed from the diff between the two nodes.
3. Last but not least, the most mysterious problem that leads to OOM, which we 
could not explain back then, is that hundreds of migration tasks are all fired 
nearly simultaneously, within 2 s. The number of RPCs does not match the number 
of nodes in the cluster, but is close to the number of seconds taken for the 
node to reboot. 

Maybe there are other tickets working to address these items already, which I 
may not know of. 

Thanks.

Michael Fong

> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17. 
> Here is an outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement (checked via nodetool describecluster).
> 2. Restart a Cassandra node.
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and 
> any node could run into OOM. 
> The following is the system.log output from one of our 2-node test clusters:
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 keeps submitting the migration task to the other node, more than 100 
> times.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node 
> /192.168.88.33 has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) 
> Updating topology for /192.168.88.33
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 
> 102) Submitting migration task for /192.168.88.33
> ... ( over 100+ times)
> --
> On the other hand, Node 1 keeps updating its gossip information, followed by 
> receiving and submitting migrationTask afterwards: 
> INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 
> 978) InetAddress /192.168.88.34 is now UP
> ...
> DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 
> MigrationRequestVerbHandler.java (line 41) Received migration request from 
> /192.168.88.34.
> …… ( over 100+ times)
> DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 MigrationManager.java (line 
> 127) submitting migration task for /192.168.88.34
> .  (over 50+ times)
> As a side note, we have over 200 column families defined in the Cassandra 
> database, which may be related to this amount of RPC traffic.
> P.S.2 The excess schema migration requests will eventually have 
> InternalResponseStage performing schema merge operations. This operation 
> requires a compaction for each merge and is much slower to consume. Thus, the 
> 

[jira] [Comment Edited] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Michael Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122718#comment-16122718
 ] 

Michael Fong edited comment on CASSANDRA-11748 at 8/11/17 2:35 AM:
---

Hi, guys, 

Thanks for putting some time into this issue; this is an awesome discussion 
thread. 

When we reported this issue a year ago, we ended up patching C* (v2.0) with an 
approach similar to CASSANDRA-13569, but we later found that it did not address 
the root problem and only led to stacking more patches on top of one another as 
time went by. In my humble opinion, I am not sure we want many more types of 
soft/hard caps to reduce the risk of running into OOM. Instead, we could probably 
look deeper into the causes behind the current working model, such as: 
1. Migration checks and requests are fired asynchronously, and the messages 
finally stack up at the receiver end, which merges the schema one by one at 
{code:java}Schema.instance.mergeSchemaAndAnnounceVersion(){code}
2. The sender sends the receiver a complete copy of the schema, instead of a 
delta computed from the diff between the two nodes.
3. Last but not least, the most mysterious problem that leads to OOM, which we 
could not explain back then, is that hundreds of migration tasks are all fired 
nearly simultaneously, within 2 s. The number of RPCs does not match the number 
of nodes in the cluster, but is close to the number of seconds taken for the 
node to reboot. 

Maybe there are other tickets working to address these items already, which I 
may not know of. 

Thanks.

Michael Fong


was (Author: mcfongtw):
Hi, guys, 

Thanks for putting some time on this issue, and this is an awesome discussion 
thread. 

When we reported this issue a year ago, we ended up patching the C* (v2.0) with 
similar approach to CASSANDRA-13569, but later we found it was not addressing 
the root problem but putting more patches on top of one another as time goes 
by. In my humble opinion, I am not sure if we want to have many more types of 
soft/hard caps to reduce risks of running into OOM. Instead, we could probably 
look deeper into causes behind the current working model, such as 
1. Have migration checks and requests fired asynchronously and finally stack up 
the all message at the receiver end merge the schema one-by-one at 
{code:java}Schema.instance.mergeAndAnnounceVersion(){code}
2. Send the receiver the complete copy of schema, instead of delta copy of 
schema out of diff between two nodes.
3. Last but not least, the most mysterious problem that leads to OOM and  we 
could not figure out why back then, is that there are hundreds of migration 
task all fired nearly simultaneously,  within 2 s. The number of rpcs does not 
match with the nodes in cluster, but is close to number of second taken for the 
node to reboot. 

Maybe there are other tickets working to address these items already, which I 
may not know. 

Thanks.

Michael Fong

> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade from 1.2.19 to 2.0.17. 
> Here is an outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement (checked via nodetool describecluster).
> 2. Restart a Cassandra node.
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and 
> any node could run into OOM. 
> The following is the system.log output from one of our 2-node test clusters:
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> The node2  keeps submitting the migration task over 100+ times to the 

[jira] [Assigned] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-13743:
--

Assignee: Corentin Chary

> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.x
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.






[jira] [Commented] (CASSANDRA-13744) Better bootstrap failure message when blocked by (potential) range movement

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122628#comment-16122628
 ] 

Jeff Jirsa commented on CASSANDRA-13744:


I'll claim it, not only because I'm first, but because I'm not sure if Jason's 
+1 carries through on the new tests.


> Better bootstrap failure message when blocked by (potential) range movement
> ---
>
> Key: CASSANDRA-13744
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13744
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Assignee: mck
>Priority: Trivial
> Fix For: 3.11.x, 4.x
>
>
> The UnsupportedOperationException thrown from 
> {{StorageService.joinTokenRing(..)}} when it's detected that other nodes are 
> bootstrapping|leaving|moving offers no information as to which nodes those 
> are.
> In a large cluster this might be neither obvious nor easy to discover; gossipinfo 
> can hold the information, but it takes a bit of effort to uncover. Even when it 
> is easily seen, it's helpful to have it confirmed.
> Attached is the patch that provides a more thorough exception message to the 
> failed bootstrap attempt.






[jira] [Updated] (CASSANDRA-13744) Better bootstrap failure message when blocked by (potential) range movement

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13744:
---
Reviewer: Jeff Jirsa

> Better bootstrap failure message when blocked by (potential) range movement
> ---
>
> Key: CASSANDRA-13744
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13744
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Assignee: mck
>Priority: Trivial
> Fix For: 3.11.x, 4.x
>
>
> The UnsupportedOperationException thrown from 
> {{StorageService.joinTokenRing(..)}} when it's detected that other nodes are 
> bootstrapping|leaving|moving offers no information as to which those 
> other nodes are.
> In a large cluster this might be neither obvious nor easy to discover; gossipinfo 
> can hold information that takes a bit of effort to uncover. Even when it is 
> easily seen, it's helpful to have it confirmed.
> Attached is the patch that provides a more thorough exception message for the 
> failed bootstrap attempt.






cassandra git commit: cqlsh: don't pause when capturing data

2017-08-10 Thread jjirsa
Repository: cassandra
Updated Branches:
  refs/heads/trunk 4b736366c -> ed0243954


cqlsh: don't pause when capturing data

Patch by Corentin Chary; Reviewed by Chris Lohfink for CASSANDRA-13473


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ed024395
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ed024395
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ed024395

Branch: refs/heads/trunk
Commit: ed0243954f9ab9c5c68a4516a836ab3710891d5b
Parents: 4b73636
Author: Corentin Chary 
Authored: Fri Aug 4 10:19:57 2017 +0200
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:02:31 2017 -0700

--
 CHANGES.txt  | 1 +
 bin/cqlsh.py | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed024395/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 808665a..5c6994a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -107,6 +107,7 @@
  * Nodetool repair can hang forever if we lose the notification for the repair 
completing/failing (CASSANDRA-13480)
  * Anticompaction can cause noisy log messages (CASSANDRA-13684)
  * Switch to client init for sstabledump (CASSANDRA-13683)
+ * CQLSH: Don't pause when capturing data (CASSANDRA-13473)
 
 
 3.11.1

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed024395/bin/cqlsh.py
--
diff --git a/bin/cqlsh.py b/bin/cqlsh.py
index 4e634ca..2e10490 100644
--- a/bin/cqlsh.py
+++ b/bin/cqlsh.py
@@ -1084,7 +1084,9 @@ class Shell(cmd.Cmd):
                     num_rows += len(result.current_rows)
                     self.print_static_result(result, table_meta)
                 if result.has_more_pages:
-                    raw_input("---MORE---")
+                    if self.shunted_query_out is None:
+                        # Only pause when not capturing.
+                        raw_input("---MORE---")
                     result.fetch_next_page()
                 else:
                     break
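
The behaviour this change introduces (pause between pages only when output is not being captured) can be sketched outside cqlsh as follows. This is a minimal standalone example in Python 3, not the cqlsh code itself: the function drain_pages, its parameters, and the list-of-lists stand-in for a paged result are assumptions made for illustration.

```python
def drain_pages(pages, capture_out=None, pause=input):
    """Emit every row of every page, pausing between pages only when not capturing.

    pages       -- list of pages, each a list of rows (hypothetical stand-in for a paged result)
    capture_out -- list that receives rows when output is being captured, else None
    pause       -- callable invoked with the prompt text; defaults to the built-in input()
    """
    num_rows = 0
    for index, page in enumerate(pages):
        for row in page:
            if capture_out is None:
                print(row)
            else:
                capture_out.append(row)
        num_rows += len(page)
        has_more_pages = index < len(pages) - 1
        if has_more_pages and capture_out is None:
            # Only pause when not capturing.
            pause("---MORE---")
    return num_rows
```

Passing the pause callable as a parameter keeps the sketch testable without a terminal; in an interactive session the default input() blocks until the user presses Enter.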





[jira] [Updated] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13743:
---
   Resolution: Fixed
 Reviewer: Chris Lohfink
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Thanks guys, committed as {{ed0243954f9ab9c5c68a4516a836ab3710891d5b}}



> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.0
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.






[jira] [Updated] (CASSANDRA-13743) CAPTURE not easilly usable with PAGING

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13743:
---
Status: Ready to Commit  (was: Patch Available)

> CAPTURE not easilly usable with PAGING
> --
>
> Key: CASSANDRA-13743
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13743
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Corentin Chary
>Assignee: Corentin Chary
> Fix For: 4.x
>
>
> See 
> https://github.com/iksaif/cassandra/commit/7ed56966a7150ced44c375af307685517d7e09a3
>  for a patch fixing that.






[jira] [Commented] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122640#comment-16122640
 ] 

Jeff Jirsa commented on CASSANDRA-13664:


This is marked ready to commit - is it good to go? That dtest run has already 
expired.


> RangeFetchMapCalculator should not try to optimise 'trivial' ranges
> ---
>
> Key: CASSANDRA-13664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13664
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams 
> out of each node as even as possible.
> In a typical multi-dc ring the nodes in the dcs are set up using token + 1, 
> creating many tiny ranges. If we only try to optimise over the number of 
> streams, it is likely that the amount of data streamed out of each node is 
> unbalanced.
> We should ignore those trivial ranges and only optimise the big ones, then 
> share the tiny ones over the nodes.
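
The proposal above (optimise only the big ranges, then share the tiny ones over the nodes) can be sketched in Python as follows. This is a simplified illustration, not the RangeFetchMapCalculator algorithm: the size threshold, the greedy least-loaded placement of big ranges, and the assumption that any node can serve any range are all choices made for this example.

```python
from itertools import cycle


def assign_ranges(range_sizes, nodes, trivial_threshold):
    """Assign ranges to source nodes, balancing only the non-trivial ones.

    range_sizes       -- dict mapping a range name to its size (hypothetical units)
    nodes             -- list of node names; every node is assumed able to serve every range
    trivial_threshold -- ranges smaller than this are treated as trivial
    """
    assignment = {node: [] for node in nodes}
    load = {node: 0 for node in nodes}
    big = {name: size for name, size in range_sizes.items() if size >= trivial_threshold}
    trivial = sorted(name for name, size in range_sizes.items() if size < trivial_threshold)

    # Place big ranges largest-first on the currently least-loaded node.
    for name in sorted(big, key=lambda r: (-big[r], r)):
        target = min(nodes, key=lambda n: (load[n], n))
        assignment[target].append(name)
        load[target] += big[name]

    # Share the trivial ranges round-robin, without letting them affect balancing.
    for name, node in zip(trivial, cycle(nodes)):
        assignment[node].append(name)
    return assignment
```

The point of the split is that counting a tiny range as one full stream would distort the balance of the data actually streamed.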






[3/6] cassandra git commit: sstabledump reports incorrect usage for argument order

2017-08-10 Thread jjirsa
sstabledump reports incorrect usage for argument order

Patch by Varun Barala; Reviewed by ZhaoYang for CASSANDRA-13532


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fab38456
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fab38456
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fab38456

Branch: refs/heads/trunk
Commit: fab384560311ec1f3043fbf6137093ea129afa68
Parents: eb6f03c
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:11:56 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:11:56 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0b92a7e..2e9e8ad 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -17,6 +17,7 @@
  * Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
  * Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
  * Nodetool listsnapshots output is missing a newline, if there are no 
snapshots (CASSANDRA-13568)
+ * sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
  Merged from 2.2:
  * Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
  * Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index cff1516..ac8ea61 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -254,7 +254,7 @@ public class SSTableExport
 
 private static void printUsage()
 {
-String usage = String.format("sstabledump  %n");
+String usage = String.format("sstabledump  
%n");
 String header = "Dump contents of given SSTable to standard output in 
JSON format.";
 new HelpFormatter().printHelp(usage, header, options, "");
 }





[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-10 Thread jjirsa
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1884dbe2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1884dbe2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1884dbe2

Branch: refs/heads/trunk
Commit: 1884dbe288bf53af9359e4e1ad9e1cfc0d212f0c
Parents: e018bec fab3845
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:12:49 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:13:24 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1884dbe2/CHANGES.txt
--
diff --cc CHANGES.txt
index 3308287,2e9e8ad..c672675
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -18,9 -14,11 +18,10 @@@ Merged from 3.0
   * Make concat work with iterators that have different subsets of columns 
(CASSANDRA-13482)
   * Set test.runners based on cores and memory size (CASSANDRA-13078)
   * Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
 - * Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
   * Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
   * Nodetool listsnapshots output is missing a newline, if there are no 
snapshots (CASSANDRA-13568)
+  * sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
 - Merged from 2.2:
 +Merged from 2.2:
   * Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
   * Fix queries with LIMIT and filtering on clustering columns 
(CASSANDRA-11223)
   * Fix potential NPE when resume bootstrap fails (CASSANDRA-13272)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1884dbe2/src/java/org/apache/cassandra/tools/SSTableExport.java
--





[2/6] cassandra git commit: sstabledump reports incorrect usage for argument order

2017-08-10 Thread jjirsa
sstabledump reports incorrect usage for argument order

Patch by Varun Barala; Reviewed by ZhaoYang for CASSANDRA-13532


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fab38456
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fab38456
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fab38456

Branch: refs/heads/cassandra-3.11
Commit: fab384560311ec1f3043fbf6137093ea129afa68
Parents: eb6f03c
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:11:56 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:11:56 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0b92a7e..2e9e8ad 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -17,6 +17,7 @@
  * Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
  * Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
  * Nodetool listsnapshots output is missing a newline, if there are no 
snapshots (CASSANDRA-13568)
+ * sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
  Merged from 2.2:
  * Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
  * Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index cff1516..ac8ea61 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -254,7 +254,7 @@ public class SSTableExport
 
 private static void printUsage()
 {
-String usage = String.format("sstabledump  %n");
+String usage = String.format("sstabledump  
%n");
 String header = "Dump contents of given SSTable to standard output in 
JSON format.";
 new HelpFormatter().printHelp(usage, header, options, "");
 }





[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-08-10 Thread jjirsa
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d68357a4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d68357a4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d68357a4

Branch: refs/heads/trunk
Commit: d68357a447a5296e8c5dfd097fbc092e586819fe
Parents: ed02439 1884dbe
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:13:45 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:14:14 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d68357a4/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d68357a4/src/java/org/apache/cassandra/tools/SSTableExport.java
--





[1/6] cassandra git commit: sstabledump reports incorrect usage for argument order

2017-08-10 Thread jjirsa
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 eb6f03c89 -> fab384560
  refs/heads/cassandra-3.11 e018bec8a -> 1884dbe28
  refs/heads/trunk ed0243954 -> d68357a44


sstabledump reports incorrect usage for argument order

Patch by Varun Barala; Reviewed by ZhaoYang for CASSANDRA-13532


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fab38456
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fab38456
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fab38456

Branch: refs/heads/cassandra-3.0
Commit: fab384560311ec1f3043fbf6137093ea129afa68
Parents: eb6f03c
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:11:56 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:11:56 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0b92a7e..2e9e8ad 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -17,6 +17,7 @@
  * Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
  * Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
  * Nodetool listsnapshots output is missing a newline, if there are no 
snapshots (CASSANDRA-13568)
+ * sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
  Merged from 2.2:
  * Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
  * Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fab38456/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index cff1516..ac8ea61 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -254,7 +254,7 @@ public class SSTableExport
 
 private static void printUsage()
 {
-String usage = String.format("sstabledump  %n");
+String usage = String.format("sstabledump  
%n");
 String header = "Dump contents of given SSTable to standard output in 
JSON format.";
 new HelpFormatter().printHelp(usage, header, options, "");
 }





[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-10 Thread jjirsa
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1884dbe2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1884dbe2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1884dbe2

Branch: refs/heads/cassandra-3.11
Commit: 1884dbe288bf53af9359e4e1ad9e1cfc0d212f0c
Parents: e018bec fab3845
Author: Jeff Jirsa 
Authored: Thu Aug 10 18:12:49 2017 -0700
Committer: Jeff Jirsa 
Committed: Thu Aug 10 18:13:24 2017 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/tools/SSTableExport.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1884dbe2/CHANGES.txt
--
diff --cc CHANGES.txt
index 3308287,2e9e8ad..c672675
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -18,9 -14,11 +18,10 @@@ Merged from 3.0
   * Make concat work with iterators that have different subsets of columns 
(CASSANDRA-13482)
   * Set test.runners based on cores and memory size (CASSANDRA-13078)
   * Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
 - * Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
   * Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
   * Nodetool listsnapshots output is missing a newline, if there are no 
snapshots (CASSANDRA-13568)
+  * sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
 - Merged from 2.2:
 +Merged from 2.2:
   * Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
   * Fix queries with LIMIT and filtering on clustering columns 
(CASSANDRA-11223)
   * Fix potential NPE when resume bootstrap fails (CASSANDRA-13272)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1884dbe2/src/java/org/apache/cassandra/tools/SSTableExport.java
--





[jira] [Updated] (CASSANDRA-13532) sstabledump reports incorrect usage for argument order

2017-08-10 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13532:
---
   Resolution: Fixed
Fix Version/s: 4.0
   3.11.1
   3.0.15
   Status: Resolved  (was: Ready to Commit)

Thanks all! Committed as {{fab384560311ec1f3043fbf6137093ea129afa68}}


> sstabledump reports incorrect usage for argument order
> --
>
> Key: CASSANDRA-13532
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13532
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Ian Ilsley
>Assignee: Varun Barala
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.15, 3.11.1, 4.0
>
> Attachments: sstabledump#printUsage.patch
>
>
> sstabledump usage reports 
> {{usage: sstabledump  }}
> However the actual usage is 
> {{sstabledump   }}






[jira] [Created] (CASSANDRA-13756) StreamingHistogram is not thread safe

2017-08-10 Thread xiangzhou xia (JIRA)
xiangzhou xia created CASSANDRA-13756:
-

 Summary: StreamingHistogram is not thread safe
 Key: CASSANDRA-13756
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13756
 Project: Cassandra
  Issue Type: Bug
Reporter: xiangzhou xia


The optimization in CASSANDRA-13038 leads to a spool flush every time we call 
sum(). Since TreeMap is not thread safe, threads get stuck when multiple 
threads call sum() at the same time, and eventually 100% of CPU is spent in 
that function. 

I think this issue is not limited to sum(); update() and merge() have the 
same issue, since they all need to update the TreeMap. 

Adding a lock to bin solved this issue, but it also introduced extra overhead.
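
The locking approach mentioned above can be sketched in Python as follows. This is an illustrative analog, not Cassandra's StreamingHistogram: the class LockedHistogram, its fields, and the simplified count_up_to() (standing in for sum(), without any interpolation or bin merging) are assumptions made for this example.

```python
import threading


class LockedHistogram:
    """Toy histogram whose spool and bins are guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._spool = {}  # pending point -> count, not yet merged into the bins
        self._bins = {}   # merged point -> count

    def update(self, point, count=1):
        # Writers take the lock, so they never race with a flush.
        with self._lock:
            self._spool[point] = self._spool.get(point, 0) + count

    def _flush_locked(self):
        # Caller must hold the lock: merge the spool into the bins.
        for point, count in self._spool.items():
            self._bins[point] = self._bins.get(point, 0) + count
        self._spool.clear()

    def count_up_to(self, bound):
        # Reads flush the spool first, which mutates shared state,
        # so they need the same lock as the writers.
        with self._lock:
            self._flush_locked()
            return sum(count for point, count in self._bins.items() if point <= bound)
```

Because every read also flushes, readers and writers serialize on the same lock; that is the extra overhead the report refers to.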






[jira] [Updated] (CASSANDRA-13752) Corrupted SSTables created in 3.11

2017-08-10 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannu Kröger updated CASSANDRA-13752:
-
Description: 
We have discovered issues with corrupted SSTables. 

{code}
ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
java.io.EOFException: EOF after 1898 bytes out of 21093
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:93) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:488) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:396) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.format.SSTableReader$5.run(SSTableReader.java:561) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
{code}

Files look like this:
{code}
-rw-r--r--. 1 cassandra cassandra     3899251 Aug  7 08:37 mc-6166-big-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra 16874421686 Aug  7 08:37 mc-6166-big-Data.db
-rw-r--r--. 1 cassandra cassandra          10 Aug  7 08:37 mc-6166-big-Digest.crc32
-rw-r--r--. 1 cassandra cassandra     2930904 Aug  7 08:37 mc-6166-big-Filter.db
-rw-r--r--. 1 cassandra cassandra       75880 Aug  7 08:37 mc-6166-big-Index.db
-rw-r--r--. 1 cassandra cassandra       13762 Aug  7 08:37 mc-6166-big-Statistics.db
-rw-r--r--. 1 cassandra cassandra      882008 Aug  7 08:37 mc-6166-big-Summary.db
-rw-r--r--. 1 cassandra cassandra          92 Aug  7 08:37 mc-6166-big-TOC.txt
{code}

  was:
We have discovered issues with corrupted SSTables. 

{code}
ERROR [SSTableBatchOpen:22] 2017-08-03 20:19:53,195 SSTableReader.java:577 - Cannot read sstable /cassandra/data/mykeyspace/mytable-7a4992800d5611e7b782cb90016f2d17/mc-35556-big=[Data.db, Statistics.db, Summary.db, Digest.crc32, CompressionInfo.db, TOC.txt, Index.db, Filter.db]; other IO error, skipping table
java.io.EOFException: EOF after 1898 bytes out of 21093
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:325) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.StatsMetadata$StatsMetadataSerializer.deserialize(StatsMetadata.java:231) ~[apache-cassandra-3.11.0.jar:3.11.0]
    at org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:122) ~[apache-cassandra-3.11.0.jar:3.11.0]
at 

[jira] [Commented] (CASSANDRA-13723) fix exception logging that should be consumed by placeholder to 'getMessage()' for new slf4j version

2017-08-10 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121385#comment-16121385
 ] 

ZhaoYang commented on CASSANDRA-13723:
--

Thank you.

> fix exception logging that should be consumed by placeholder to 
> 'getMessage()' for new slf4j version
> 
>
> Key: CASSANDRA-13723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
> Fix For: 4.0
>
> Attachments: CASSANDRA-13723.patch
>
>
> The wrong tracing log will fail 
> {{materialized_views_test.py:TestMaterializedViews.view_tombstone_test}} and 
> impact clients.
> Current log: {{Digest mismatch: {} on 127.0.0.1}}
> Expected log: {{Digest mismatch: 
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey... on 127.0.0.1}}






[jira] [Comment Edited] (CASSANDRA-13717) INSERT statement fails when Tuple type is used as clustering column with default DESC order

2017-08-10 Thread Stavros Kontopoulos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121472#comment-16121472
 ] 

Stavros Kontopoulos edited comment on CASSANDRA-13717 at 8/10/17 11:14 AM:
---

[~jjirsa] I fixed it for trunk (version 4). I could backport it to 3.11 
(version reported) as soon as it is verified that this fix is ok.
Good to know about the test procedure, thanx a lot! I will check the unit tests.


was (Author: skonto):
[~jjirsa] I fixed it for trunk (version 4). I could backport it to 3.11 
(version reported) as soon as it is verified that this fix is ok.
Good to know about the test procedure thanx a lot. I will check the unit tests.

> INSERT statement fails when Tuple type is used as clustering column with 
> default DESC order
> ---
>
> Key: CASSANDRA-13717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13717
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.11
>Reporter: Anastasios Kichidis
>Assignee: Stavros Kontopoulos
>Priority: Critical
> Attachments: example_queries.cql, fix_13717
>
>
> When a column family is created and a Tuple is used on clustering column with 
> default clustering order DESC, then the INSERT statement fails. 
> For example, the following table will make the INSERT statement fail with 
> error message "Invalid tuple type literal for tdemo of type 
> frozen>" , although the INSERT statement is correct 
> (works as expected when the default order is ASC)
> {noformat}
> create table test_table (
>   id int,
>   tdemo tuple,
>   primary key (id, tdemo)
> ) with clustering order by (tdemo desc);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13717) INSERT statement fails when Tuple type is used as clustering column with default DESC order

2017-08-10 Thread Stavros Kontopoulos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121472#comment-16121472
 ] 

Stavros Kontopoulos edited comment on CASSANDRA-13717 at 8/10/17 11:13 AM:
---

[~jjirsa] I fixed it for trunk (version 4). I could backport it to 3.11 
(version reported) as soon as it is verified that this fix is ok.
Good to know about the test procedure thanx a lot. I will check the unit tests.


was (Author: skonto):
[~jjirsa] I fixed for trunk (version 4). I could backport it to 3.11 (version 
reported) as soon as it is verified that this fix is ok.
Good to know about the test procedure thanx a lot. I will check the unit tests.

> INSERT statement fails when Tuple type is used as clustering column with 
> default DESC order
> ---
>
> Key: CASSANDRA-13717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13717
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.11
>Reporter: Anastasios Kichidis
>Assignee: Stavros Kontopoulos
>Priority: Critical
> Attachments: example_queries.cql, fix_13717
>
>
> When a column family is created and a Tuple is used on clustering column with 
> default clustering order DESC, then the INSERT statement fails. 
> For example, the following table will make the INSERT statement fail with 
> error message "Invalid tuple type literal for tdemo of type 
> frozen>" , although the INSERT statement is correct 
> (works as expected when the default order is ASC)
> {noformat}
> create table test_table (
>   id int,
>   tdemo tuple,
>   primary key (id, tdemo)
> ) with clustering order by (tdemo desc);
> {noformat}






[jira] [Commented] (CASSANDRA-11748) Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process

2017-08-10 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121506#comment-16121506
 ] 

Stefan Podkowinski commented on CASSANDRA-11748:



I'm not sure introducing a hard cap on pending outgoing pull requests and 
simply dropping anything beyond it is the way to go here. The good thing about 
the approach is that it's pretty much stateless, except for the atomic 
counter. But we should at least take the schema IDs and/or endpoints into 
account as well. It just doesn't make sense to queue 50 requests for the same 
schema ID and potentially drop requests for a different schema afterwards. Also, 
as already noted, issuing pulls in parallel is probably not what we want, as 
this could lead to the described OOM issue when too many responses get queued 
and applied at the same time. So I think we can't get around managing some more 
state, such as schema IDs, endpoints, last request time, delay, etc., that we can 
use to schedule pulls in a more efficient way, by doing one request after 
another. 
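
The scheduling idea above (at most one queued pull per schema ID, issued one 
after another) could be sketched roughly as follows. This is a minimal, 
hypothetical illustration and not Cassandra's MigrationManager: the class 
SchemaPullQueue, its PullRequest type, and the offer/next/completed methods 
are invented names, and endpoints are plain strings for brevity.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.UUID;

/**
 * Hypothetical sketch: keeps at most one pending pull per schema version
 * and hands requests out strictly one at a time.
 */
final class SchemaPullQueue {

    /** A queued pull: which schema version to fetch and from which endpoint. */
    static final class PullRequest {
        final UUID schemaVersion;
        final String endpoint;

        PullRequest(UUID schemaVersion, String endpoint) {
            this.schemaVersion = schemaVersion;
            this.endpoint = endpoint;
        }
    }

    // Versions that are queued or currently in flight.
    private final Set<UUID> pendingVersions = new HashSet<>();
    private final Queue<PullRequest> queue = new ArrayDeque<>();
    private boolean inFlight = false;

    /** Queues a pull; returns false if this schema version is already pending. */
    synchronized boolean offer(UUID schemaVersion, String endpoint) {
        if (!pendingVersions.add(schemaVersion)) {
            return false; // duplicate request for the same schema version
        }
        queue.add(new PullRequest(schemaVersion, endpoint));
        return true;
    }

    /** Returns the next pull to send, or null if one is in flight or none is queued. */
    synchronized PullRequest next() {
        if (inFlight) {
            return null;
        }
        PullRequest request = queue.poll();
        if (request != null) {
            inFlight = true;
        }
        return request;
    }

    /** Call once the response has been applied (or the request has failed). */
    synchronized void completed(PullRequest request) {
        pendingVersions.remove(request.schemaVersion);
        inFlight = false;
    }
}
```

A caller would invoke offer() whenever gossip reports a mismatching schema 
version, and call next() after each completed() to issue the following pull. 
Retry delays and last-request timestamps are left out of the sketch.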

But we should also not forget to look at the receiver side for incoming pull 
requests. Joining the cluster with a schema mismatch should not cause a node to 
answer each of those in parallel. If we keep track of pending incoming schema 
requests, we could introduce a delay before responding and create the schema 
mutations just once, as a payload to be used for all of them. We might have to 
bump up the MIGRATION_REQUEST timeout in that case, but otherwise just 
delaying a few seconds should make a notable difference for nodes joining the 
cluster and having to answer many migration requests in a short time frame.
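
The receiver-side idea (collect incoming requests during a short delay, then 
build the payload once and reuse it for every requester) might look something 
like the sketch below. Everything here is hypothetical: CoalescingResponder, 
onRequest and flush are invented names, the payload is a generic type standing 
in for the schema mutations, and the delay itself is assumed to be driven by 
the caller, for example with a ScheduledExecutorService.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: batches incoming requests and answers all of them
 * with a single payload that is built only once per batch.
 */
final class CoalescingResponder<T> {
    private final Supplier<T> payloadBuilder;
    private final List<String> waiting = new ArrayList<>();

    CoalescingResponder(Supplier<T> payloadBuilder) {
        this.payloadBuilder = payloadBuilder;
    }

    /** Records a requester instead of answering it immediately. */
    synchronized void onRequest(String requester) {
        waiting.add(requester);
    }

    /**
     * Intended to be called when the delay elapses. Builds the payload once
     * and returns it keyed by requester; returns an empty map (and builds
     * nothing) when no requests are waiting.
     */
    synchronized Map<String, T> flush() {
        Map<String, T> responses = new LinkedHashMap<>();
        if (waiting.isEmpty()) {
            return responses;
        }
        T payload = payloadBuilder.get(); // the expensive step, done once
        for (String requester : waiting) {
            responses.put(requester, payload);
        }
        waiting.clear();
        return responses;
    }
}
```

The trade-off is the one noted above: requesters wait up to the full delay 
for an answer, so the request timeout on the sending side has to allow for it.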

> Schema version mismatch may leads to Casandra OOM at bootstrap during a 
> rolling upgrade process
> ---
>
> Key: CASSANDRA-11748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11748
> Project: Cassandra
>  Issue Type: Bug
> Environment: Rolling upgrade process from 1.2.19 to 2.0.17. 
> CentOS 6.6
> Occurred in different C* node of different scale of deployment (2G ~ 5G)
>Reporter: Michael Fong
>Assignee: Matt Byrd
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We have observed multiple times that a multi-node C* (v2.0.17) cluster ran 
> into OOM at bootstrap during a rolling upgrade process from 1.2.19 to 2.0.17. 
> Here is a simple outline of our rolling upgrade process:
> 1. Update the schema on a node, and wait until all nodes are in schema version 
> agreement - via nodetool describecluster
> 2. Restart a Cassandra node
> 3. After the restart, there is a chance that the restarted node has a different 
> schema version.
> 4. All nodes in the cluster start to rapidly exchange schema information, and any 
> node could run into OOM. 
> The following is the system.log output from one of our 2-node cluster test 
> beds
> --
> Before rebooting node 2:
> Node 1: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,326 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> Node 2: DEBUG [MigrationStage:1] 2016-04-19 11:09:42,122 
> MigrationManager.java (line 328) Gossiping my schema version 
> 4cb463f8-5376-3baf-8e88-a5cc6a94f58f
> After rebooting node 2, 
> Node 2: DEBUG [main] 2016-04-19 11:18:18,016 MigrationManager.java (line 328) 
> Gossiping my schema version f5270873-ba1f-39c7-ab2e-a86db868b09b
> Node 2 keeps submitting the migration task, over 100 times, to the other 
> node.
> INFO [GossipStage:1] 2016-04-19 11:18:18,261 Gossiper.java (line 1011) Node 
> /192.168.88.33 has restarted, now UP
> INFO [GossipStage:1] 2016-04-19 11:18:18,262 TokenMetadata.java (line 414) 
> Updating topology for /192.168.88.33
> ...
> DEBUG [GossipStage:1] 2016-04-19 11:18:18,265 MigrationManager.java (line 
> 102) Submitting migration task for /192.168.88.33
> ... ( over 100+ times)
> --
> On the other hand, Node 1 keeps updating its gossip information, and then 
> receives and submits migration tasks: 
> INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 
> 978) InetAddress /192.168.88.34 is now UP
> ...
> DEBUG [MigrationStage:1] 2016-04-19 11:18:18,496 
> MigrationRequestVerbHandler.java (line 41) Received migration request from 
> /192.168.88.34.
> …… ( over 100+ times)
> DEBUG [OptionalTasks:1] 2016-04-19 11:19:18,337 MigrationManager.java (line 
> 127) submitting migration task for /192.168.88.34
> .  (over 50+ times)
> As a side note, we have over 200 column families defined in the Cassandra 
> database, which may be related to this amount of RPC traffic.
> P.S.2 The over requested schema migration task will eventually have 
> 

[jira] [Commented] (CASSANDRA-13717) INSERT statement fails when Tuple type is used as clustering column with default DESC order

2017-08-10 Thread Stavros Kontopoulos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121536#comment-16121536
 ] 

Stavros Kontopoulos commented on CASSANDRA-13717:
-

I added a test there in TupleTypeTest, updated the branch.

> INSERT statement fails when Tuple type is used as clustering column with 
> default DESC order
> ---
>
> Key: CASSANDRA-13717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13717
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.11
>Reporter: Anastasios Kichidis
>Assignee: Stavros Kontopoulos
>Priority: Critical
> Attachments: example_queries.cql, fix_13717
>
>
> When a column family is created and a Tuple is used on clustering column with 
> default clustering order DESC, then the INSERT statement fails. 
> For example, the following table will make the INSERT statement fail with 
> error message "Invalid tuple type literal for tdemo of type 
> frozen>" , although the INSERT statement is correct 
> (works as expected when the default order is ASC)
> {noformat}
> create table test_table (
>   id int,
>   tdemo tuple,
>   primary key (id, tdemo)
> ) with clustering order by (tdemo desc);
> {noformat}






[jira] [Comment Edited] (CASSANDRA-13717) INSERT statement fails when Tuple type is used as clustering column with default DESC order

2017-08-10 Thread Stavros Kontopoulos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121536#comment-16121536
 ] 

Stavros Kontopoulos edited comment on CASSANDRA-13717 at 8/10/17 12:29 PM:
---

[~jjirsa] I added a test there in TupleTypeTest, updated the branch. How can I 
update the patch? 
Should I cancel it and add a new one?


was (Author: skonto):
I added a test there in TupleTypeTest, updated the branch.

> INSERT statement fails when Tuple type is used as clustering column with 
> default DESC order
> ---
>
> Key: CASSANDRA-13717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13717
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.11
>Reporter: Anastasios Kichidis
>Assignee: Stavros Kontopoulos
>Priority: Critical
> Attachments: example_queries.cql, fix_13717
>
>
> When a column family is created and a Tuple is used on clustering column with 
> default clustering order DESC, then the INSERT statement fails. 
> For example, the following table will make the INSERT statement fail with 
> error message "Invalid tuple type literal for tdemo of type 
> frozen>" , although the INSERT statement is correct 
> (works as expected when the default order is ASC)
> {noformat}
> create table test_table (
>   id int,
>   tdemo tuple,
>   primary key (id, tdemo)
> ) with clustering order by (tdemo desc);
> {noformat}






[jira] [Resolved] (CASSANDRA-13162) Batchlog replay is throttled during bootstrap, creating conditions for incorrect query results on materialized views

2017-08-10 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta resolved CASSANDRA-13162.
-
Resolution: Fixed

> Batchlog replay is throttled during bootstrap, creating conditions for 
> incorrect query results on materialized views
> 
>
> Key: CASSANDRA-13162
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13162
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Wei Deng
>Assignee: Andrés de la Peña
>Priority: Critical
>  Labels: bootstrap, materializedviews
>
> I've tested this in a C* 3.0 cluster with a couple of Materialized Views 
> defined (one base table and two MVs on that base table). The data volume is 
> not very high per node (about 80GB of data per node total, and that 
> particular base table has about 25GB of data uncompressed with one MV taking 
> 18GB compressed and the other MV taking 3GB), and the cluster is using decent 
> hardware (EC2 C4.8XL with 18 cores + 60GB RAM + 18K IOPS RAID0 from two 3TB 
> gp2 EBS volumes). 
> This is originally a 9-node cluster. It appears that after adding 3 more 
> nodes to the DC, the system.batches table accumulated a lot of data on the 3 
> new nodes (each having around 20GB under system.batches directory), and in 
> the subsequent week the batchlog on the 3 new nodes got slowly replayed back 
> to the rest of the nodes in the cluster. The bottleneck seems to be the 
> throttling defined in this cassandra.yaml setting: 
> batchlog_replay_throttle_in_kb, which by default is set to 1MB/s.
> Given that it is taking almost a week (and still hasn't finished) for the 
> batchlog (from MV) to be replayed after the bootstrap finishes, it seems only 
> reasonable to unthrottle (or at least give it a much higher throttle rate) 
> during the initial bootstrap, and hence I'd consider this a bug in our 
> current MV implementation.
> Also, as far as I understand, the bootstrap logic won't wait for the 
> backlogged batchlog to be fully replayed before changing the new 
> bootstrapping node to "UN" state, and if the batchlog for the MVs is stuck in 
> this state for a long time, we will basically get wrong answers on the MVs 
> for that whole duration (until the batchlog is fully replayed to the cluster), 
> which adds even more criticality to this bug.
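
To get a feel for how the throttle bounds replay time, here is a rough 
back-of-the-envelope sketch. The 20GB backlog, the 1024 KB/s default and the 
12-node cluster size come from the description above; the class and method 
names are made up for illustration. The per-node figure additionally assumes, 
as I understand the cassandra.yaml comment for this setting, that the 
configured total is reduced proportionally to the number of nodes in the 
cluster. It ignores everything else that affects replay speed.

```java
/** Hypothetical helper for estimating batchlog replay time under a throttle. */
final class BatchlogReplayEstimate {

    /** Seconds needed to replay backlogBytes at throttleKbPerSec (1 KB = 1024 bytes). */
    static double replaySeconds(long backlogBytes, double throttleKbPerSec) {
        return backlogBytes / (throttleKbPerSec * 1024.0);
    }

    public static void main(String[] args) {
        long backlogBytes = 20L * 1024 * 1024 * 1024; // ~20GB per new node, per the report
        double totalThrottleKb = 1024.0;              // batchlog_replay_throttle_in_kb default
        int nodes = 12;                               // 9 original nodes + 3 added

        double atFullRate = replaySeconds(backlogBytes, totalThrottleKb);
        // Assumption: the effective per-node rate is the total divided by node count.
        double atPerNodeRate = replaySeconds(backlogBytes, totalThrottleKb / nodes);

        System.out.printf("At the full throttle: %.1f hours%n", atFullRate / 3600.0);
        System.out.printf("At 1/%d of the throttle: %.1f days%n", nodes, atPerNodeRate / 86400.0);
    }
}
```

At the full 1024 KB/s, 20GB works out to 20480 seconds, which is under six 
hours; it is the much smaller per-node share, if that assumption holds, that 
stretches the replay into days.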






[jira] [Commented] (CASSANDRA-13162) Batchlog replay is throttled during bootstrap, creating conditions for incorrect query results on materialized views

2017-08-10 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122833#comment-16122833
 ] 

Paulo Motta commented on CASSANDRA-13162:
-

Closing as this was superseded by CASSANDRA-13614 and CASSANDRA-13065.

> Batchlog replay is throttled during bootstrap, creating conditions for 
> incorrect query results on materialized views
> 
>
> Key: CASSANDRA-13162
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13162
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Wei Deng
>Assignee: Andrés de la Peña
>Priority: Critical
>  Labels: bootstrap, materializedviews
>
> I've tested this in a C* 3.0 cluster with a couple of Materialized Views 
> defined (one base table and two MVs on that base table). The data volume is 
> not very high per node (about 80GB of data per node total, and that 
> particular base table has about 25GB of data uncompressed with one MV taking 
> 18GB compressed and the other MV taking 3GB), and the cluster is using decent 
> hardware (EC2 C4.8XL with 18 cores + 60GB RAM + 18K IOPS RAID0 from two 3TB 
> gp2 EBS volumes). 
> This is originally a 9-node cluster. It appears that after adding 3 more 
> nodes to the DC, the system.batches table accumulated a lot of data on the 3 
> new nodes (each having around 20GB under system.batches directory), and in 
> the subsequent week the batchlog on the 3 new nodes got slowly replayed back 
> to the rest of the nodes in the cluster. The bottleneck seems to be the 
> throttling defined in this cassandra.yaml setting: 
> batchlog_replay_throttle_in_kb, which by default is set to 1MB/s.
> Given that it is taking almost a week (and still hasn't finished) for the 
> batchlog (from MV) to be replayed after the bootstrap finishes, it seems only 
> reasonable to unthrottle (or at least give it a much higher throttle rate) 
> during the initial bootstrap, and hence I'd consider this a bug in our 
> current MV implementation.
> Also, as far as I understand, the bootstrap logic won't wait for the 
> backlogged batchlog to be fully replayed before changing the new 
> bootstrapping node to "UN" state, and if the batchlog for the MVs is stuck in 
> this state for a long time, we will basically get wrong answers on the MVs 
> for that whole duration (until the batchlog is fully replayed to the cluster), 
> which adds even more criticality to this bug.


