[jira] [Created] (CASSANDRA-13766) Invalid page state caused by IllegalArgumentException from ByteBuffer.limit()

2017-08-15 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-13766:
--

 Summary: Invalid page state caused by IllegalArgumentException from ByteBuffer.limit()
 Key: CASSANDRA-13766
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13766
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jay Zhuang
Priority: Minor


Here is the exception:
{noformat}
ERROR [SharedPool-Worker-5] 2017-08-15 22:53:37,255 ErrorMessage.java:349 - Unexpected exception during request
java.lang.IllegalArgumentException: null
    at java.nio.Buffer.limit(Buffer.java:275) ~[na:1.8.0_121]
    at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:613) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:622) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.db.marshal.CompositeType.splitName(CompositeType.java:201) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.db.LegacyLayout.decodeClustering(LegacyLayout.java:326) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.service.pager.PagingState$RowMark.clustering(PagingState.java:242) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadCommand(SinglePartitionPager.java:73) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:68) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:315) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:351) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:227) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:494) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:471) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131) ~[apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.0.14.jar:3.0.14]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
    at io.netty.channel.ChannelHandlerInvokerUtil.invokeChannelReadNow(ChannelHandlerInvokerUtil.java:83) [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
    at io.netty.channel.DefaultChannelHandlerInvoker$7.run(DefaultChannelHandlerInvoker.java:159) [netty-all-4.1.0.CR6.jar:4.1.0.CR6]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.0.14.jar:3.0.14]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.14.jar:3.0.14]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
{noformat}
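
For readers unfamiliar with how java.nio.Buffer.limit() ends up throwing here, the following is a minimal sketch, not Cassandra's code. It assumes a "short length followed by that many bytes" layout, as the method name readBytesWithShortLength suggests; the class and method names below are hypothetical. Buffer.limit(int) throws IllegalArgumentException when the new limit is negative or larger than the capacity, so a length prefix that claims more bytes than the buffer holds is one plausible way to get the trace above.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not taken from the Cassandra source tree.
final class ShortLengthReadSketch {

    /** Reads an unsigned 16-bit length and returns a view of that many bytes, validating first. */
    static ByteBuffer readWithShortLength(ByteBuffer in) {
        int length = in.getShort() & 0xFFFF;
        // Checking up front replaces an opaque IllegalArgumentException with a clear message.
        if (length > in.remaining()) {
            throw new IllegalStateException(
                "declared length " + length + " exceeds remaining bytes " + in.remaining());
        }
        ByteBuffer copy = in.duplicate();
        copy.limit(copy.position() + length);
        in.position(in.position() + length);
        return copy;
    }

    /** Same read without the check: limit() rejects a value beyond the buffer's capacity. */
    static ByteBuffer readUnchecked(ByteBuffer in) {
        int length = in.getShort() & 0xFFFF;
        ByteBuffer copy = in.duplicate();
        copy.limit(copy.position() + length); // IllegalArgumentException if past capacity
        in.position(in.position() + length);
        return copy;
    }

    public static void main(String[] args) {
        // The prefix declares 10 payload bytes, but only 2 follow.
        byte[] malformed = {0, 10, 1, 2};
        try {
            readUnchecked(ByteBuffer.wrap(malformed));
        } catch (IllegalArgumentException e) {
            System.out.println("unchecked read failed: " + e);
        }
        try {
            readWithShortLength(ByteBuffer.wrap(malformed));
        } catch (IllegalStateException e) {
            System.out.println("checked read failed: " + e.getMessage());
        }
    }
}
```

Whether the paging state in this report was actually malformed in this way is not established by the trace; the sketch only shows the mechanism.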



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13744) Better bootstrap failure message when blocked by (potential) range movement

2017-08-15 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128239#comment-16128239
 ] 

mck commented on CASSANDRA-13744:
-

thanks [~jjirsa], committed.

> Better bootstrap failure message when blocked by (potential) range movement
> ---
>
> Key: CASSANDRA-13744
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13744
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Assignee: mck
>Priority: Trivial
> Fix For: 3.11.1, 4.0
>
>
> The UnsupportedOperationException thrown from 
> {{StorageService.joinTokenRing(..)}} when it detects that other nodes are 
> bootstrapping|leaving|moving offers no information about which nodes those 
> are.
> In a large cluster this might be neither obvious nor easy to discover; gossipinfo 
> can hold the information, but it takes a bit of effort to uncover. Even when it is 
> easily seen, it's helpful to have it confirmed.
> Attached is a patch that provides a more thorough exception message for the 
> failed bootstrap attempt.






[jira] [Updated] (CASSANDRA-13744) Better bootstrap failure message when blocked by (potential) range movement

2017-08-15 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-13744:

   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   4.0
   3.11.1
   Status: Resolved  (was: Patch Available)

> Better bootstrap failure message when blocked by (potential) range movement
> ---
>
> Key: CASSANDRA-13744
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13744
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Assignee: mck
>Priority: Trivial
> Fix For: 3.11.1, 4.0
>
>
> The UnsupportedOperationException thrown from 
> {{StorageService.joinTokenRing(..)}} when it detects that other nodes are 
> bootstrapping|leaving|moving offers no information about which nodes those 
> are.
> In a large cluster this might be neither obvious nor easy to discover; gossipinfo 
> can hold the information, but it takes a bit of effort to uncover. Even when it is 
> easily seen, it's helpful to have it confirmed.
> Attached is a patch that provides a more thorough exception message for the 
> failed bootstrap attempt.






[2/3] cassandra git commit: Better bootstrap failure message when blocked by (potential) range movement

2017-08-15 Thread mck
Better bootstrap failure message when blocked by (potential) range movement

 patch by Mick Semb Wever; reviewed by Jeff Jirsa  for CASSANDRA-13744


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2795d72b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2795d72b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2795d72b

Branch: refs/heads/trunk
Commit: 2795d72b46e493b87f74a4eb9c25520adff58f8c
Parents: db57cbd
Author: Mick Semb Wever 
Authored: Fri Aug 4 23:44:26 2017 +1000
Committer: mck 
Committed: Wed Aug 16 12:41:21 2017 +1000

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/service/StorageService.java | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2795d72b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5403812..4ede932 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.1
+ * Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
  * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
  * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
  * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2795d72b/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 29619c4..cbf69b4 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -899,7 +899,10 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
 tokenMetadata.getMovingEndpoints().size() > 0
 ))
 {
-throw new UnsupportedOperationException("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true");
+String bootstrapTokens = StringUtils.join(tokenMetadata.getBootstrapTokens().valueSet(), ',');
+String leavingTokens = StringUtils.join(tokenMetadata.getLeavingEndpoints(), ',');
+String movingTokens = StringUtils.join(tokenMetadata.getMovingEndpoints().stream().map(e -> e.right).toArray(), ',');
+throw new UnsupportedOperationException(String.format("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true. Nodes detected, bootstrapping: %s; leaving: %s; moving: %s;", bootstrapTokens, leavingTokens, movingTokens));
 }
 
 // get bootstrap tokens





[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-08-15 Thread mck
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22b2a82f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22b2a82f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22b2a82f

Branch: refs/heads/trunk
Commit: 22b2a82f76417f8cc2d7a16bfd41b05ff624e880
Parents: 256a74f 2795d72
Author: mck 
Authored: Wed Aug 16 13:03:28 2017 +1000
Committer: mck 
Committed: Wed Aug 16 13:05:50 2017 +1000

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/service/StorageService.java | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/22b2a82f/CHANGES.txt
--
diff --cc CHANGES.txt
index ee5b955,4ede932..2961a1d
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,120 -1,5 +1,121 @@@
 +4.0
 + * Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-13664)
 + * Use an ExecutorService for repair commands instead of new Thread(..).start() (CASSANDRA-13594)
 + * Fix race / ref leak in anticompaction (CASSANDRA-13688)
 + * Expose tasks queue length via JMX (CASSANDRA-12758)
 + * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
 + * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)
 + * Improve sstablemetadata output (CASSANDRA-11483)
 + * Support for migrating legacy users to roles has been dropped (CASSANDRA-13371)
 + * Introduce error metrics for repair (CASSANDRA-13387)
 + * Refactoring to primitive functional interfaces in AuthCache (CASSANDRA-13732)
 + * Update metrics to 3.1.5 (CASSANDRA-13648)
 + * batch_size_warn_threshold_in_kb can now be set at runtime (CASSANDRA-13699)
 + * Avoid always rebuilding secondary indexes at startup (CASSANDRA-13725)
 + * Upgrade JMH from 1.13 to 1.19 (CASSANDRA-13727)
 + * Upgrade SLF4J from 1.7.7 to 1.7.25 (CASSANDRA-12996)
 + * Default for start_native_transport now true if not set in config (CASSANDRA-13656)
 + * Don't add localhost to the graph when calculating where to stream from (CASSANDRA-13583)
 + * Allow skipping equality-restricted clustering columns in ORDER BY clause (CASSANDRA-10271)
 + * Use common nowInSec for validation compactions (CASSANDRA-13671)
 + * Improve handling of IR prepare failures (CASSANDRA-13672)
 + * Send IR coordinator messages synchronously (CASSANDRA-13673)
 + * Flush system.repair table before IR finalize promise (CASSANDRA-13660)
 + * Fix column filter creation for wildcard queries (CASSANDRA-13650)
 + * Add 'nodetool getbatchlogreplaythrottle' and 'nodetool setbatchlogreplaythrottle' (CASSANDRA-13614)
 + * fix race condition in PendingRepairManager (CASSANDRA-13659)
 + * Allow noop incremental repair state transitions (CASSANDRA-13658)
 + * Run repair with down replicas (CASSANDRA-10446)
 + * Added started & completed repair metrics (CASSANDRA-13598)
 + * Added started & completed repair metrics (CASSANDRA-13598)
 + * Improve secondary index (re)build failure and concurrency handling (CASSANDRA-10130)
 + * Improve calculation of available disk space for compaction (CASSANDRA-13068)
 + * Change the accessibility of RowCacheSerializer for third party row cache plugins (CASSANDRA-13579)
 + * Allow sub-range repairs for a preview of repaired data (CASSANDRA-13570)
 + * NPE in IR cleanup when columnfamily has no sstables (CASSANDRA-13585)
 + * Fix Randomness of stress values (CASSANDRA-12744)
 + * Allow selecting Map values and Set elements (CASSANDRA-7396)
 + * Fast and garbage-free Streaming Histogram (CASSANDRA-13444)
 + * Update repairTime for keyspaces on completion (CASSANDRA-13539)
 + * Add configurable upper bound for validation executor threads (CASSANDRA-13521)
 + * Bring back maxHintTTL propery (CASSANDRA-12982)
 + * Add testing guidelines (CASSANDRA-13497)
 + * Add more repair metrics (CASSANDRA-13531)
 + * RangeStreamer should be smarter when picking endpoints for streaming (CASSANDRA-4650)
 + * Avoid rewrapping an exception thrown for cache load functions (CASSANDRA-13367)
 + * Log time elapsed for each incremental repair phase (CASSANDRA-13498)
 + * Add multiple table operation support to cassandra-stress (CASSANDRA-8780)
 + * Fix incorrect cqlsh results when selecting same columns multiple times (CASSANDRA-13262)
 + * Fix WriteResponseHandlerTest is sensitive to test execution order (CASSANDRA-13421)
 + * Improve incremental repair logging (CASSANDRA-13468)
 + * Start compaction when incremental repair finishes (CASSANDRA-13454)
 + * Add repair streaming preview (CASSANDRA-13257)
 + * Cleanup isIncremental/repairedAt usage (CASSANDRA-13430)
 + * Change protocol to allow sending key space independent 

[1/3] cassandra git commit: Better bootstrap failure message when blocked by (potential) range movement

2017-08-15 Thread mck
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 db57cbddc -> 2795d72b4
  refs/heads/trunk 256a74faa -> 22b2a82f7


Better bootstrap failure message when blocked by (potential) range movement

 patch by Mick Semb Wever; reviewed by Jeff Jirsa  for CASSANDRA-13744


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2795d72b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2795d72b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2795d72b

Branch: refs/heads/cassandra-3.11
Commit: 2795d72b46e493b87f74a4eb9c25520adff58f8c
Parents: db57cbd
Author: Mick Semb Wever 
Authored: Fri Aug 4 23:44:26 2017 +1000
Committer: mck 
Committed: Wed Aug 16 12:41:21 2017 +1000

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/service/StorageService.java | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2795d72b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5403812..4ede932 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.1
+ * Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
  * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
  * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
  * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2795d72b/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index 29619c4..cbf69b4 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -899,7 +899,10 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
 tokenMetadata.getMovingEndpoints().size() > 0
 ))
 {
-throw new UnsupportedOperationException("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true");
+String bootstrapTokens = StringUtils.join(tokenMetadata.getBootstrapTokens().valueSet(), ',');
+String leavingTokens = StringUtils.join(tokenMetadata.getLeavingEndpoints(), ',');
+String movingTokens = StringUtils.join(tokenMetadata.getMovingEndpoints().stream().map(e -> e.right).toArray(), ',');
+throw new UnsupportedOperationException(String.format("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true. Nodes detected, bootstrapping: %s; leaving: %s; moving: %s;", bootstrapTokens, leavingTokens, movingTokens));
 }
 
 // get bootstrap tokens





[jira] [Created] (CASSANDRA-13765) List of steps needed to upgrade Cassandra 2.2.5 to 3.11.0 on ubuntu 14.04

2017-08-15 Thread R1J1 (JIRA)
R1J1 created CASSANDRA-13765:


 Summary: List of steps needed to upgrade Cassandra 2.2.5 to 3.11.0 on ubuntu 14.04
 Key: CASSANDRA-13765
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13765
 Project: Cassandra
  Issue Type: Wish
  Components: Configuration
 Environment: ubuntu 14.04
Reporter: R1J1


Please send me the steps to upgrade Cassandra 2.2.5 to 3.11.0 on Ubuntu 14.04.






[jira] [Comment Edited] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128163#comment-16128163
 ] 

Igor Zubchenok edited comment on CASSANDRA-10050 at 8/16/17 12:53 AM:
--

Is there any chance to have it resolved in coming releases?

From my point of view it looks like an -architecture design- implementation 
problem, because if indexes are sorted by token, why can't you just use 
binary search to find the start token?
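
To make the suggestion above concrete, here is a minimal sketch of a lower-bound binary search over a sorted array of tokens. It assumes index entries can be viewed as a token-sorted array of longs, which is an illustrative simplification and not a description of how Cassandra's secondary indexes are actually stored; the class and method names are hypothetical.

```java
// Hypothetical illustration of the proposal; not Cassandra's index code.
final class TokenLowerBoundSketch {

    /**
     * Returns the index of the first token strictly greater than min,
     * or sortedTokens.length if there is none. Runs in O(log n).
     */
    static int firstIndexAfter(long[] sortedTokens, long min) {
        int lo = 0;
        int hi = sortedTokens.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            if (sortedTokens[mid] <= min) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        long[] tokens = {-50, -10, 0, 7, 7, 42};
        // A scan for token > 7 could start at this index instead of index 0.
        System.out.println(firstIndexAfter(tokens, 7));
    }
}
```

With a start position found this way, the cost of a range query would no longer depend on how far into the token spectrum the range begins, which is the behaviour the comment is asking about.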


was (Author: geagle):
Is there any chance to have it resolved in coming releases?

From my point of view it looks like an architecture design problem, because if 
indexes are sorted by token, why can't you just use binary search to find 
the start token?

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Secondary Indexes
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Spitzer
> Fix For: 4.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see a huge amount of time when the C* server is not 
> doing any work and the query seems to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to pushdown relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the tokenRange being queried is 
> starting with only 2 seconds for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?
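
For context, the per-task queries described above can be sketched as follows. This is an illustration only: it assumes tokens span the full signed 64-bit range (as with Murmur3Partitioner) and splits that range evenly, which is not necessarily how the Spark Cassandra Connector chooses its splits. The class and method names are hypothetical; the query text is the one quoted in the report.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the connector's real split logic is not shown in this thread.
final class TokenRangeQuerySketch {

    /** Splits (min, max] into contiguous sub-ranges, using BigInteger to avoid long overflow. */
    static List<long[]> split(long min, long max, int parts) {
        BigInteger lo = BigInteger.valueOf(min);
        BigInteger width = BigInteger.valueOf(max).subtract(lo);
        BigInteger n = BigInteger.valueOf(parts);
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            long start = lo.add(width.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            long end = lo.add(width.multiply(BigInteger.valueOf(i + 1)).divide(n)).longValueExact();
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    /** Fills $min and $max from the report's query with concrete token bounds. */
    static String query(long min, long max) {
        return String.format(
            "SELECT count(*) FROM ks.tab WHERE token(\"store\") > %d AND token(\"store\") <= %d "
                + "AND color = 'red' AND size = 'P' ALLOW FILTERING;",
            min, max);
    }

    public static void main(String[] args) {
        // Each printed query would correspond to one Spark task in the scenario described.
        for (long[] range : split(Long.MIN_VALUE, Long.MAX_VALUE, 4)) {
            System.out.println(query(range[0], range[1]));
        }
    }
}
```

Timing each generated query in cqlsh, from the first range to the last, is one way to reproduce the linear slowdown the report describes.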






[jira] [Comment Edited] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128163#comment-16128163
 ] 

Igor Zubchenok edited comment on CASSANDRA-10050 at 8/16/17 12:53 AM:
--

Is there any chance to have it resolved in coming releases?

From my point of view it looks like an -architecture design- implementation 
problem, because if indexes are sorted by token, why can't you just use 
binary search to find the start token?


was (Author: geagle):
Is there any chance to have it resolved in coming releases?

From my point of view it looks like an -architecture design- implementation 
problem, because if indexes are sorted by token, why can't you just use 
binary search to find the start token?

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Secondary Indexes
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Spitzer
> Fix For: 4.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see a huge amount of time when the C* server is not 
> doing any work and the query seems to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to pushdown relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the tokenRange being queried is 
> starting with only 2 seconds for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?






[jira] [Commented] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128163#comment-16128163
 ] 

Igor Zubchenok commented on CASSANDRA-10050:


Is there any chance to have it resolved in coming releases?

From my point of view it looks like an architecture design problem, because if 
indexes are sorted by token, why can't you just use binary search to find 
the start token?

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Secondary Indexes
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Spitzer
> Fix For: 4.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see a huge amount of time when the C* server is not 
> doing any work and the query seems to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to pushdown relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the tokenRange being queried is 
> starting with only 2 seconds for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?






[jira] [Comment Edited] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128159#comment-16128159
 ] 

Igor Zubchenok edited comment on CASSANDRA-6246 at 8/16/17 12:45 AM:
-

I would like to try, but I'm not familiar with the Cassandra source code. :( Isn't 
it easier to implement the patch again, without rebasing from 4-year-old code?

BTW, I'm looking for a solution to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by a tombstone issue: when a counter gets 
back to zero, I can neither delete nor compact it.

My guess was that the lightweight Cassandra transactions can do a very good job 
for my task. I was so naive and now I have an issue with WriteTimeoutException 
and inconsistent state. 

The only workaround I came up with today is to take an exclusive lock, which can be 
easily made with LWT with TTL, and then change the value, but it will 
have a much greater performance hit. I'm still looking for a good solution 
to that with Cassandra.

Currently I'm naive again and expecting that EPaxos will help me, but it seems it 
will never-never be merged and released.

Dear community, do you have any idea?

P.S. Huge thanks and warm hugs to everyone who answers me!


was (Author: geagle):
I would like to try, but I'm not familiar with the Cassandra source code. :( Isn't 
it easier to implement the patch again, without rebasing from 4-year-old code?

BTW, I'm looking for a solution to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by a tombstone issue: when a counter gets 
back to zero, I can neither delete nor compact it.

My guess was that the lightweight Cassandra transactions can do a very good job 
for my task. I was so naive and now I have an issue with WriteTimeoutException 
and inconsistent state. 

The only workaround I came up with today is to take an exclusive lock, which can be 
easily made with LWT with TTL, and then change the value, but it will 
have a much greater performance hit. I'm still looking for a good solution 
to that with Cassandra.

Currently I'm naive again and expecting that EPaxos will help me, but it seems it 
will never-never be merged and released.

Dear community, do you have any idea?

Huge thanks to everyone who answers me.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Comment Edited] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128159#comment-16128159
 ] 

Igor Zubchenok edited comment on CASSANDRA-6246 at 8/16/17 12:45 AM:
-

I would like to try, but I'm not familiar with the Cassandra source code. :( Isn't 
it easier to implement the patch again, without rebasing from 4-year-old code?

BTW, I'm looking for a solution to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by a tombstone issue: when a counter gets 
back to zero, I can neither delete nor compact it.

My guess was that the lightweight Cassandra transactions can do a very good job 
for my task. I was so naive and now I have an issue with WriteTimeoutException 
and inconsistent state. 

The only workaround I came up with today is to do an exclusive lock that can be 
easily made with LWT with TTL, and subsequent change of a value, but it will 
have much more greater performance hit. I'm still looking for a good solution 
on that with Cassandra.
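
The lock-then-update workaround described above can be sketched roughly as follows. This is an in-memory simulation in plain Java, not Cassandra driver code: `tryAcquire` only stands in for a conditional insert such as `INSERT ... IF NOT EXISTS USING TTL`, and every class, method, and constant name here (`TtlLockTable`, `LockedRefCounter`, `LOCK_TTL_MILLIS`) is a hypothetical chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * In-memory stand-in for a lock table (hypothetical; not Cassandra driver code).
 * tryAcquire mimics INSERT ... IF NOT EXISTS USING TTL: it succeeds only when
 * no unexpired lock row exists for the key.
 */
class TtlLockTable {
    private static final class LockRow {
        final String owner;
        final long expiresAtMillis;

        LockRow(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, LockRow> rows = new HashMap<>();

    synchronized boolean tryAcquire(String key, String owner, long ttlMillis, long nowMillis) {
        LockRow current = rows.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // held by someone and not yet expired
        }
        rows.put(key, new LockRow(owner, nowMillis + ttlMillis));
        return true;
    }

    /** Mimics a conditional delete: only the live holder can release the lock. */
    synchronized boolean release(String key, String owner, long nowMillis) {
        LockRow current = rows.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis || !current.owner.equals(owner)) {
            return false;
        }
        rows.remove(key);
        return true;
    }
}

/** Reference counter whose read-modify-write step runs only while the lock is held. */
class LockedRefCounter {
    private static final long LOCK_TTL_MILLIS = 5_000; // arbitrary value for the sketch

    private final TtlLockTable locks = new TtlLockTable();
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    /** Returns the new count, or null when the lock is busy and the caller should retry. */
    Long add(String key, long delta, String owner, long nowMillis) {
        if (!locks.tryAcquire(key, owner, LOCK_TTL_MILLIS, nowMillis)) {
            return null;
        }
        try {
            long updated = counts.getOrDefault(key, 0L) + delta;
            if (updated == 0) {
                counts.remove(key); // drop the entry entirely when nothing references it
            } else {
                counts.put(key, updated);
            }
            return updated;
        } finally {
            locks.release(key, owner, nowMillis);
        }
    }
}
```

Each counter change in this sketch costs an acquire, an update, and a release, which is in line with the performance concern raised above; the TTL is only there so that a lock left behind by a crashed client eventually expires.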

Currently I'm being naive again and expecting that EPaxos will help me, but it 
seems it will never be merged and released.

Dear community, do you have any ideas?

P.S. Huge thanks and warm hugs to everyone who answers me!



> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Comment Edited] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128159#comment-16128159
 ] 

Igor Zubchenok edited comment on CASSANDRA-6246 at 8/16/17 12:44 AM:
-

I would like to try, but I'm not familiar with the Cassandra source code. :( 
Wouldn't it be easier to implement the patch again, rather than rebasing 
4-year-old code?

BTW, I'm looking for a way to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by the tombstone issue: when a counter gets back 
to zero, I can neither delete nor compact it.

My guess was that Cassandra's lightweight transactions (LWT) would do a very 
good job for my task. I was so naive, and now I have issues with 
WriteTimeoutException and inconsistent state. 

The only workaround I came up with today is to take an exclusive lock, which 
can easily be made with an LWT with a TTL, and then change the value, but it 
will have a much greater performance hit. I'm still looking for a good 
solution for that with Cassandra.

Currently I'm being naive again and expecting that EPaxos will help me, but it 
seems it will never be merged and released.

Dear community, do you have any ideas?

Huge thanks to everyone who answers me.



> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Comment Edited] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128159#comment-16128159
 ] 

Igor Zubchenok edited comment on CASSANDRA-6246 at 8/16/17 12:44 AM:
-

I would like to try, but I'm not familiar with the Cassandra source code. :( 
Wouldn't it be easier to implement the patch again, rather than rebasing 
4-year-old code?

BTW, I'm looking for a way to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by the tombstone issue: when a counter gets back 
to zero, I can neither delete nor compact it.

My guess was that Cassandra's lightweight transactions (LWT) would do a very 
good job for my task. I was so naive, and now I have issues with 
WriteTimeoutException and inconsistent state. 

The only workaround I came up with today is to take an exclusive lock, which 
can easily be made with an LWT with a TTL, and then change the value, but it 
will have a much greater performance hit. I'm still looking for a good 
solution for that with Cassandra.

Currently I'm being naive again and expecting that EPaxos will help me, but it 
seems it will never be merged and released.

Dear community, do you have any ideas?

Huge thanks to everyone who answers me.



> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Commented] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128159#comment-16128159
 ] 

Igor Zubchenok commented on CASSANDRA-6246:
---

I would like to try, but I'm not familiar with the Cassandra source code. :( 
Wouldn't it be easier to implement the patch again, rather than rebasing 
4-year-old code?

BTW, I'm looking for a way to implement a *reference counter based on 
Cassandra*. 

My first reference counter implementation was built on counter columns, but 
unfortunately it was ruined by the tombstone issue: when a counter gets back 
to zero, I can neither delete nor compact it.

My guess was that Cassandra's lightweight transactions (LWT) would do a very 
good job for my task. I was so naive, and now I have issues with 
WriteTimeoutException and inconsistent state. 

The only workaround I came up with today is to take an exclusive lock, which 
can easily be made with an LWT with a TTL, and then change the value, but it 
will have a much greater performance hit. I'm still looking for a good 
solution for that with Cassandra.

Currently I'm being naive again and expecting that EPaxos will help me, but it 
seems it will never be merged and released.

Dear community, do you have any ideas?

Huge thanks to everyone who answers me.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Updated] (CASSANDRA-13756) StreamingHistogram is not thread safe

2017-08-15 Thread xiangzhou xia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangzhou xia updated CASSANDRA-13756:
--
Description: 
When we tested C*3 in a shadow cluster, we noticed that after a period of time, 
several data nodes suddenly hit 100% CPU and stopped processing queries.

After investigation, we found that threads were stuck in sum() in the 
StreamingHistogram class. Those are JMX threads that expose the 
getTombStoneRatio metric (since JMX is kicked off every 3 seconds, there is a 
chance that multiple JMX threads access StreamingHistogram at the same time).

After further investigation, we found that the optimization in CASSANDRA-13038 
causes a spool flush every time we call sum(). Since TreeMap is not thread 
safe, threads get stuck when multiple threads call sum() at the same time.

There are two approaches to solving this issue. 

The first is to add a lock around the flush in sum(), which introduces some 
extra overhead in StreamingHistogram.

The second is to avoid having StreamingHistogram accessed by multiple threads. 
In our specific case, that means removing the metrics we added.  
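
The first approach (a lock around the flush in sum()) might look roughly like the sketch below. It uses a deliberately simplified class, not Cassandra's actual StreamingHistogram: the name `SpooledHistogram`, its fields, and the plain (non-interpolating) sum are assumptions made only to show where the lock would sit.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Simplified histogram with a spool that is flushed into a TreeMap on read.
 * Hypothetical sketch; the real class also merges and interpolates bins.
 */
class SpooledHistogram {
    private final TreeMap<Double, Long> bins = new TreeMap<>();
    private final TreeMap<Double, Long> spool = new TreeMap<>();

    /** Buffers a point in the spool; synchronized because it mutates a TreeMap. */
    synchronized void update(double point, long count) {
        spool.merge(point, count, Long::sum);
    }

    /** Moves spooled points into the bins. Callers must already hold the monitor. */
    private void flush() {
        for (Map.Entry<Double, Long> entry : spool.entrySet()) {
            bins.merge(entry.getKey(), entry.getValue(), Long::sum);
        }
        spool.clear();
    }

    /**
     * Total count of points less than or equal to b. The method is synchronized
     * so that two readers can never run the flush on the TreeMaps concurrently.
     */
    synchronized long sum(double b) {
        flush();
        long total = 0;
        for (long count : bins.headMap(b, true).values()) {
            total += count;
        }
        return total;
    }
}
```

The extra overhead mentioned above shows up here as monitor acquisition on every update() and sum() call, even when only one thread ever touches the histogram.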

  was:
The optimization in CASSANDRA-13038 causes a spool flush every time we call 
sum(). Since TreeMap is not thread safe, threads get stuck when multiple 
threads call sum() at the same time, and eventually 100% of the CPU is stuck in 
that function. 

I think this issue is not limited to sum(); update() and merge() have the same 
issue, since they all need to update the TreeMap. 

Adding a lock to bin solved this issue, but it also introduced extra overhead.


> StreamingHistogram is not thread safe
> -
>
> Key: CASSANDRA-13756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13756
> Project: Cassandra
>  Issue Type: Bug
>Reporter: xiangzhou xia
>
> When we tested C*3 in a shadow cluster, we noticed that after a period of 
> time, several data nodes suddenly hit 100% CPU and stopped processing queries.
> After investigation, we found that threads were stuck in sum() in the 
> StreamingHistogram class. Those are JMX threads that expose the 
> getTombStoneRatio metric (since JMX is kicked off every 3 seconds, there is 
> a chance that multiple JMX threads access StreamingHistogram at the same 
> time).  
> After further investigation, we found that the optimization in CASSANDRA-13038 
> causes a spool flush every time we call sum(). Since TreeMap is not 
> thread safe, threads get stuck when multiple threads call sum() at the 
> same time.
> There are two approaches to solving this issue. 
> The first is to add a lock around the flush in sum(), which introduces 
> some extra overhead in StreamingHistogram.
> The second is to avoid having StreamingHistogram accessed by multiple 
> threads. In our specific case, that means removing the metrics we added.  






[jira] [Updated] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables

2017-08-15 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-13758:
-
Labels: incremental_repair  (was: )

> Incremental repair sessions shouldn't be deleted if they still have sstables
> 
>
> Key: CASSANDRA-13758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>  Labels: incremental_repair
> Fix For: 4.0
>
>
> The incremental session cleanup doesn't verify that there are no remaining 
> sstables marked as part of the repair before deleting it. Deleting a 
> successful repair session which still has outstanding sstables will cause 
> those sstables to be demoted to unrepaired, creating an inconsistency.
> This typically wouldn't be an issue, since we'd expect the sstables to have 
> long since been promoted / demoted. However, I've seen a few ref leak issues 
> which can cause sstables to get stuck. Those have been fixed, but we should 
> still protect against that edge case to prevent inconsistencies caused by 
> future (or currently unknown) bugs.
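
The guard described above (check for remaining sstables before deleting a session) could be sketched along these lines. This is only an illustration under assumptions: `RepairSessionCleanup`, `SSTableInfo`, and `canDelete` are hypothetical names, and the sketch does not reflect how Cassandra actually tracks sstables or repair sessions.

```java
import java.util.Collection;
import java.util.UUID;

/** Hypothetical sketch of a pre-delete check for incremental repair sessions. */
class RepairSessionCleanup {

    /** Minimal stand-in for an sstable: records which session, if any, it is still pending for. */
    static final class SSTableInfo {
        final String name;
        final UUID pendingRepair; // null when the sstable belongs to no session

        SSTableInfo(String name, UUID pendingRepair) {
            this.name = name;
            this.pendingRepair = pendingRepair;
        }
    }

    /** A session is safe to delete only when no sstable is still marked as part of it. */
    static boolean canDelete(UUID sessionId, Collection<SSTableInfo> sstables) {
        for (SSTableInfo sstable : sstables) {
            if (sessionId.equals(sstable.pendingRepair)) {
                return false; // deleting now would demote this sstable to unrepaired
            }
        }
        return true;
    }
}
```

A cleanup task would call `canDelete` first and skip (or retry later) any session for which it returns false.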






[jira] [Commented] (CASSANDRA-6246) EPaxos

2017-08-15 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127782#comment-16127782
 ] 

sankalp kohli commented on CASSANDRA-6246:
--

What are you looking for with this patch? 
It would help if you could rebase this patch and see if someone can review it. 

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Commented] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127713#comment-16127713
 ] 

Igor Zubchenok commented on CASSANDRA-6246:
---

It is a pity that these lightweight transactions cannot be used at full 
strength due to the delay in merging this improvement. I refer to 
CASSANDRA-9328. I would give merging it the highest priority.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar

2017-08-15 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127547#comment-16127547
 ] 

Michael Kjellman commented on CASSANDRA-13741:
--

[~amita...@rediffmail.com] nope! don't need anything more from you. We've 
actually had a lot of stuff going on with LZ4 behind the scenes for a few 
months up to now. We need comprehensive performance and correctness testing 
here though as LZ4 is hugely important to C*... I'm working on that and will 
get to it as soon as possible.

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to Maven Central. 
> Please replace the current version (1.3.0) in mainline with the latest one 
> (1.4.0) from here: http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit






[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-15 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127540#comment-16127540
 ] 

Joel Knighton commented on CASSANDRA-13594:
---

No problem - fix committed as {{256a74faa31fcf25bdae753c563fa2c69f7f355c}}. 
Thanks!

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
> Attachments: 13594.png
>
>
> Currently, when starting a new repair, we create a new Thread and start it 
> immediately.
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> starting new repair commands if we are already running too many.
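
The two goals listed above (limit the number of threads, reject new commands when too many are running) map naturally onto a bounded `ThreadPoolExecutor`. The following is a minimal sketch of that idea only; `RepairCommandPool`, `trySubmit`, and the constructor parameters are hypothetical names and do not describe the committed patch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical bounded pool for repair commands. */
class RepairCommandPool {
    private final ThreadPoolExecutor executor;

    /**
     * @param maxThreads upper bound on concurrently running commands
     * @param maxQueued  how many commands may wait for a free thread (must be >= 1)
     */
    RepairCommandPool(int maxThreads, int maxQueued) {
        executor = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(maxQueued),
            new ThreadPoolExecutor.AbortPolicy()); // throw instead of silently dropping work
    }

    /** Returns true if the command was accepted, false if the pool is saturated. */
    boolean trySubmit(Runnable repairCommand) {
        try {
            executor.execute(repairCommand);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting new commands; already accepted ones still run to completion. */
    void shutdown() {
        executor.shutdown();
    }
}
```

With `maxThreads = 1` and `maxQueued = 1`, a first long-running command occupies the thread, a second waits in the queue, and a third is rejected, so the caller can report the rejection instead of spawning yet another thread.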






cassandra git commit: Add Config$RepairCommandPoolFullStrategy awareness to DatabaseDescriptorRefTest

2017-08-15 Thread jkni
Repository: cassandra
Updated Branches:
  refs/heads/trunk 99e5f7efc -> 256a74faa


Add Config$RepairCommandPoolFullStrategy awareness to DatabaseDescriptorRefTest

Patch by Joel Knighton; reviewed by Marcus Eriksson for CASSANDRA-13594


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/256a74fa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/256a74fa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/256a74fa

Branch: refs/heads/trunk
Commit: 256a74faa31fcf25bdae753c563fa2c69f7f355c
Parents: 99e5f7e
Author: Joel Knighton 
Authored: Tue Aug 15 11:24:40 2017 -0500
Committer: Joel Knighton 
Committed: Tue Aug 15 11:58:58 2017 -0500

--
 .../unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/256a74fa/test/unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java 
b/test/unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java
index b915854..b50a050 100644
--- a/test/unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java
+++ b/test/unit/org/apache/cassandra/config/DatabaseDescriptorRefTest.java
@@ -70,6 +70,7 @@ public class DatabaseDescriptorRefTest
 "org.apache.cassandra.config.Config$DiskOptimizationStrategy",
 "org.apache.cassandra.config.Config$InternodeCompression",
 "org.apache.cassandra.config.Config$MemtableAllocationType",
+"org.apache.cassandra.config.Config$RepairCommandPoolFullStrategy",
 "org.apache.cassandra.config.Config$UserFunctionTimeoutPolicy",
 "org.apache.cassandra.config.ParameterizedClass",
 "org.apache.cassandra.config.EncryptionOptions",





[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar

2017-08-15 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127531#comment-16127531
 ] 

Jeff Jirsa commented on CASSANDRA-13741:


Folks,

This is only going into 4.0, which is (at least) months away. Please be 
patient. It'll be reviewed when folks have bandwidth.



> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to Maven Central. 
> Please replace the current version (1.3.0) in mainline with the latest one 
> (1.4.0) from here: http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit






[jira] [Updated] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-08-15 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13576:

Reviewer: Alex Petrov

I'm fine with just not optimising rf = 1.

As my patch has spent more than 15h in the queue on builds.apache.org, let's 
commit that one once it has finished? Could you review, [~ifesdjeen]?

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Marcus Eriksson
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 16.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 32.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 

[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-15 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127490#comment-16127490
 ] 

Marcus Eriksson commented on CASSANDRA-13594:
-

+1, sorry about that, again...

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
> Attachments: 13594.png
>
>
> Currently, when starting a new repair, we create a new Thread and start it 
> immediately.
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> starting new repair commands if we are already running too many.






[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-15 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127479#comment-16127479
 ] 

Joel Knighton commented on CASSANDRA-13594:
---

This causes a test failure in {{DatabaseDescriptorRefTest}} because of the new 
Config class - I've pushed a fix 
[here|https://github.com/jkni/cassandra/commit/ec3e7a84e5bae4b6968ee39a39f331fe0f5dd036].







[jira] [Commented] (CASSANDRA-13741) Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar

2017-08-15 Thread Yangzheng Bai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127467#comment-16127467
 ] 

Yangzheng Bai commented on CASSANDRA-13741:
---

Jeff Jirsa and Michael Kjellman:

We have also been waiting for this new lz4-java release for over a year. Our 
testing shows the new lz4-java improves performance by more than 10% on aarch64.

Thanks,
Yangzheng

> Replace Cassandra's lz4-1.3.0.jar with lz4-java-1.4.0.jar
> -
>
> Key: CASSANDRA-13741
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13741
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Amitkumar Ghatwal
>Assignee: Michael Kjellman
> Fix For: 4.x
>
>
> Hi All,
> The latest lz4-java library has been released 
> (https://github.com/lz4/lz4-java/releases) and uploaded to Maven Central. 
> Please replace the current version (1.3.0) in mainline with the latest one 
> (1.4.0) from here - http://repo1.maven.org/maven2/org/lz4/lz4-java/1.4.0/
> Adding : [~ReiOdaira].
> Regards,
> Amit






[jira] [Commented] (CASSANDRA-13764) SelectTest.testMixedTTLOnColumnsWide is flaky

2017-08-15 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127438#comment-16127438
 ] 

Joel Knighton commented on CASSANDRA-13764:
---

This also affects {{SelectTest.testMixedTTLOnColumns}}.

> SelectTest.testMixedTTLOnColumnsWide is flaky
> -
>
> Key: CASSANDRA-13764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13764
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Joel Knighton
>Priority: Trivial
>
> {{org.apache.cassandra.cql3.validation.operations.SelectTest.testMixedTTLOnColumnsWide}}
>  is flaky. This is because it inserts rows and then asserts their contents 
> using {{ttl()}} in the select, but if the test is sufficiently slow, the 
> remaining ttl may change by the time the select is run. Anecdotally, 
> {{testSelectWithAlias}} in the same class uses a fudge factor of 1 second 
> that would fix all the failures I've seen, but it might make more sense to 
> measure the elapsed time in the test and calculate the acceptable variation 
> from that time.
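
A minimal sketch of the second option in the description above (measure the elapsed time and derive the acceptable variation from it), not the actual test code. The helper class `TtlAssertions` and its method `ttlWithinElapsed` are hypothetical. It assumes TTLs are reported in whole seconds, so one extra second is allowed on top of the measured elapsed time to cover rounding.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of SelectTest or any Cassandra test utility.
public final class TtlAssertions
{
    private TtlAssertions() {}

    /**
     * Checks that the TTL returned by the select is consistent with the TTL
     * used at insert time, given how long the test has been running.
     *
     * @param insertedTtl TTL in seconds supplied at insert time
     * @param observedTtl value returned by ttl() in the select
     * @param insertNanos System.nanoTime() captured just before the insert
     * @param nowNanos    System.nanoTime() captured just after the select
     */
    public static boolean ttlWithinElapsed(int insertedTtl, int observedTtl, long insertNanos, long nowNanos)
    {
        // Whole seconds elapsed, plus one second because the TTL has
        // second granularity and may tick over right after the insert.
        long elapsedSeconds = TimeUnit.NANOSECONDS.toSeconds(nowNanos - insertNanos) + 1;
        long lowestAcceptable = Math.max(0, insertedTtl - elapsedSeconds);
        return observedTtl >= lowestAcceptable && observedTtl <= insertedTtl;
    }
}
```

A test would capture `System.nanoTime()` before the insert and after the select, then assert `ttlWithinElapsed(...)` instead of comparing against a fixed value.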






[jira] [Commented] (CASSANDRA-6246) EPaxos

2017-08-15 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127426#comment-16127426
 ] 

Joshua McKenzie commented on CASSANDRA-6246:


[~geagle]: given that this a) needs a rebase, and b) is a [massive 
patch|https://github.com/apache/cassandra/compare/trunk...bdeggleston:CASSANDRA-6246-trunk]
 that has yet to be reviewed, I'd expect there's going to be a substantial 
delay for this to be ready for merge. Not to put words in Blake's mouth, but 
I'd assume a post 4.0 world.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Commented] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-08-15 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127094#comment-16127094
 ] 

Alex Petrov commented on CASSANDRA-13576:
-

[~krummas] I can't recall the details anymore, iirc someone mentioned it 
should've been fixed in the scope of some bigger issue, so I retracted my 
changes, although it seems that it was never committed.

I've pushed the changes 
[here|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13576-trunk]
 again, and I can say they're more or less identical to yours; I've just 
extracted {{AbstractReplicationStrategy}} into a separate variable to avoid 
looking it up several times.

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Marcus Eriksson
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to 

[jira] [Commented] (CASSANDRA-13761) truncatehints cant't delete all hints

2017-08-15 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127052#comment-16127052
 ] 

Aleksey Yeschenko commented on CASSANDRA-13761:
---

It's not impossible. If there is an open writer at the time you issue truncate, 
it won't be deleted - because by then the descriptor hasn't been added to the 
store yet.

> truncatehints  cant't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Minor
>
> step1:
> Execute nodetool truncatehints on node A; nothing is logged. When the down 
> node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> step2:
> I set max_hints_file_size_in_mb=1 in cassandra.yaml and insert data into the 
> cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> truncatehints can't delete all hints; it leaves one hints file undeleted.






[jira] [Commented] (CASSANDRA-13622) Better config validation/documentation

2017-08-15 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126960#comment-16126960
 ] 

ZhaoYang commented on CASSANDRA-13622:
--

| [trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13622-trunk] |
| [3.11|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13622-3.11] |
| [3.0|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13622-3.0] |

Thanks for reviewing

> Better config validation/documentation
> --
>
> Key: CASSANDRA-13622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13622
> Project: Cassandra
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Kurt Greaves
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> There are a number of properties in the yaml that are "in_mb" but are 
> converted to bytes in {{DatabaseDescriptor.java}} and stored in ints. This 
> means that their maximum value is 2047, as any higher value overflows the int 
> when converted to bytes.
> Where possible/reasonable we should convert these to longs, and store them as 
> longs. If there is no reason for the value to ever be >2047 we should at 
> least document that as the max value, or better yet raise an error if it is set 
> higher than that. Note that although it's bad practice to increase a lot of 
> them to such high values, there may be cases where it is necessary, in 
> which case we should handle it appropriately rather than overflowing and 
> surprising the user. That is, causing it to break but not in the way the user 
> expected it to :)
> The following functions could currently be at risk of the above:
> {code:java|title=DatabaseDescriptor.java}
> getThriftFramedTransportSize()
> getMaxValueSize()
> getCompactionLargePartitionWarningThreshold()
> getCommitLogSegmentSize()
> getNativeTransportMaxFrameSize()
> # These are in KB so max value of 2096128
> getBatchSizeWarnThreshold()
> getColumnIndexSize()
> getColumnIndexCacheSize()
> getMaxMutationSize()
> {code}
> Note we may not actually need to fix all of these, and there may be more. 
> This was just from a rough scan over the code.
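
A minimal sketch of the overflow described above and the two remedies the description proposes (widen to long, or reject values above the maximum). The class `MbToBytes` and its methods are hypothetical and are not taken from `DatabaseDescriptor`.

```java
// Hypothetical helper illustrating the overflow; not DatabaseDescriptor code.
public final class MbToBytes
{
    private static final int BYTES_PER_MB = 1024 * 1024;

    // Largest MB value whose byte count still fits in an int: 2047.
    public static final int MAX_MB_AS_INT_BYTES = Integer.MAX_VALUE / BYTES_PER_MB;

    private MbToBytes() {}

    /** Plain int arithmetic: silently wraps around for values above 2047. */
    public static int toBytesUnchecked(int mb)
    {
        return mb * BYTES_PER_MB;
    }

    /** Remedy 1: widen to long before multiplying so nothing overflows. */
    public static long toBytesLong(int mb)
    {
        return (long) mb * BYTES_PER_MB;
    }

    /** Remedy 2: keep the int result but fail fast on out-of-range values. */
    public static int toBytesChecked(int mb)
    {
        if (mb < 0 || mb > MAX_MB_AS_INT_BYTES)
            throw new IllegalArgumentException("value must be between 0 and "
                                               + MAX_MB_AS_INT_BYTES + " MB, got " + mb);
        return mb * BYTES_PER_MB;
    }
}
```

Here `toBytesUnchecked(2048)` wraps to `Integer.MIN_VALUE`, while `toBytesLong(2048)` returns 2147483648 and `toBytesChecked(2048)` throws, which surfaces the misconfiguration at startup rather than later as a negative size.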


