[jira] [Commented] (CASSANDRA-12245) initial view build can be parallel

2017-08-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126791#comment-16126791
 ] 

Paulo Motta commented on CASSANDRA-12245:
-

Thanks for the patch and sorry for the delay! I had an initial look at the patch; 
overall it looks good. I have the following comments/questions/remarks:

bq. The newly created ViewBuilderController reads the local token ranges, 
splits them to satisfy a concurrency factor, and runs a ViewBuilder for each of 
them.

It would be nice to reuse the {{Splitter}} methods here if possible, so we can 
reuse their tests; if that's not straightforward, maybe put the new methods on 
{{Splitter}} and add some tests to make sure they work correctly.
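
To illustrate the kind of splitting I mean, here is a rough standalone sketch (the 
names and the simplified long token space are hypothetical, not the actual 
{{Splitter}} API): evenly subdivide each local range until at least the requested 
number of parts exists, never merging ranges.

{code:java}
import java.util.ArrayList;
import java.util.List;

public final class RangeSplitSketch
{
    /** A local token range (left, right] over a simplified long token space. */
    static final class Range
    {
        final long left, right;
        Range(long left, long right) { this.left = left; this.right = right; }
        long size() { return right - left; }
    }

    /**
     * Split the local ranges into at least {@code parts} sub-ranges by evenly
     * dividing each range; ranges are never merged, so with 256 vnodes and a
     * smaller concurrency factor every vnode range stays a separate task.
     */
    static List<Range> split(List<Range> localRanges, int parts)
    {
        // how many pieces each local range must contribute, rounded up
        int perRange = (int) Math.ceil((double) parts / localRanges.size());
        List<Range> result = new ArrayList<>();
        for (Range range : localRanges)
        {
            long step = Math.max(1, range.size() / perRange);
            long left = range.left;
            while (left < range.right)
            {
                long right = Math.min(left + step, range.right);
                result.add(new Range(left, right));
                left = right;
            }
        }
        return result;
    }
}
{code}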

bq.  When ViewBuilderController receives the finalization signal of the last 
ViewBuilder, it double-checks if there are new local ranges that weren't 
considered at the beginning of the build. If there are new ranges, new 
{{ViewBuilder}}s are created for them.

This will not work if the range movement which created the new local range 
finishes after the view has finished building. This problem exists currently 
and is unrelated to the view build process itself, but more related to the 
range movement completion which should ensure the views are properly built 
before the operation finishes, so I created CASSANDRA-13762 to handle this 
properly.

bq. Given that we have a ViewBuilder per local range, the key of the table 
system.views_builds_in_progress is modified to include the bounds of the token 
range. So, we will have an entry in the table per each ViewBuilder. The number 
of covered keys per range is also recorded in the table.

Can probably remove the generation field from the builds in progress table and 
[remove this 
comment|https://github.com/adelapena/cassandra/blob/94f3d0d02bb5f849e4d54857d0b33531a5650643/src/java/org/apache/cassandra/db/view/ViewBuilder.java#L124]

bq. I have updated the patch to use a new separate table, 
system.views_builds_in_progress_v2

{{views_builds_in_progress_v2}} sounds a bit hacky, so perhaps we should call 
it {{system.view_builds_in_progress}} (remove the s) and also add a NOTICE 
entry informing users that the previous table was replaced and its data files can 
be removed.

bq. The downside is that pending view builds will be restarted during an 
upgrade to 4.x, which seems reasonable to me.

Sounds reasonable to me too.

bq. ViewBuilder and ViewBuilderController are probably not the best names. 
Maybe we could rename ViewBuilder to something like ViewBuilderTask or 
ViewBuilderForRange, and rename ViewBuilderController to ViewBuilder.

{{ViewBuilder}} and {{ViewBuilderTask}} LGTM

bq. The concurrency factor is based on conf.concurrent_compactors because the 
views are built on the CompactionManager, but we may be interested in a 
different value.

I'm a bit concerned about starving the compaction executor for a long period 
during the view build of large base tables, so we should probably have another 
option like {{concurrent_view_builders}} with a conservative default, and perhaps 
control the concurrency at the {{ViewBuilderController}}. WDYT?
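
As a rough sketch of what I mean by controlling the concurrency at the controller 
level (the {{concurrent_view_builders}} option and every name below are 
hypothetical, not code from the patch): size a dedicated executor from the new 
option so view builds can never occupy every compaction thread.

{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class ViewBuildConcurrencySketch
{
    // hypothetical setting; a conservative default keeps compaction threads free
    static final int CONCURRENT_VIEW_BUILDERS =
        Integer.getInteger("cassandra.concurrent_view_builders", 1);

    /** Submit one task per sub-range, but run at most the configured number at once. */
    static void buildViews(List<Runnable> perRangeBuildTasks) throws InterruptedException
    {
        ExecutorService executor = Executors.newFixedThreadPool(CONCURRENT_VIEW_BUILDERS);
        try
        {
            for (Runnable task : perRangeBuildTasks)
                executor.submit(task);
        }
        finally
        {
            executor.shutdown();
            executor.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        }
    }
}
{code}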

bq. The patch tries to evenly split the token ranges into the minimum number of 
parts needed to satisfy the concurrency factor, and it never merges ranges. So, with 
the default 256 virtual nodes (and a lower concurrency factor) we create 256 
build tasks. We might be interested in a different planning. If we want the 
number of tasks to be lower than the number of local ranges, we should modify 
the ViewBuilder task to be responsible for several ranges, although that will 
complicate the status tracking.

I think this is good to start with; we can improve the planning later if 
necessary. I don't think there is much gain from merging ranges to end up with 
fewer tasks.

bq. Probably there is a better way of implementing 
ViewBuilder.getCompactionInfo. The patch uses 
keysBuilt/ColumnFamilyStore.estimatedKeysForRange to estimate the completion, 
which could lead to a task completion status over 100%, depending on the 
estimation.

How about using {{prevToken.size(range.right)}} (introduced by CASSANDRA-7032)? 
It will not be available for BytesToken (used by ByteOrderedPartitioner), but that 
partitioner is rarely used, so we could maybe fall back to the current imprecise 
calculation in that case.
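
Something along these lines, as a hedged sketch (the helper names are made up; it 
only assumes a token-span function in the spirit of the {{Token.size()}} added by 
CASSANDRA-7032), clamping so estimation error can never report more than 100%:

{code:java}
public final class ViewBuildProgressSketch
{
    /** Fraction of the ring between two tokens; stand-in for the Token.size() from CASSANDRA-7032. */
    interface TokenSpan
    {
        double size(long left, long right);
    }

    /**
     * Progress of a per-range build: span already covered (range.left .. lastProcessedToken)
     * divided by the whole range span, clamped at 1.0.
     */
    static double completionRatio(TokenSpan span, long rangeLeft, long rangeRight, long lastProcessedToken)
    {
        double total = span.size(rangeLeft, rangeRight);
        if (total <= 0)
            return 1.0; // nothing to build in an empty range
        double covered = span.size(rangeLeft, lastProcessedToken);
        return Math.min(1.0, covered / total);
    }
}
{code}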

Other comments:

* Avoid submitting a view builder when the view is already built, instead of checking 
on the ViewBuilder 
([here|https://github.com/adelapena/cassandra/blob/94f3d0d02bb5f849e4d54857d0b33531a5650643/src/java/org/apache/cassandra/db/view/ViewBuilder.java#L109])
* ViewBuilder seems to be reimplementing some of the logic of 
{{PartitionRangeReadCommand}}, so I wonder if we should take this chance to 
simplify and use that instead of manually constructing the commands via 
ReducingKeyIterator and multiple {{SinglePartitionReadCommands}}? We can 
totally do this in another ticket if 

[jira] [Commented] (CASSANDRA-13761) truncatehints can't delete all hints

2017-08-14 Thread huyx (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126775#comment-16126775
 ] 

huyx commented on CASSANDRA-13761:
--

But now I tested again: stop daemon B, write data, stop writing data, run nodetool 
flush on A, run nodetool truncatehints on A, then restart B.

A prints:
INFO  [RMI TCP Connection(46)-10.71.0.12] 2017-08-15 11:36:59,368 
HintsStore.java:126 - Deleted hint file 
4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502768072934-1.hints
INFO  [RMI TCP Connection(46)-10.71.0.12] 2017-08-15 11:36:59,369 
HintsStore.java:126 - Deleted hint file 
4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502768092954-1.hints

INFO  [HintsDispatcher:9] 2017-08-15 11:39:11,543 HintsStore.java:126 - Deleted 
hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502768122944-1.hints
INFO  [HintsDispatcher:9] 2017-08-15 11:39:11,543 
HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502768122944-1.hints to endpoint 
/10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb


> truncatehints can't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Minor
>
> step 1:
> Execute nodetool truncatehints on node A; it does not print any log. When the 
> down node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> step 2:
> I changed max_hints_file_size_in_mb to 1 in cassandra.yaml and inserted data into 
> the cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> truncatehints can't delete all hints; it leaves one hint file undeleted.






[jira] [Commented] (CASSANDRA-6246) EPaxos

2017-08-14 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126756#comment-16126756
 ] 

Igor Zubchenok commented on CASSANDRA-6246:
---

Any update on when this might be released?

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>  Labels: messaging-service-bump-required
> Fix For: 4.x
>
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that requires (1) less messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.






[jira] [Comment Edited] (CASSANDRA-9328) WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms

2017-08-14 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126736#comment-16126736
 ] 

Igor Zubchenok edited comment on CASSANDRA-9328 at 8/15/17 2:27 AM:


I'm completely discouraged. Is there any workaround for this?
If LWT and CAS cannot actually be used for non-idempotent operations, what is a 
real use case for the current implementation?



was (Author: geagle):
I completely discouraged. Is there any workaround on this?
If CAS can not be actually used on non-idempotent operations, what is a real 
use of current LWT implementation?


> WriteTimeoutException thrown when LWT concurrency > 1, despite the query 
> duration taking MUCH less than cas_contention_timeout_in_ms
> 
>
> Key: CASSANDRA-9328
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9328
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Aaron Whiteside
> Attachments: CassandraLWTTest2.java, CassandraLWTTest.java
>
>
> WriteTimeoutException thrown when LWT concurrency > 1, despite the query 
> duration taking MUCH less than cas_contention_timeout_in_ms.
> Unit test attached, run against a 3 node cluster running 2.1.5.
> If you reduce the threadCount to 1, you never see a WriteTimeoutException. If 
> the WTE is due to not being able to communicate with other nodes, why does 
> the concurrency >1 cause inter-node communication to fail?






[jira] [Commented] (CASSANDRA-9328) WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms

2017-08-14 Thread Igor Zubchenok (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126736#comment-16126736
 ] 

Igor Zubchenok commented on CASSANDRA-9328:
---

I'm completely discouraged. Is there any workaround for this?
If CAS cannot actually be used for non-idempotent operations, what is a real 
use of the current LWT implementation?


> WriteTimeoutException thrown when LWT concurrency > 1, despite the query 
> duration taking MUCH less than cas_contention_timeout_in_ms
> 
>
> Key: CASSANDRA-9328
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9328
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Aaron Whiteside
> Attachments: CassandraLWTTest2.java, CassandraLWTTest.java
>
>
> WriteTimeoutException thrown when LWT concurrency > 1, despite the query 
> duration taking MUCH less than cas_contention_timeout_in_ms.
> Unit test attached, run against a 3 node cluster running 2.1.5.
> If you reduce the threadCount to 1, you never see a WriteTimeoutException. If 
> the WTE is due to not being able to communicate with other nodes, why does 
> the concurrency >1 cause inter-node communication to fail?






[jira] [Created] (CASSANDRA-13764) SelectTest.testMixedTTLOnColumnsWide is flaky

2017-08-14 Thread Joel Knighton (JIRA)
Joel Knighton created CASSANDRA-13764:
-

 Summary: SelectTest.testMixedTTLOnColumnsWide is flaky
 Key: CASSANDRA-13764
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13764
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Joel Knighton
Priority: Trivial


{{org.apache.cassandra.cql3.validation.operations.SelectTest.testMixedTTLOnColumnsWide}}
 is flaky. This is because it inserts rows and then asserts their contents 
using {{ttl()}} in the select, but if the test is sufficiently slow, the 
remaining ttl may change by the time the select is run. Anecdotally, 
{{testSelectWithAlias}} in the same class uses a fudge factor of 1 second that 
would fix all the failures I've seen, but it might make more sense to measure 
the elapsed time in the test and calculate the acceptable variation from that 
time.
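
A minimal sketch of the elapsed-time idea (a plain JUnit-style helper with made-up 
names, not the actual {{SelectTest}} code): record when the rows were inserted and 
derive the acceptable {{ttl()}} window from the wall-clock time that has passed by 
the time the SELECT runs.

{code:java}
import static org.junit.Assert.assertTrue;

public final class TtlAssertionSketch
{
    /**
     * Assert an observed ttl() value given the original TTL and the time elapsed
     * since the insert, instead of a fixed 1-second fudge factor.
     */
    static void assertTtlWithinElapsed(int observedTtl, int insertedTtlSeconds, long insertNanos)
    {
        long elapsedSeconds = (System.nanoTime() - insertNanos) / 1_000_000_000L;
        int minExpected = (int) (insertedTtlSeconds - elapsedSeconds - 1); // -1 for rounding
        assertTrue("ttl " + observedTtl + " fell outside [" + minExpected + ", " + insertedTtlSeconds + "]",
                   observedTtl >= minExpected && observedTtl <= insertedTtlSeconds);
    }
}
{code}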






[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-14 Thread Xiaolong Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126694#comment-16126694
 ] 

Xiaolong Jiang commented on CASSANDRA-10726:


[~krummas] this is the patch to fix the CME:

https://github.com/krummas/cassandra/pull/4

Can you please take a look? If it looks good, can you merge it to your repo and 
retrigger the dtests?

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






[jira] [Assigned] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-13745:
--

Assignee: Ihar Kukharchuk

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Assignee: Ihar Kukharchuk
>Priority: Minor
>  Labels: lhf
> Attachments: 13745-trunk.txt
>
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-14 Thread Xiaolong Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126376#comment-16126376
 ] 

Xiaolong Jiang commented on CASSANDRA-10726:


[~krummas] After digging into this, I think the problem is that there are two paths 
that do the read repair. The first path is the one I fixed, which is directly part 
of the read call. The other path is the background AsyncRepairRunner, which uses 
the same DataResolver to resolve the conflict. Since they are using the same 
DataResolver, they are sharing the same repairResponseRequestMap object. So the 
background AsyncRepairRunner is also modifying this map, causing the iteration over 
the repairResponseRequestMap key set to hit a ConcurrentModificationException. Let 
me see how I can fix this. 
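
For context, a generic sketch of the failure mode (not Cassandra code): iterating 
the live key set of a plain {{HashMap}} shared with a background thread can throw 
{{ConcurrentModificationException}}, while a {{ConcurrentHashMap}} (or simply not 
sharing the map between the two paths) avoids it.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class SharedMapCmeSketch
{
    // a plain HashMap shared between the read path and a background repair runner
    static final Map<Integer, String> shared = new HashMap<>();

    /** Iterating the live keySet() of a HashMap typically fails fast with
     *  ConcurrentModificationException if another thread mutates it mid-iteration. */
    static void iterateUnsafely()
    {
        for (Integer key : shared.keySet())
            process(key);
    }

    // a thread-safe map whose iterators are weakly consistent and never throw CME
    static final Map<Integer, String> sharedSafe = new ConcurrentHashMap<>();

    static void iterateSafely()
    {
        for (Integer key : sharedSafe.keySet())
            process(key);
    }

    static void process(Integer key) { /* resolve/repair responses for this key */ }
}
{code}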

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






cassandra-dtest git commit: Fix jolokia for mixed version clusters

2017-08-14 Thread jjirsa
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 013efa11f -> b8842b979


Fix jolokia for mixed version clusters

Patch by Jeff Jirsa; Reviewed by Aleksey Yeshchenko


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/b8842b97
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/b8842b97
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/b8842b97

Branch: refs/heads/master
Commit: b8842b979244547dd43d48bbaeadf1cea34a9fef
Parents: 013efa1
Author: Jeff Jirsa 
Authored: Mon Aug 14 12:55:17 2017 -0700
Committer: Jeff Jirsa 
Committed: Mon Aug 14 12:57:47 2017 -0700

--
 tools/jmxutils.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/b8842b97/tools/jmxutils.py
--
diff --git a/tools/jmxutils.py b/tools/jmxutils.py
index 1f41626..8c20eb8 100644
--- a/tools/jmxutils.py
+++ b/tools/jmxutils.py
@@ -158,7 +158,7 @@ def remove_perf_disable_shared_mem(node):
 option (see https://github.com/rhuss/jolokia/issues/198 for details).  This
 edits cassandra-env.sh (or the Windows equivalent), or jvm.options file on 
3.2+ to remove that option.
 """
-if node.cluster.version() >= LooseVersion('3.2'):
+if node.get_cassandra_version() >= LooseVersion('3.2'):
 conf_file = os.path.join(node.get_conf_dir(), JVM_OPTIONS)
 pattern = '\-XX:\+PerfDisableSharedMem'
 replacement = '#-XX:+PerfDisableSharedMem'





[jira] [Commented] (CASSANDRA-13387) Metrics for repair

2017-08-14 Thread Simon Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125942#comment-16125942
 ] 

Simon Zhou commented on CASSANDRA-13387:


Stefan, thank you so much for the code review!

> Metrics for repair
> --
>
> Key: CASSANDRA-13387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13387
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Simon Zhou
>Assignee: Simon Zhou
>Priority: Minor
> Fix For: 4.0
>
>
> We're missing metrics for repair, especially for errors. From what I observed 
> now, the exception will be caught by UncaughtExceptionHandler set in 
> CassandraDaemon and is categorized as StorageMetrics.exceptions. This is one 
> example:
> {code}
> ERROR [AntiEntropyStage:1] 2017-03-27 18:17:08,385 CassandraDaemon.java:207 - 
> Exception in thread Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: Parent repair session with id = 
> 8c85d260-1319-11e7-82a2-25090a89015f has failed.
> at 
> org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:392)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}






[jira] [Issue Comment Deleted] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ihar Kukharchuk updated CASSANDRA-13745:

Comment: was deleted

(was: Hi,
sorry to bother you, but do I need extra permissions to assign the issue to myself 
[I don't have access to the assign button right now]?
I've already submitted a kind of patch, but I still have several questions 
because I'm new here and may be missing the spirit of C* development:
1. Does C* really need this functionality? [The question exists because the provided 
solution changes the schema of the system.compaction_history table, and there are 
several cases where it is not appropriate to do that.]
2. I assume that "compaction_started_at" may not be a really good name, so 
should I think about a more descriptive one? Or maybe someone has a better name 
for that column off the top of their head.
3. The current solution shows the "unix start time" instead of the compaction start 
time when it is not present [via nodetool compactionhistory]. Are you ok with that 
approach, or would it be better to output a special symbol, for example "-"?
4. As far as I investigated, C* performs a "force" update for system tables, so I 
should not need to think about any special upgrade utility when providing such a 
patch, am I right?
Thank you!)

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
> Attachments: 13745-trunk.txt
>
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Updated] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ihar Kukharchuk updated CASSANDRA-13745:

Attachment: 13745-trunk.txt

Hi,
sorry to bother you, but do I need extra permissions to assign the issue to myself 
[I don't have access to the assign button right now]?
I've already attached a kind of patch, but I still have several questions because 
I'm new here and may be missing the spirit of C* development:
1. Does C* really need this functionality? [The question exists because the provided 
solution changes the schema of the system.compaction_history table, and there are 
several cases where it is not appropriate to do that.]
2. I assume that "compaction_started_at" may not be a really good name, so 
should I think about a more descriptive one? Or maybe someone has a better name 
for that column off the top of their head.
3. The current solution shows the "unix start time" instead of the compaction start 
time when it is not present [via nodetool compactionhistory]. Are you ok with that 
approach, or would it be better to output a special symbol, for example "-"?
4. As far as I investigated, C* performs a "force" update for system tables, so I 
should not need to think about any special upgrade utility when providing such a 
patch, am I right?
Thank you!

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
> Attachments: 13745-trunk.txt
>
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Updated] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ihar Kukharchuk updated CASSANDRA-13745:

Status: Ready to Commit  (was: Patch Available)

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Updated] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ihar Kukharchuk updated CASSANDRA-13745:

Status: Open  (was: Ready to Commit)

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Updated] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ihar Kukharchuk updated CASSANDRA-13745:

Reproduced In: 3.11.0
   Status: Patch Available  (was: Open)

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Commented] (CASSANDRA-13745) Compaction History would be beneficial to include completion timestamp

2017-08-14 Thread Ihar Kukharchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125918#comment-16125918
 ] 

Ihar Kukharchuk commented on CASSANDRA-13745:
-

Hi,
sorry to bother you, but do I need extra permissions to assign the issue to myself 
[I don't have access to the assign button right now]?
I've already submitted a kind of patch, but I still have several questions 
because I'm new here and may be missing the spirit of C* development:
1. Does C* really need this functionality? [The question exists because the provided 
solution changes the schema of the system.compaction_history table, and there are 
several cases where it is not appropriate to do that.]
2. I assume that "compaction_started_at" may not be a really good name, so 
should I think about a more descriptive one? Or maybe someone has a better name 
for that column off the top of their head.
3. The current solution shows the "unix start time" instead of the compaction start 
time when it is not present [via nodetool compactionhistory]. Are you ok with that 
approach, or would it be better to output a special symbol, for example "-"?
4. As far as I investigated, C* performs a "force" update for system tables, so I 
should not need to think about any special upgrade utility when providing such a 
patch, am I right?
Thank you!

> Compaction History would be beneficial to include completion timestamp 
> ---
>
> Key: CASSANDRA-13745
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13745
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction, Core
>Reporter: Richard Andersen
>Priority: Minor
>  Labels: lhf
>
> Compaction history does not currently contain the completion time stamp which 
> can be beneficial in determining performance and event tracing. I would like 
> to use this information also in our Health Check process to trace event 
> timelines. 






[jira] [Commented] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-08-14 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125902#comment-16125902
 ] 

Marcus Eriksson commented on CASSANDRA-13576:
-

Not sure what happened here [~ifesdjeen], why did you remove your branch and comment?

Anyway, [here|https://github.com/krummas/cassandra/tree/marcuse/13576] is a 
patch to not optimise if rf == 1.

dtest run:
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/182

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Marcus Eriksson
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 16.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to 

[jira] [Updated] (CASSANDRA-13363) java.lang.ArrayIndexOutOfBoundsException: null

2017-08-14 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13363:
--
Status: Open  (was: Patch Available)

> java.lang.ArrayIndexOutOfBoundsException: null
> --
>
> Key: CASSANDRA-13363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13363
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 6, Cassandra 3.10
>Reporter: Artem Rokhin
>Assignee: zhaoyan
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Constantly see this error in the log without any additional information or a 
> stack trace.
> {code}
> Exception in thread Thread[MessagingService-Incoming-/10.0.1.26,5,main]
> {code}
> {code}
> java.lang.ArrayIndexOutOfBoundsException: null
> {code}
> Logger: org.apache.cassandra.service.CassandraDaemon
> Thread: MessagingService-Incoming-/10.0.1.12
> Method: uncaughtException
> File: CassandraDaemon.java
> Line: 229






[jira] [Commented] (CASSANDRA-13363) java.lang.ArrayIndexOutOfBoundsException: null

2017-08-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125838#comment-16125838
 ] 

Aleksey Yeschenko commented on CASSANDRA-13363:
---

The problem is real, and the patch does work, but I'm afraid it doesn't quite 
solve the issue completely.

There is also an issue involving {{serializedSize()}}. The {{index}} field might be 
empty at the time we calculate the size of the message, and switch to non-empty 
afterwards. It's not currently an issue, since {{READ_COMMAND}} is not using a 
{{CallbackDeterminedSerializer}}, but it's still a bug.

The proper fix would be to make sure the field never changes and is only set 
once at construction time. While at it, we might also want to refactor it to not 
be {{Optional}}: {{Optional}} is reserved for return types, not object fields 
and method arguments.

Give me a couple of hours to try to work it out properly?
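
A generic sketch of the shape of that fix (hypothetical names, not the actual 
ReadCommand code): make the field final and set it once at construction, so 
{{serializedSize()}} can never disagree with what later gets written.

{code:java}
public final class CommandWithIndexSketch
{
    // set exactly once at construction; never flips from absent to present later,
    // so the computed serialized size stays consistent with what gets serialized
    private final String indexName; // nullable field instead of a mutable Optional

    CommandWithIndexSketch(String indexName)
    {
        this.indexName = indexName;
    }

    long serializedSize()
    {
        long size = 4; // fixed header, for illustration only
        if (indexName != null)
            size += indexName.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
        return size;
    }
}
{code}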

> java.lang.ArrayIndexOutOfBoundsException: null
> --
>
> Key: CASSANDRA-13363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13363
> Project: Cassandra
>  Issue Type: Bug
> Environment: CentOS 6, Cassandra 3.10
>Reporter: Artem Rokhin
>Assignee: zhaoyan
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Constantly see this error in the log without any additional information or a 
> stack trace.
> {code}
> Exception in thread Thread[MessagingService-Incoming-/10.0.1.26,5,main]
> {code}
> {code}
> java.lang.ArrayIndexOutOfBoundsException: null
> {code}
> Logger: org.apache.cassandra.service.CassandraDaemon
> Thread: MessagingService-Incoming-/10.0.1.12
> Method: uncaughtException
> File: CassandraDaemon.java
> Line: 229






[jira] [Assigned] (CASSANDRA-13576) test failure in bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test

2017-08-14 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reassigned CASSANDRA-13576:
---

Assignee: Marcus Eriksson

> test failure in 
> bootstrap_test.TestBootstrap.consistent_range_movement_false_with_rf1_should_succeed_test
> -
>
> Key: CASSANDRA-13576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13576
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Hamm
>Assignee: Marcus Eriksson
>  Labels: dtest, test-failure
> Attachments: node1_debug.log, node1_gc.log, node1.log, 
> node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, 
> node3.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_offheap_dtest/445/testReport/bootstrap_test/TestBootstrap/consistent_range_movement_false_with_rf1_should_succeed_test
> {noformat}
> Error Message
> 31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL clients']:
> INFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.
> See system.log for remainder
> {noformat}
> {noformat}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 236, in 
> consistent_range_movement_false_with_rf1_should_succeed_test
> self._bootstrap_test_with_replica_down(False, rf=1)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 278, in 
> _bootstrap_test_with_replica_down
> 
> jvm_args=["-Dcassandra.consistent.rangemovement={}".format(consistent_range_movement)])
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 696, in start
> self.wait_for_binary_interface(from_mark=self.mark)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 514, in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
>   File 
> "/home/automaton/venv/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 471, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "31 May 2017 04:28:09 [node3] Missing: ['Starting listening for CQL 
> clients']:\nINFO  [main] 2017-05-31 04:18:01,615 YamlConfigura.\n
> {noformat}
> {noformat}
>  >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-PKphwD\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '32',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\ncassandra.policies: INFO: Using datacenter 'datacenter1' for 
> DCAwareRoundRobinPolicy (via host '127.0.0.1'); if incorrect, please specify 
> a local_dc to the constructor, or limit contact points to local cluster 
> nodes\ncassandra.cluster: INFO: New Cassandra host  datacenter1> discovered\ncassandra.protocol: WARNING: Server warning: When 
> increasing replication factor you need to run a full (-full) repair to 
> distribute the data.\ncassandra.connection: WARNING: Heartbeat failed for 
> connection (139927174110160) to 127.0.0.2\ncassandra.cluster: WARNING: Host 
> 127.0.0.2 has been marked down\ncassandra.pool: WARNING: Error attempting to 
> reconnect to 127.0.0.2, scheduling retry in 2.0 seconds: [Errno 111] Tried 
> connecting to [('127.0.0.2', 9042)]. Last error: Connection 
> refused\ncassandra.pool: WARNING: Error attempting to reconnect to 127.0.0.2, 
> scheduling retry in 4.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 8.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 16.0 seconds: [Errno 111] Tried connecting to 
> [('127.0.0.2', 9042)]. Last error: Connection refused\ncassandra.pool: 
> WARNING: Error attempting to reconnect to 127.0.0.2, scheduling retry in 32.0 
> seconds: [Errno 111] Tried connecting to [('127.0.0.2', 9042)]. Last error: 
> Connection refused\ncassandra.pool: WARNING: Error attempting to reconnect to 
> 127.0.0.2, scheduling retry in 64.0 seconds: [Errno 111] Tried connecting to 
> 

[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125750#comment-16125750
 ] 

Aleksey Yeschenko commented on CASSANDRA-12884:
---

bq. Oh, and not to nitpick, but any reason to prefer {{otherRack.sublist(2, 
otherRack.size()).clear(); return otherRack();}} to {{return 
otherRack.sublist(0,2);}}?

It's very slightly cheaper as it won't in practice create an extra object. But 
here it doesn't really matter, and the latter, I agree, reads better. Swapped.

Committed to 3.0 as 
[c2b635ac240ae8d9375fd96791a5aba903a94498|https://github.com/apache/cassandra/commit/c2b635ac240ae8d9375fd96791a5aba903a94498]
 and merged into 3.11 and trunk.

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.15, 3.11.1
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
>  if (validated.size() - validated.get(localRack).size() >= 2)
>  {
> // we have enough endpoints in other racks
> validated.removeAll(localRack);
>   }
>  if (validated.keySet().size() == 1)
>  {
>// we have only 1 `other` rack
>Collection otherRack = 
> Iterables.getOnlyElement(validated.asMap().values());
>
> return Lists.newArrayList(Iterables.limit(otherRack, 2));
>  }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.






[jira] [Updated] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-14 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12884:
--
   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 3.0.x)
   3.11.1
   3.0.15
   Status: Resolved  (was: Patch Available)

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.15, 3.11.1
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
>  if (validated.size() - validated.get(localRack).size() >= 2)
>  {
> // we have enough endpoints in other racks
> validated.removeAll(localRack);
>   }
>  if (validated.keySet().size() == 1)
>  {
>// we have only 1 `other` rack
>Collection otherRack = 
> Iterables.getOnlyElement(validated.asMap().values());
>
> return Lists.newArrayList(Iterables.limit(otherRack, 2));
>  }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.






[3/6] cassandra git commit: Randomize batchlog endpoint selection with only 1 or 2 racks

2017-08-14 Thread aleksey
Randomize batchlog endpoint selection with only 1 or 2 racks

patch by Daniel Cranford; reviewed by Aleksey Yeschenko for
CASSANDRA-12884


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2b635ac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2b635ac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2b635ac

Branch: refs/heads/trunk
Commit: c2b635ac240ae8d9375fd96791a5aba903a94498
Parents: fab3845
Author: dcranford 
Authored: Wed Aug 9 10:20:03 2017 -0400
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:23:09 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2e9e8ad..358dd04 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Randomize batchlog endpoint selection with only 1 or 2 racks 
(CASSANDRA-12884)
  * Fix digest calculation for counter cells (CASSANDRA-13750)
  * Fix ColumnDefinition.cellValueType() for non-frozen collection and change 
SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  * Skip materialized view addition if the base table doesn't exist 
(CASSANDRA-13737)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java 
b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
index f5133bb..b614fc5 100644
--- a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
@@ -523,9 +523,14 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 
 if (validated.keySet().size() == 1)
 {
-// we have only 1 `other` rack
-Collection otherRack = 
Iterables.getOnlyElement(validated.asMap().values());
-return Lists.newArrayList(Iterables.limit(otherRack, 2));
+/*
+ * we have only 1 `other` rack to select replicas from 
(whether it be the local rack or a single non-local rack)
+ * pick two random nodes from there; we are guaranteed to have 
at least two nodes in the single remaining rack
+ * because of the preceding if block.
+ */
+List otherRack = 
Lists.newArrayList(validated.values());
+shuffle(otherRack);
+return otherRack.subList(0, 2);
 }
 
 // randomize which racks we pick from if more than 2 remaining
@@ -537,7 +542,7 @@ public class BatchlogManager implements BatchlogManagerMBean
 else
 {
 racks = Lists.newArrayList(validated.keySet());
-Collections.shuffle((List) racks);
+shuffle((List) racks);
 }
 
 // grab a random member of up to two racks
@@ -562,5 +567,11 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 {
 return ThreadLocalRandom.current().nextInt(bound);
 }
+
+@VisibleForTesting
+protected void shuffle(List list)
+{
+Collections.shuffle(list);
+}
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java 
b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
index 23aeaaa..7db1cfa 100644
--- a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
+++ b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
@@ -20,7 +20,9 @@ package org.apache.cassandra.batchlog;
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashSet;
+import java.util.List;
 
 import com.google.common.collect.ImmutableMultimap;
 import com.google.common.collect.Multimap;
@@ -87,8 +89,28 @@ public class BatchlogEndpointFilterTest
 .put("1", InetAddress.getByName("111"))
 .build();
 Collection result = new TestEndpointFilter(LOCAL, 
endpoints).filter();
-// result should 

[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-14 Thread aleksey
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/db57cbdd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/db57cbdd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/db57cbdd

Branch: refs/heads/trunk
Commit: db57cbddc390c7af4d824962686ab6f6a0b3d079
Parents: 1884dbe c2b635a
Author: Aleksey Yeschenko 
Authored: Mon Aug 14 15:25:59 2017 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:25:59 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/db57cbdd/CHANGES.txt
--
diff --cc CHANGES.txt
index c672675,358dd04..5403812
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,9 -1,5 +1,10 @@@
 -3.0.15
 +3.11.1
 + * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
 + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
 + * Duplicate the buffer before passing it to analyser in SASI operation 
(CASSANDRA-13512)
 + * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 +Merged from 3.0:
+  * Randomize batchlog endpoint selection with only 1 or 2 racks 
(CASSANDRA-12884)
   * Fix digest calculation for counter cells (CASSANDRA-13750)
   * Fix ColumnDefinition.cellValueType() for non-frozen collection and change 
SSTabledump to use type.toJSONString() (CASSANDRA-13573)
   * Skip materialized view addition if the base table doesn't exist 
(CASSANDRA-13737)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/db57cbdd/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--





[1/6] cassandra git commit: Randomize batchlog endpoint selection with only 1 or 2 racks

2017-08-14 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 fab384560 -> c2b635ac2
  refs/heads/cassandra-3.11 1884dbe28 -> db57cbddc
  refs/heads/trunk ff06424fa -> 99e5f7efc


Randomize batchlog endpoint selection with only 1 or 2 racks

patch by Daniel Cranford; reviewed by Aleksey Yeschenko for
CASSANDRA-12884


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2b635ac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2b635ac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2b635ac

Branch: refs/heads/cassandra-3.0
Commit: c2b635ac240ae8d9375fd96791a5aba903a94498
Parents: fab3845
Author: dcranford 
Authored: Wed Aug 9 10:20:03 2017 -0400
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:23:09 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2e9e8ad..358dd04 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Randomize batchlog endpoint selection with only 1 or 2 racks 
(CASSANDRA-12884)
  * Fix digest calculation for counter cells (CASSANDRA-13750)
  * Fix ColumnDefinition.cellValueType() for non-frozen collection and change 
SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  * Skip materialized view addition if the base table doesn't exist 
(CASSANDRA-13737)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java 
b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
index f5133bb..b614fc5 100644
--- a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
@@ -523,9 +523,14 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 
 if (validated.keySet().size() == 1)
 {
-// we have only 1 `other` rack
-Collection otherRack = 
Iterables.getOnlyElement(validated.asMap().values());
-return Lists.newArrayList(Iterables.limit(otherRack, 2));
+/*
+ * we have only 1 `other` rack to select replicas from 
(whether it be the local rack or a single non-local rack)
+ * pick two random nodes from there; we are guaranteed to have 
at least two nodes in the single remaining rack
+ * because of the preceding if block.
+ */
+List otherRack = 
Lists.newArrayList(validated.values());
+shuffle(otherRack);
+return otherRack.subList(0, 2);
 }
 
 // randomize which racks we pick from if more than 2 remaining
@@ -537,7 +542,7 @@ public class BatchlogManager implements BatchlogManagerMBean
 else
 {
 racks = Lists.newArrayList(validated.keySet());
-Collections.shuffle((List) racks);
+shuffle((List) racks);
 }
 
 // grab a random member of up to two racks
@@ -562,5 +567,11 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 {
 return ThreadLocalRandom.current().nextInt(bound);
 }
+
+@VisibleForTesting
+protected void shuffle(List list)
+{
+Collections.shuffle(list);
+}
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java 
b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
index 23aeaaa..7db1cfa 100644
--- a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
+++ b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
@@ -20,7 +20,9 @@ package org.apache.cassandra.batchlog;
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashSet;
+import java.util.List;
 
 import com.google.common.collect.ImmutableMultimap;
 import com.google.common.collect.Multimap;
@@ -87,8 +89,28 @@ public class 

[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2017-08-14 Thread aleksey
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/db57cbdd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/db57cbdd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/db57cbdd

Branch: refs/heads/cassandra-3.11
Commit: db57cbddc390c7af4d824962686ab6f6a0b3d079
Parents: 1884dbe c2b635a
Author: Aleksey Yeschenko 
Authored: Mon Aug 14 15:25:59 2017 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:25:59 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/db57cbdd/CHANGES.txt
--
diff --cc CHANGES.txt
index c672675,358dd04..5403812
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,9 -1,5 +1,10 @@@
 -3.0.15
 +3.11.1
 + * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
 + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
 + * Duplicate the buffer before passing it to analyser in SASI operation 
(CASSANDRA-13512)
 + * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 +Merged from 3.0:
+  * Randomize batchlog endpoint selection with only 1 or 2 racks 
(CASSANDRA-12884)
   * Fix digest calculation for counter cells (CASSANDRA-13750)
   * Fix ColumnDefinition.cellValueType() for non-frozen collection and change 
SSTabledump to use type.toJSONString() (CASSANDRA-13573)
   * Skip materialized view addition if the base table doesn't exist 
(CASSANDRA-13737)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/db57cbdd/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[2/6] cassandra git commit: Randomize batchlog endpoint selection with only 1 or 2 racks

2017-08-14 Thread aleksey
Randomize batchlog endpoint selection with only 1 or 2 racks

patch by Daniel Cranford; reviewed by Aleksey Yeschenko for
CASSANDRA-12884


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2b635ac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2b635ac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2b635ac

Branch: refs/heads/cassandra-3.11
Commit: c2b635ac240ae8d9375fd96791a5aba903a94498
Parents: fab3845
Author: dcranford 
Authored: Wed Aug 9 10:20:03 2017 -0400
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:23:09 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2e9e8ad..358dd04 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.15
+ * Randomize batchlog endpoint selection with only 1 or 2 racks 
(CASSANDRA-12884)
  * Fix digest calculation for counter cells (CASSANDRA-13750)
  * Fix ColumnDefinition.cellValueType() for non-frozen collection and change 
SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  * Skip materialized view addition if the base table doesn't exist 
(CASSANDRA-13737)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java 
b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
index f5133bb..b614fc5 100644
--- a/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
@@ -523,9 +523,14 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 
 if (validated.keySet().size() == 1)
 {
-// we have only 1 `other` rack
-Collection otherRack = 
Iterables.getOnlyElement(validated.asMap().values());
-return Lists.newArrayList(Iterables.limit(otherRack, 2));
+/*
+ * we have only 1 `other` rack to select replicas from 
(whether it be the local rack or a single non-local rack)
+ * pick two random nodes from there; we are guaranteed to have 
at least two nodes in the single remaining rack
+ * because of the preceding if block.
+ */
+List otherRack = 
Lists.newArrayList(validated.values());
+shuffle(otherRack);
+return otherRack.subList(0, 2);
 }
 
 // randomize which racks we pick from if more than 2 remaining
@@ -537,7 +542,7 @@ public class BatchlogManager implements BatchlogManagerMBean
 else
 {
 racks = Lists.newArrayList(validated.keySet());
-Collections.shuffle((List) racks);
+shuffle((List) racks);
 }
 
 // grab a random member of up to two racks
@@ -562,5 +567,11 @@ public class BatchlogManager implements 
BatchlogManagerMBean
 {
 return ThreadLocalRandom.current().nextInt(bound);
 }
+
+@VisibleForTesting
+protected void shuffle(List list)
+{
+Collections.shuffle(list);
+}
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2b635ac/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java 
b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
index 23aeaaa..7db1cfa 100644
--- a/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
+++ b/test/unit/org/apache/cassandra/batchlog/BatchlogEndpointFilterTest.java
@@ -20,7 +20,9 @@ package org.apache.cassandra.batchlog;
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashSet;
+import java.util.List;
 
 import com.google.common.collect.ImmutableMultimap;
 import com.google.common.collect.Multimap;
@@ -87,8 +89,28 @@ public class BatchlogEndpointFilterTest
 .put("1", InetAddress.getByName("111"))
 .build();
 Collection result = new TestEndpointFilter(LOCAL, 
endpoints).filter();
-// 

[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-08-14 Thread aleksey
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/99e5f7ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/99e5f7ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/99e5f7ef

Branch: refs/heads/trunk
Commit: 99e5f7efc33fb3672e11dfba9f2521d09473dddf
Parents: ff06424 db57cbd
Author: Aleksey Yeschenko 
Authored: Mon Aug 14 15:28:22 2017 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Aug 14 15:28:22 2017 +0100

--
 CHANGES.txt |  1 +
 .../cassandra/batchlog/BatchlogManager.java | 19 ---
 .../batchlog/BatchlogEndpointFilterTest.java| 33 ++--
 3 files changed, 47 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/99e5f7ef/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/99e5f7ef/src/java/org/apache/cassandra/batchlog/BatchlogManager.java
--


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-8735) Batch log replication is not randomized when there are only 2 racks

2017-08-14 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119979#comment-16119979
 ] 

Daniel Cranford edited comment on CASSANDRA-8735 at 8/14/17 2:24 PM:
-

[~iamaleksey] Great, I didn't see any activity yet on CASSANDRA-12884, so I 
attached a patch there.


was (Author: daniel.cranford):
[~iamaleksey] Great, I didn't see any activity yet on CASSANDRA-12844, so I 
attached a patch there.

> Batch log replication is not randomized when there are only 2 racks
> ---
>
> Key: CASSANDRA-8735
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8735
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yuki Morishita
>Assignee: Mihai Suteu
>Priority: Minor
> Fix For: 2.1.9, 2.2.1, 3.0 alpha 1
>
> Attachments: 8735-v2.patch, CASSANDRA-8735.patch
>
>
> Batch log replication is not randomized and the same 2 nodes can be picked up 
> when there are only 2 racks in the cluster.
> https://github.com/apache/cassandra/blob/cassandra-2.0.11/src/java/org/apache/cassandra/service/BatchlogEndpointSelector.java#L72-73



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-14 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125719#comment-16125719
 ] 

Daniel Cranford commented on CASSANDRA-12884:
-

Oh, and not to nitpick, but any reason to prefer
{{otherRack.subList(2, otherRack.size()).clear(); return otherRack;}} to 
{{return otherRack.subList(0, 2);}}?

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
>  if (validated.size() - validated.get(localRack).size() >= 2)
>  {
> // we have enough endpoints in other racks
> validated.removeAll(localRack);
>   }
>  if (validated.keySet().size() == 1)
>  {
>// we have only 1 `other` rack
>Collection otherRack = 
> Iterables.getOnlyElement(validated.asMap().values());
>
> return Lists.newArrayList(Iterables.limit(otherRack, 2));
>  }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-14 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125710#comment-16125710
 ] 

Daniel Cranford commented on CASSANDRA-12884:
-

I had originally considered using subList to avoid creating a second ArrayList, 
but decided against it because the subList version throws an exception in the 
degenerate case where there is only 1 element in otherRack.

But now that I trace through the code, I think that otherRack is guaranteed to 
have at least 2 elements. If otherRack is the local rack and only has 1 
element, {{if(validated.size() <= 2)}} would have been true, and the filter() 
function would have already returned. If otherRack was the single non-local 
rack, and had size 1, then {{if(validated.size() - 
validated.get(localRack).size() >= 2)}} would be false and the whole 
single-other-rack block wouldn't run. It's probably worth a comment stating 
that otherRack is guaranteed to have at least 2 elements.

Looks good!
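
To make that argument concrete, here is a condensed, hypothetical sketch of the 
relevant part of {{filter()}} (not the actual method; the names and the early-return 
guard follow the snippets discussed in this ticket):

{code:java}
import java.net.InetAddress;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

import com.google.common.collect.Lists;
import com.google.common.collect.Multimap;

public class FilterSketch
{
    // Illustrates why otherRack always has at least 2 elements in the single-rack branch.
    static Collection<InetAddress> filter(Multimap<String, InetAddress> validated, String localRack)
    {
        if (validated.size() <= 2)
            return validated.values(); // too few endpoints overall: return them all

        if (validated.size() - validated.get(localRack).size() >= 2)
            validated.removeAll(localRack); // enough endpoints outside the local rack

        if (validated.keySet().size() == 1)
        {
            // Only one rack is left. If it is the local rack, the removal above did not
            // run, so validated.size() > 2 and the rack holds at least 3 endpoints. If it
            // is a non-local rack, the removal required >= 2 non-local endpoints, all of
            // which live in this rack. Either way, subList(0, 2) cannot throw.
            List<InetAddress> otherRack = Lists.newArrayList(validated.values());
            Collections.shuffle(otherRack);
            return otherRack.subList(0, 2);
        }

        // multi-rack handling omitted in this sketch
        return validated.values();
    }
}
{code}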

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
>  if (validated.size() - validated.get(localRack).size() >= 2)
>  {
> // we have enough endpoints in other racks
> validated.removeAll(localRack);
>   }
>  if (validated.keySet().size() == 1)
>  {
>// we have only 1 `other` rack
>Collection otherRack = 
> Iterables.getOnlyElement(validated.asMap().values());
>
> return Lists.newArrayList(Iterables.limit(otherRack, 2));
>  }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12655) Incremental repair & compaction hang on random nodes

2017-08-14 Thread Michael Guissine (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125703#comment-16125703
 ] 

Michael Guissine commented on CASSANDRA-12655:
--

FWIW, we are seeing very similar if not identical behavior after upgrading to 
DSE 5.1.1 (Cassandra 3.10). So far we haven't found any solution to the problem 
other than bouncing the nodes. 

> Incremental repair & compaction hang on random nodes
> 
>
> Key: CASSANDRA-12655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12655
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: CentOS Linux release 7.1.1503 (Core)
> RAM - 64GB
> HEAP - 16GB
> Load on each node - ~5GB
> Cassandra Version - 2.2.5
>Reporter: Navjyot Nishant
>
> Hi, we are setting up incremental repair on our 18-node cluster. Avg load on 
> each node is ~5GB. The repair runs fine on a couple of nodes and then suddenly 
> gets stuck on random nodes. Upon checking the system.log of an impacted node we 
> don't see much information.
> Following are the lines we see in system.log, and they have been there from the 
> point the repair stopped making progress -
> {code}
> INFO  [CompactionExecutor:3490] 2016-09-16 11:14:44,236 
> CompactionManager.java:1221 - Anticompacting 
> [BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30832-big-Data.db'),
>  
> BigTableReader(path='/cassandra/data/gccatlgsvcks/message_backup-cab0485008ed11e5bfed452cdd54652d/la-30811-big-Data.db')]
> INFO  [IndexSummaryManager:1] 2016-09-16 11:14:49,954 
> IndexSummaryRedistribution.java:74 - Redistributing index summaries
> INFO  [IndexSummaryManager:1] 2016-09-16 12:14:49,961 
> IndexSummaryRedistribution.java:74 - Redistributing index summaries
> {code}
> When we try to see pending compactions by executing {code}nodetool 
> compactionstats{code}, it hangs as well and doesn't return anything. However, 
> {code}nodetool tpstats{code} shows active and pending compactions which never 
> come down and keep increasing. 
> {code}
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> MutationStage 0 0 221208 0
>  0
> ReadStage 0 01288839 0
>  0
> RequestResponseStage  0 0 104356 0
>  0
> ReadRepairStage   0 0 72 0
>  0
> CounterMutationStage  0 0  0 0
>  0
> HintedHandoff 0 0 46 0
>  0
> MiscStage 0 0  0 0
>  0
> CompactionExecutor866  68124 0
>  0
> MemtableReclaimMemory 0 0166 0
>  0
> PendingRangeCalculator0 0 38 0
>  0
> GossipStage   0 0 242455 0
>  0
> MigrationStage0 0  0 0
>  0
> MemtablePostFlush 0 0   3682 0
>  0
> ValidationExecutor0 0   2246 0
>  0
> Sampler   0 0  0 0
>  0
> MemtableFlushWriter   0 0166 0
>  0
> InternalResponseStage 0 0   8866 0
>  0
> AntiEntropyStage  0 0  15417 0
>  0
> Repair#7  0 0160 0
>  0
> CacheCleanupExecutor  0 0  0 0
>  0
> Native-Transport-Requests 0 0 327334 0
>  0
> Message type   Dropped
> READ 0
> RANGE_SLICE  0
> _TRACE   0
> MUTATION 0
> COUNTER_MUTATION 0
> REQUEST_RESPONSE 0
> PAGED_RANGE  0
> READ_REPAIR  0
> {code}
> {code}nodetool netstats{code} shows some pending messages which never get 
> processed and nothing in progress -
> {code}
> Mode: NORMAL
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 15585
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool NameActive   Pending  Completed
> Large messages  n/a12562
> Small 

[jira] [Commented] (CASSANDRA-12148) Improve determinism of CDC data availability

2017-08-14 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125701#comment-16125701
 ] 

Branimir Lambov commented on CASSANDRA-12148:
-

LGTM

> Improve determinism of CDC data availability
> 
>
> Key: CASSANDRA-12148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12148
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
> Fix For: 4.x
>
>
> The latency with which CDC data becomes available has a known limitation due 
> to our reliance on CommitLogSegments being discarded to have the data 
> available in cdc_raw: if a slowly written table co-habitates a 
> CommitLogSegment with CDC data, the CommitLogSegment won't be flushed until 
> we hit either memory pressure on memtables or CommitLog limit pressure. 
> Ultimately, this leaves a non-deterministic element to when data becomes 
> available for CDC consumption unless a consumer parses live CommitLogSegments.
> To work around this limitation and make semi-realtime CDC consumption more 
> friendly to end-users, I propose we extend CDC as follows:
> h6. High level:
> * Consumers parse hard links of active CommitLogSegments in cdc_raw instead 
> of waiting for flush/discard and file move
> * C* stores an offset of the highest seen CDC mutation in a separate idx file 
> per commit log segment in cdc_raw. Clients tail this index file, delta their 
> local last parsed offset on change, and parse the corresponding commit log 
> segment using their last parsed offset as min
> * C* flags that index file with an offset and DONE when the file is flushed 
> so clients know when they can clean up
> h6. Details:
> * On creation of a CommitLogSegment, also hard-link the file in cdc_raw
> * On first write of a CDC-enabled mutation to a segment, we:
> ** Flag it as {{CDCState.CONTAINS}}
> ** Set a long tracking the {{CommitLogPosition}} of the 1st CDC-enabled 
> mutation in the log
> ** Set a long in the CommitLogSegment tracking the offset of the end of the 
> last written CDC mutation in the segment if higher than the previously known 
> highest CDC offset
> * On subsequent writes to the segment, we update the offset of the highest 
> known CDC data
> * On CommitLogSegment fsync, we write a file in cdc_raw as 
> _cdc.idx containing the min offset and end offset fsynced to 
> disk per file
> * On segment discard, if CDCState == {{CDCState.PERMITTED}}, delete both the 
> segment in commitlog and in cdc_raw
> * On segment discard, if CDCState == {{CDCState.CONTAINS}}, delete the 
> segment in commitlog and update the _cdc.idx file w/end offset 
> and a DONE marker
> * On segment replay, store the highest end offset of seen CDC-enabled 
> mutations from a segment and write that to _cdc.idx on 
> completion of segment replay. This should bridge the potential correctness 
> gap of a node writing to a segment and then dying before it can write the 
> _cdc.idx file.
> This should allow clients to skip the beginning of a file to the 1st CDC 
> mutation, track an offset of how far they've parsed, delta against the 
> _cdc.idx file end offset, and use that as a determinant on when to parse new 
> CDC data. Any existing clients written to the initial implementation of CDC 
> need only add the _cdc.idx logic and checking for DONE marker 
> to their code, so the burden on users to update to support this should be 
> quite small for the benefit of having data available as soon as it's fsynced 
> instead of at a non-deterministic time when potentially unrelated tables are 
> flushed.
> Finally, we should look into extending the interface on CommitLogReader to be 
> more friendly for realtime parsing, perhaps supporting taking a 
> CommitLogDescriptor and RandomAccessReader and resuming readSection calls, 
> assuming the reader is at the start of a SyncSegment. Would probably also 
> need to rewind to the start of the segment before returning so subsequent 
> calls would respect this contract. This would skip needing to deserialize the 
> descriptor and all completed SyncSegments to get to the root of the desired 
> segment for parsing.
> One alternative we discussed offline - instead of just storing the highest 
> seen CDC offset, we could instead store an offset per CDC mutation 
> (potentially delta encoded) in the idx file to allow clients to seek and only 
> parse the mutations with CDC enabled. My hunch is that the performance delta 
> from doing so wouldn't justify the complexity given the SyncSegment 
> deserialization and seeking restrictions in the compressed and encrypted 
> cases as mentioned above.
> The only complication I can think of with the above design is uncompressed 
> mmapped CommitLogSegments on Windows being undeletable, but it'd be pretty 
> simple to 
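
As an illustration of the consumption loop described above, a client tailing the 
proposed per-segment index file might look roughly like the following (hypothetical 
sketch; the file layout and naming are assumptions based on this description, not an 
implemented format):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical CDC consumer loop. Assumes each <segment>_cdc.idx holds the latest
// fsynced end offset on its first line and an optional "DONE" marker on the second.
public class CdcIdxTailer
{
    private long lastParsedOffset = 0;

    public boolean poll(Path idxFile, Path segmentFile) throws IOException
    {
        List<String> lines = Files.readAllLines(idxFile);
        if (lines.isEmpty())
            return false;

        long fsyncedOffset = Long.parseLong(lines.get(0).trim());
        boolean done = lines.size() > 1 && "DONE".equalsIgnoreCase(lines.get(1).trim());

        if (fsyncedOffset > lastParsedOffset)
        {
            // Hand the newly durable byte range of the hard-linked segment to the
            // application's commit log parsing code (placeholder below).
            parseMutations(segmentFile, lastParsedOffset, fsyncedOffset);
            lastParsedOffset = fsyncedOffset;
        }

        // Once DONE is seen and everything up to the final offset is parsed,
        // the consumer may clean up its copy of the segment.
        return done && lastParsedOffset >= fsyncedOffset;
    }

    private void parseMutations(Path segmentFile, long fromOffset, long toOffset)
    {
        // application-specific: e.g. drive a commit log reader over [fromOffset, toOffset)
    }
}
{code}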

[jira] [Updated] (CASSANDRA-12148) Improve determinism of CDC data availability

2017-08-14 Thread Branimir Lambov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-12148:

Status: Ready to Commit  (was: Patch Available)

> Improve determinism of CDC data availability
> 
>
> Key: CASSANDRA-12148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12148
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
> Fix For: 4.x
>
>
> The latency with which CDC data becomes available has a known limitation due 
> to our reliance on CommitLogSegments being discarded to have the data 
> available in cdc_raw: if a slowly written table co-habitates a 
> CommitLogSegment with CDC data, the CommitLogSegment won't be flushed until 
> we hit either memory pressure on memtables or CommitLog limit pressure. 
> Ultimately, this leaves a non-deterministic element to when data becomes 
> available for CDC consumption unless a consumer parses live CommitLogSegments.
> To work around this limitation and make semi-realtime CDC consumption more 
> friendly to end-users, I propose we extend CDC as follows:
> h6. High level:
> * Consumers parse hard links of active CommitLogSegments in cdc_raw instead 
> of waiting for flush/discard and file move
> * C* stores an offset of the highest seen CDC mutation in a separate idx file 
> per commit log segment in cdc_raw. Clients tail this index file, delta their 
> local last parsed offset on change, and parse the corresponding commit log 
> segment using their last parsed offset as min
> * C* flags that index file with an offset and DONE when the file is flushed 
> so clients know when they can clean up
> h6. Details:
> * On creation of a CommitLogSegment, also hard-link the file in cdc_raw
> * On first write of a CDC-enabled mutation to a segment, we:
> ** Flag it as {{CDCState.CONTAINS}}
> ** Set a long tracking the {{CommitLogPosition}} of the 1st CDC-enabled 
> mutation in the log
> ** Set a long in the CommitLogSegment tracking the offset of the end of the 
> last written CDC mutation in the segment if higher than the previously known 
> highest CDC offset
> * On subsequent writes to the segment, we update the offset of the highest 
> known CDC data
> * On CommitLogSegment fsync, we write a file in cdc_raw as 
> _cdc.idx containing the min offset and end offset fsynced to 
> disk per file
> * On segment discard, if CDCState == {{CDCState.PERMITTED}}, delete both the 
> segment in commitlog and in cdc_raw
> * On segment discard, if CDCState == {{CDCState.CONTAINS}}, delete the 
> segment in commitlog and update the _cdc.idx file w/end offset 
> and a DONE marker
> * On segment replay, store the highest end offset of seen CDC-enabled 
> mutations from a segment and write that to _cdc.idx on 
> completion of segment replay. This should bridge the potential correctness 
> gap of a node writing to a segment and then dying before it can write the 
> _cdc.idx file.
> This should allow clients to skip the beginning of a file to the 1st CDC 
> mutation, track an offset of how far they've parsed, delta against the 
> _cdc.idx file end offset, and use that as a determinant on when to parse new 
> CDC data. Any existing clients written to the initial implementation of CDC 
> need only add the _cdc.idx logic and checking for DONE marker 
> to their code, so the burden on users to update to support this should be 
> quite small for the benefit of having data available as soon as it's fsynced 
> instead of at a non-deterministic time when potentially unrelated tables are 
> flushed.
> Finally, we should look into extending the interface on CommitLogReader to be 
> more friendly for realtime parsing, perhaps supporting taking a 
> CommitLogDescriptor and RandomAccessReader and resuming readSection calls, 
> assuming the reader is at the start of a SyncSegment. Would probably also 
> need to rewind to the start of the segment before returning so subsequent 
> calls would respect this contract. This would skip needing to deserialize the 
> descriptor and all completed SyncSegments to get to the root of the desired 
> segment for parsing.
> One alternative we discussed offline - instead of just storing the highest 
> seen CDC offset, we could instead store an offset per CDC mutation 
> (potentially delta encoded) in the idx file to allow clients to seek and only 
> parse the mutations with CDC enabled. My hunch is that the performance delta 
> from doing so wouldn't justify the complexity given the SyncSegment 
> deserialization and seeking restrictions in the compressed and encrypted 
> cases as mentioned above.
> The only complication I can think of with the above design is uncompressed 
> mmapped CommitLogSegments on Windows being undeletable, but it'd be pretty 
> simple to 

[jira] [Created] (CASSANDRA-13763) Trivial but potential security issue?

2017-08-14 Thread JC (JIRA)
JC created CASSANDRA-13763:
--

 Summary: Trivial but potential security issue? 
 Key: CASSANDRA-13763
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13763
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: JC
Priority: Trivial


Hi

In a recent GitHub mirror, I've found the following line (at line 177).
Path: tools/stress/src/org/apache/cassandra/stress/settings/SettingsMode.java

{code:java}
out.printf("  Password: %s%n", (password==null?password:"*suppressed*"));
{code}

As the original password is intended to be masked as "*suppressed*", I was 
wondering if showing "null" when the password is null is safe. This might not 
be an issue but I wanted to report it just in case. Thanks!
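
One possible way to avoid echoing {{null}} while still masking real values 
(illustrative only, not an agreed fix) would be:

{code:java}
// Hypothetical variant: never echo the raw value, and make the "not set" case explicit.
out.printf("  Password: %s%n", password == null ? "*not set*" : "*suppressed*");
{code}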



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125688#comment-16125688
 ] 

Aleksey Yeschenko commented on CASSANDRA-12884:
---

The code is correct, but you could avoid creating an extra {{ArrayList}} there 
by clearing extra elements of the collection in-place. While we are here, we can 
also simplify that slightly weird use of 
{{Iterables.getOnlyElement(validated.asMap().values())}} - in our case it's 
essentially equivalent to {{validated.values()}}. And a formatting nit: we put 
braces on new lines, always.

Pushed a tiny commit on top with these suggestions addressed 
[here|https://github.com/iamaleksey/cassandra/commits/12884-3.0].

Does it look alright to you?
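
For reference, the in-place variant being suggested would look roughly like this 
(a sketch mirroring the hunk quoted earlier in this thread, not necessarily the 
exact code on the linked branch):

{code:java}
if (validated.keySet().size() == 1)
{
    // only 1 `other` rack to select replicas from; pick two random nodes from it,
    // trimming the shuffled list in place instead of allocating a second ArrayList
    List<InetAddress> otherRack = Lists.newArrayList(validated.values());
    shuffle(otherRack);
    otherRack.subList(2, otherRack.size()).clear();
    return otherRack;
}
{code}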

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Adam Hattrell
>Assignee: Daniel Cranford
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
>  if (validated.size() - validated.get(localRack).size() >= 2)
>  {
> // we have enough endpoints in other racks
> validated.removeAll(localRack);
>   }
>  if (validated.keySet().size() == 1)
>  {
>// we have only 1 `other` rack
>Collection otherRack = 
> Iterables.getOnlyElement(validated.asMap().values());
>
> return Lists.newArrayList(Iterables.limit(otherRack, 2));
>  }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view

2017-08-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125679#comment-16125679
 ] 

Aleksey Yeschenko commented on CASSANDRA-13043:
---

bq. Can I bother you in IRC if I get stuck?

Sure.

> UnavailabeException caused by counter writes forwarded to leaders without 
> complete cluster view
> ---
>
> Key: CASSANDRA-13043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian
>Reporter: Catalin Alexandru Zamfir
>
> In version 3.9 of Cassandra, we get the following exceptions in the 
> system.log whenever booting an agent. They seem to grow in number with each 
> reboot. Any idea where they come from or what we can do about them? Note that 
> the cluster is healthy (has sufficient live nodes).
> {noformat}
> 2/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat java.lang.Thread.run(Thread.java:745) 
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)

[jira] [Created] (CASSANDRA-13762) Ensure views created during (or just before) range movements are properly built

2017-08-14 Thread Paulo Motta (JIRA)
Paulo Motta created CASSANDRA-13762:
---

 Summary: Ensure views created during (or just before) range 
movements are properly built
 Key: CASSANDRA-13762
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13762
 Project: Cassandra
  Issue Type: Bug
  Components: Materialized Views
Reporter: Paulo Motta
Assignee: Paulo Motta
Priority: Minor


CASSANDRA-13065 assumes the source node has its views built to skip running 
base mutations through the write path during range movements.

It is possible that the source node has not finished building the view, or that 
a new view is created during a range movement, in which case the view may be 
wrongly marked as built on the destination node.

The former problem was introduced by #13065, but even before that a view 
created during a range movement may not be correctly built on the destination 
node because the view builder will be triggered before it has finished 
streaming the source data, wrongly marking the view as built on that node.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13649) Uncaught exceptions in Netty pipeline

2017-08-14 Thread Norman Maurer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125612#comment-16125612
 ] 

Norman Maurer commented on CASSANDRA-13649:
---

And this only happens with the native epoll transport but not with the nio 
transport?

> Uncaught exceptions in Netty pipeline
> -
>
> Key: CASSANDRA-13649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13649
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging, Testing
>Reporter: Stefan Podkowinski
> Attachments: test_stdout.txt
>
>
> I've noticed some netty related errors in trunk in [some of the dtest 
> results|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/106/#showFailuresLink].
>  Just want to make sure that we don't have to change anything related to the 
> exception handling in our pipeline and that this isn't a netty issue. 
> Actually if this causes flakiness but is otherwise harmless, we should do 
> something about it, even if it's just on the dtest side.
> {noformat}
> WARN  [epollEventLoopGroup-2-9] 2017-06-28 17:23:49,699 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> And again in another test:
> {noformat}
> WARN  [epollEventLoopGroup-2-8] 2017-06-29 02:27:31,300 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> Edit:
> The {{io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() 
> failed}} error also causes tests to fail for 3.0 and 3.11. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges

2017-08-14 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13664:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Committed

> RangeFetchMapCalculator should not try to optimise 'trivial' ranges
> ---
>
> Key: CASSANDRA-13664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13664
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
> Attachments: Screen Shot 2017-08-14 at 14.22.23.png
>
>
> RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams 
> out of each node as even as possible.
> In a typical multi-dc ring the nodes in the dcs are set up using token + 1, 
> creating many tiny ranges. If we only try to optimise over the number of 
> streams, it is likely that the amount of data streamed out of each node is 
> unbalanced.
> We should ignore those trivial ranges and only optimise the big ones, then 
> share the tiny ones over the nodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



cassandra git commit: Only optimize large ranges when figuring out where to stream from

2017-08-14 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk 62d39f654 -> ff06424fa


Only optimize large ranges when figuring out where to stream from

Patch by marcuse; reviewed by Ariel Weisberg for CASSANDRA-13664


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ff06424f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ff06424f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ff06424f

Branch: refs/heads/trunk
Commit: ff06424faccc8acedd027c71e955a38fd8ddee6c
Parents: 62d39f6
Author: Marcus Eriksson 
Authored: Mon Jul 3 15:16:56 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Aug 14 14:32:19 2017 +0200

--
 CHANGES.txt |   1 +
 .../cassandra/dht/RangeFetchMapCalculator.java  |  79 +++-
 .../dht/RangeFetchMapCalculatorTest.java| 186 +--
 3 files changed, 208 insertions(+), 58 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ff06424f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a6428d3..a59c00b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-13664)
  * Use an ExecutorService for repair commands instead of new 
Thread(..).start() (CASSANDRA-13594)
  * Fix race / ref leak in anticompaction (CASSANDRA-13688)
  * Expose tasks queue length via JMX (CASSANDRA-12758)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ff06424f/src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java
--
diff --git a/src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java 
b/src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java
index 1186eab..d407212 100644
--- a/src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java
+++ b/src/java/org/apache/cassandra/dht/RangeFetchMapCalculator.java
@@ -18,9 +18,17 @@
 
 package org.apache.cassandra.dht;
 
+import java.math.BigInteger;
 import java.net.InetAddress;
+import java.util.ArrayList;
 import java.util.Collection;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
 
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.Multimap;
 
@@ -63,22 +71,54 @@ import org.psjava.ds.math.Function;
 public class RangeFetchMapCalculator
 {
 private static final Logger logger = 
LoggerFactory.getLogger(RangeFetchMapCalculator.class);
+private static final long TRIVIAL_RANGE_LIMIT = 1000;
 private final Multimap rangesWithSources;
 private final Collection sourceFilters;
 private final String keyspace;
 //We need two Vertices to act as source and destination in the algorithm
 private final Vertex sourceVertex = OuterVertex.getSourceVertex();
 private final Vertex destinationVertex = 
OuterVertex.getDestinationVertex();
+private final Set trivialRanges;
 
-public RangeFetchMapCalculator(Multimap 
rangesWithSources, Collection sourceFilters, 
String keyspace)
+public RangeFetchMapCalculator(Multimap 
rangesWithSources,
+   Collection 
sourceFilters,
+   String keyspace)
 {
 this.rangesWithSources = rangesWithSources;
 this.sourceFilters = sourceFilters;
 this.keyspace = keyspace;
+this.trivialRanges = rangesWithSources.keySet()
+  .stream()
+  
.filter(RangeFetchMapCalculator::isTrivial)
+  .collect(Collectors.toSet());
+}
+
+static boolean isTrivial(Range range)
+{
+IPartitioner partitioner = DatabaseDescriptor.getPartitioner();
+if (partitioner.splitter().isPresent())
+{
+BigInteger l = 
partitioner.splitter().get().valueForToken(range.left);
+BigInteger r = 
partitioner.splitter().get().valueForToken(range.right);
+if (r.compareTo(l) <= 0)
+return false;
+if 
(r.subtract(l).compareTo(BigInteger.valueOf(TRIVIAL_RANGE_LIMIT)) < 0)
+return true;
+}
+return false;
 }
 
 public Multimap getRangeFetchMap()
 {
+Multimap fetchMap = HashMultimap.create();
+fetchMap.putAll(getRangeFetchMapForNonTrivialRanges());
+

[jira] [Comment Edited] (CASSANDRA-11500) Obsolete MV entry may not be properly deleted

2017-08-14 Thread ZhaoYang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16124534#comment-16124534
 ] 

ZhaoYang edited comment on CASSANDRA-11500 at 8/14/17 12:30 PM:


Thanks for the review and feedback.

Changing the semantics of MV and revising the non-key column filtering 
feature (CASSANDRA-10368) will indeed make it easier. It's a good idea to make a 
simple non-disruptive change to stabilize basic features and wait for more 
committers to get involved.

Using an extended flag for {{Strict-Liveness}} will allow us to change to a 
future structure easily, either multiple livenessInfos or VirtualCells. 

About the {{Strict Liveness}} semantics:
* A strict row is only live iff its row-level liveness info is live, 
regardless of the liveness of its columns.

My understanding is: a view row is strict iff the view has a non-key base column as 
view pk. When it's {{Strict}}, the view's row liveness/deletion should use this 
non-key base column's timestamp as well as its ttl, unless there is a greater row 
deletion. (It's like a simplified version of "VirtualCells" which only stores 
metadata for the non-key base column in the view pk.)

For now, the semantics of MV: 
* if it's strict (non-key base column as view pk), the existence of the view row is 
determined only by its row livenessInfo
* if it's not strict, the view row is alive if there is any live selected view 
column or a live livenessInfo.

{code}
For 13127: 
   Unselected columns have no effect on the liveness of the view row, for now, till 
we are ready for the new design.
   It cannot be properly supported without disruptive changes, like 
VirtualCells or multiple livenessInfos
{code}

{code}
For 13547:
It's necessary to forbid dropping filtered columns from base columns.
The filtered column part needs to be reconsidered with 10368.
It cannot be properly supported without disruptive changes, like 
VirtualCells or multiple livenessInfos
{code}

{code}
for 13409:
As paulo suggested, generating column tombstones when receiving a partial 
update for a previously deleted row might be a non-disruptive solution if cell 
tombstone can co-exist with row deletion which has greater timestamp.
I will reopen this ticket.
{code}

PATCH for 11500: 
| [trunk|https://github.com/jasonstack/cassandra/commits/11500-poc]|
| [dtest|https://github.com/riptano/cassandra-dtest/commits/11500-poc]| 

Changes:
1. deletion is shadowable if the non-key base column in the view pk is updated or 
deleted by a partial update or partial delete. If this non-key column is removed 
by a row deletion, it's not shadowable.
2. it's strict-liveness iff there is a non-key base column in the view pk.
3. if it's not strict-liveness, the view's liveness/deletion uses the max of all 
base columns. (This wouldn't support complex unselected columns, e.g. c/d 
unselected, update c@10, delete c@11, update d@5: the view row should be alive but 
would be dead.)
4. in TableViews.java, the DeletionTracker should be applied even if one of the 
iterators has no data, e.g. a partition deletion.
5. sstabledump will include shadowable info



was (Author: jasonstack):
Thanks for reviewing and feedback.

Changing the semantic of MV and revising non-key column filtering 
feature(CASSANDRA-10368) will indeed make it easier. It's a good idea to make a 
simple non-disruptive change to stabilize basic features and wait for more 
commiters involved.

Using an extended flag for {{Strict-Liveness}} will allow us to change to 
future structure easily, either multiple livenessInfos or virtualcells. 

About the {{Strict Liveness}} semantic:
* A strict row is only live iff it's row level liveness info is live, 
regardless of the liveness of its columns.

My understanding is: view row is strict iff the view has non-key base row as 
view pk. When it's {{Strict}}, the view's row liveness/deletion should use this 
non-key base column's timestamp as well as ttl, unless there is a greater row 
deletion.(It's like a simplified version of "VirtualCells" which only store 
metadata for non-key base column in view pk)

For now, the semantic of MV: 
* if it's strict(non-key base row as view pk), the existence of view row is 
only with its row livenessInfo
* if it's not-strict, view row is alive if there is any live selected view 
columns or live livenessInfo.

{code}
For 13127: 
   Unselected columns has no effect on liveness of view row, for now, till we 
are ready for new design.
   It cannot be properly supported without disruptive changes, like 
VirtualCells or multiple livenessInfos
{code}

{code}
For 13547:
It's necessary to forbid dropping filtered columns from base columns.
The filtered column part needs to be reconsidered with 10368.
It cannot be properly supported without disruptive changes, like 
VirtualCells or multiple livenessInfos
{code}

{code}
for 13409:
As paulo suggested, generating column tombstones when receiving a partial 
update for a previously deleted 

[jira] [Commented] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges

2017-08-14 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125602#comment-16125602
 ] 

Marcus Eriksson commented on CASSANDRA-13664:
-

[^Screen Shot 2017-08-14 at 14.22.23.png] - looks like only flaky failures

> RangeFetchMapCalculator should not try to optimise 'trivial' ranges
> ---
>
> Key: CASSANDRA-13664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13664
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
> Attachments: Screen Shot 2017-08-14 at 14.22.23.png
>
>
> RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams 
> out of each node as even as possible.
> In a typical multi-dc ring the nodes in the dcs are set up using token + 1, 
> creating many tiny ranges. If we only try to optimise over the number of 
> streams, it is likely that the amount of data streamed out of each node is 
> unbalanced.
> We should ignore those trivial ranges and only optimise the big ones, then 
> share the tiny ones over the nodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges

2017-08-14 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13664:

Attachment: Screen Shot 2017-08-14 at 14.22.23.png

> RangeFetchMapCalculator should not try to optimise 'trivial' ranges
> ---
>
> Key: CASSANDRA-13664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13664
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
> Attachments: Screen Shot 2017-08-14 at 14.22.23.png
>
>
> RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams 
> out of each node as even as possible.
> In a typical multi-dc ring the nodes in the dcs are set up using token + 1, 
> creating many tiny ranges. If we only try to optimise over the number of 
> streams, it is likely that the amount of data streamed out of each node is 
> unbalanced.
> We should ignore those trivial ranges and only optimise the big ones, then 
> share the tiny ones over the nodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13425) nodetool refresh should try to insert new sstables in existing leveling

2017-08-14 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13425:

Status: Open  (was: Patch Available)

> nodetool refresh should try to insert new sstables in existing leveling
> ---
>
> Key: CASSANDRA-13425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13425
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> Currently {{nodetool refresh}} sets level to 0 on all new sstables; instead 
> we could try to find gaps in the existing leveling and insert them there.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-14 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13594:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Committed, thanks


> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.0
>
> Attachments: 13594.png
>
>
> Currently, when starting a new repair, we create a new Thread and start it 
> immediately.
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> starting new repair commands if we are already running too many (see the sketch below).
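
A minimal, self-contained sketch of the idea using plain JDK classes (not the committed 
patch, which follows in the next message and goes through JMXEnabledThreadPoolExecutor 
plus new yaml options): a fixed-size pool whose work queue decides whether extra repair 
commands wait or are rejected.

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RepairCommandPoolSketch
{
    static ThreadPoolExecutor newRepairCommandExecutor(int poolSize, boolean rejectWhenFull)
    {
        // With a SynchronousQueue, a submission while all poolSize threads are busy is handed
        // to AbortPolicy and fails with RejectedExecutionException; with an unbounded queue
        // it simply waits for a free thread instead.
        BlockingQueue<Runnable> queue = rejectWhenFull
                                      ? new SynchronousQueue<Runnable>()
                                      : new LinkedBlockingQueue<Runnable>();
        return new ThreadPoolExecutor(poolSize, poolSize, 1, TimeUnit.HOURS, queue,
                                      new ThreadPoolExecutor.AbortPolicy());
    }

    public static void main(String[] args)
    {
        ThreadPoolExecutor repairCommands = newRepairCommandExecutor(4, true);
        repairCommands.submit(() -> System.out.println("repair command running"));
        repairCommands.shutdown();
    }
}
{code}

In the rejecting configuration the caller gets a RejectedExecutionException, which could 
be surfaced to nodetool instead of silently piling up repair threads.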



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



cassandra git commit: Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-14 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/trunk e9cc805db -> 62d39f654


Use an ExecutorService for repair commands instead of new Thread(..).start()

Patch by marcuse; reviewed by Ariel Weisberg for CASSANDRA-13594


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/62d39f65
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/62d39f65
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/62d39f65

Branch: refs/heads/trunk
Commit: 62d39f6544e3fbcbc268aecbb3a46950dcba2bf0
Parents: e9cc805
Author: Marcus Eriksson 
Authored: Thu Jun 8 13:34:18 2017 +0200
Committer: Marcus Eriksson 
Committed: Mon Aug 14 14:12:34 2017 +0200

--
 CHANGES.txt |  1 +
 .../JMXEnabledThreadPoolExecutor.java   | 14 
 .../org/apache/cassandra/config/Config.java |  9 
 .../cassandra/config/DatabaseDescriptor.java| 10 
 .../cassandra/service/ActiveRepairService.java  | 24 
 .../cassandra/service/StorageService.java   | 21 ++---
 6 files changed, 76 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/62d39f65/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7c9d79a..a6428d3 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Use an ExecutorService for repair commands instead of new Thread(..).start() (CASSANDRA-13594)
  * Fix race / ref leak in anticompaction (CASSANDRA-13688)
  * Expose tasks queue length via JMX (CASSANDRA-12758)
  * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62d39f65/src/java/org/apache/cassandra/concurrent/JMXEnabledThreadPoolExecutor.java
--
diff --git 
a/src/java/org/apache/cassandra/concurrent/JMXEnabledThreadPoolExecutor.java 
b/src/java/org/apache/cassandra/concurrent/JMXEnabledThreadPoolExecutor.java
index a7a54f2..2dafb4f 100644
--- a/src/java/org/apache/cassandra/concurrent/JMXEnabledThreadPoolExecutor.java
+++ b/src/java/org/apache/cassandra/concurrent/JMXEnabledThreadPoolExecutor.java
@@ -21,6 +21,7 @@ import java.lang.management.ManagementFactory;
 import java.util.List;
 import java.util.concurrent.BlockingQueue;
 import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.RejectedExecutionHandler;
 import java.util.concurrent.TimeUnit;
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
@@ -93,6 +94,19 @@ public class JMXEnabledThreadPoolExecutor extends 
DebuggableThreadPoolExecutor i
 }
 }
 
+public JMXEnabledThreadPoolExecutor(int corePoolSize,
+int maxPoolSize,
+long keepAliveTime,
+TimeUnit unit,
+BlockingQueue<Runnable> workQueue,
+NamedThreadFactory threadFactory,
+String jmxPath,
+RejectedExecutionHandler rejectedExecutionHandler)
+{
+this(corePoolSize, maxPoolSize, keepAliveTime, unit, workQueue, threadFactory, jmxPath);
+setRejectedExecutionHandler(rejectedExecutionHandler);
+}
+
 public JMXEnabledThreadPoolExecutor(Stage stage)
 {
 this(stage.getJmxName(), stage.getJmxType());

http://git-wip-us.apache.org/repos/asf/cassandra/blob/62d39f65/src/java/org/apache/cassandra/config/Config.java
--
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 22f3551..5a45282 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -348,6 +348,9 @@ public class Config
 public volatile boolean back_pressure_enabled = false;
 public volatile ParameterizedClass back_pressure_strategy;
 
+public RepairCommandPoolFullStrategy repair_command_pool_full_strategy = RepairCommandPoolFullStrategy.queue;
+public int repair_command_pool_size = concurrent_validations;
+
 /**
  * @deprecated migrate to {@link DatabaseDescriptor#isClientInitialized()}
  */
@@ -425,6 +428,12 @@ public class Config
 spinning
 }
 
+public enum RepairCommandPoolFullStrategy
+{
+queue,
+reject
+}
+
private static final List<String> SENSITIVE_KEYS = new ArrayList<String>() 
{{
 add("client_encryption_options");
 add("server_encryption_options");


[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-14 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125586#comment-16125586
 ] 

Marcus Eriksson commented on CASSANDRA-10726:
-

[~xiaolong...@gmail.com] seems we are getting a CME:
{code}
java.util.ConcurrentModificationException: null
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) 
~[na:1.8.0_149-apple]
at java.util.HashMap$KeyIterator.next(HashMap.java:1461) 
~[na:1.8.0_149-apple]
at 
org.apache.cassandra.service.DataResolver$RepairMergeListener.awaitRepairResponses(DataResolver.java:298)
 ~[main/:na]
at 
org.apache.cassandra.service.DataResolver$RepairMergeListener.waitRepairToFinishWithPossibleRetry(DataResolver.java:223)
 ~[main/:na]
at 
org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:175)
 ~[main/:na]
at 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175)
 ~[main/:na]
at 
org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) 
~[main/:na]
at 
org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:103)
 ~[main/:na]
at 
org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:232)
 ~[main/:na]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_149-apple]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[na:1.8.0_149-apple]
at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
 ~[main/:na]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_149-apple]
{code}

in the 
{{materialized_views_test.py:TestMaterializedViewsConsistency.multi_partition_consistent_reads_after_write_test}}
 dtest

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out-of-date replicas is blocking. This means that if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get further out of 
> sync, so they will require more read repair.
> The comment in the code explaining why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking, or we should return success for the read even if the write times 
> out.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view

2017-08-14 Thread Stefano Ortolani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125561#comment-16125561
 ] 

Stefano Ortolani commented on CASSANDRA-13043:
--

Hi [~iamaleksey], thanks for the explanation! Sure, I will give it a go. I might 
need some assistance or have questions along the way, though. Can I bother you on 
IRC if I get stuck?

> UnavailabeException caused by counter writes forwarded to leaders without 
> complete cluster view
> ---
>
> Key: CASSANDRA-13043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian
>Reporter: Catalin Alexandru Zamfir
>
> In version 3.9 of Cassandra, we get the following exceptions in the 
> system.log whenever booting an agent. They seem to grow in number with each 
> reboot. Any idea where they come from or what we can do about them? Note that 
> the cluster is healthy (has sufficient live nodes).
> {noformat}
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat java.lang.Thread.run(Thread.java:745) 
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread 
> Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: 
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PMat 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PMat 
> 

[jira] [Commented] (CASSANDRA-13761) truncatehints can't delete all hints

2017-08-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125530#comment-16125530
 ] 

Aleksey Yeschenko commented on CASSANDRA-13761:
---

It's likely that you had some pending hints in the buffer for B when you issued 
{{nodetool truncatehints}}. So a new hint file was written after you truncated 
everything (that was written at the time).

> truncatehints can't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Minor
>
> Step 1:
> Execute nodetool truncatehints on node A; it does not print any log. When the 
> down node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> Step 2:
> I set max_hints_file_size_in_mb=1 in cassandra.yaml and insert data into the 
> cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> So truncatehints can't delete all hints; it always leaves one hint file undeleted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13761) truncatehints can't delete all hints

2017-08-14 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13761:
--
Priority: Minor  (was: Blocker)

> truncatehints can't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Minor
>
> Step 1:
> Execute nodetool truncatehints on node A; it does not print any log. When the 
> down node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> Step 2:
> I set max_hints_file_size_in_mb=1 in cassandra.yaml and insert data into the 
> cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> So truncatehints can't delete all hints; it always leaves one hint file undeleted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13761) truncatehints can't delete all hints

2017-08-14 Thread huyx (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125513#comment-16125513
 ] 

huyx commented on CASSANDRA-13761:
--

Yes: stop the daemon on B, write data, stop writing data, truncate the hints on A, 
then restart B.

> truncatehints can't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Blocker
>
> Step 1:
> Execute nodetool truncatehints on node A; it does not print any log. When the 
> down node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> Step 2:
> I set max_hints_file_size_in_mb=1 in cassandra.yaml and insert data into the 
> cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> So truncatehints can't delete all hints; it always leaves one hint file undeleted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-13260) Add UDT support to Cassandra stress

2017-08-14 Thread Aleksandr Sorokoumov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Sorokoumov reassigned CASSANDRA-13260:


Assignee: (was: Aleksandr Sorokoumov)

> Add UDT support to Cassandra stress
> ---
>
> Key: CASSANDRA-13260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13260
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jeremy Hanna
>  Labels: lhf, stress
>
> Splitting out UDT support in cassandra stress from CASSANDRA-9556.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13761) truncatehints can't delete all hints

2017-08-14 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125310#comment-16125310
 ] 

Kurt Greaves commented on CASSANDRA-13761:
--

Do you stop writing data prior to truncating the hints? 

> truncatehints can't delete all hints
> --
>
> Key: CASSANDRA-13761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13761
> Project: Cassandra
>  Issue Type: Bug
> Environment: 3.0.14
> java version "1.8.0_131"
>Reporter: huyx
>Priority: Blocker
>
> Step 1:
> Execute nodetool truncatehints on node A; it does not print any log. When the 
> down node B is restarted,
> A prints:
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,593 HintsStore.java:126 - 
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints
> INFO  [HintsDispatcher:1] 2017-08-10 18:27:01,595 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502360551290-1.hints to endpoint 
> /10.71.0.14,
> and B's data is repaired.
> Step 2:
> I set max_hints_file_size_in_mb=1 in cassandra.yaml and insert data into the 
> cluster.
> Execute nodetool truncatehints on node A; A prints:
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,164 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443243250-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,165 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443273261-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,166 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443293262-1.hints
> INFO  [RMI TCP Connection(20)-10.71.0.12] 2017-08-11 17:22:51,167 
> HintsStore.java:126 - Deleted hint file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-1502443313267-1.hints
> When the down node B is restarted, A prints:
> Deleted hint file 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints
> INFO  [HintsDispatcher:7] 2017-08-11 17:25:14,626 
> HintsDispatchExecutor.java:272 - Finished hinted handoff of file 
> 4da2fd65-a4fe-4c0a-bf95-f818431c31bb-150244269-1.hints to endpoint 
> /10.71.0.14: 4da2fd65-a4fe-4c0a-bf95-f818431c31bb
> So truncatehints can't delete all hints; it always leaves one hint file undeleted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org