[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2021-10-08 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426393#comment-17426393
 ] 

Jon Meredith commented on CASSANDRA-15318:
--

The original motivation for the patch was to mitigate the impact of an outage 
for write queries with CL.EACH_QUORUM where the first node in the forwarding 
list was considered UP but was in a state where it did not forward traffic to 
the others. Shuffling with equal probability means 2/3 of the traffic gets 
through instead of failing. Sorting or weighting by proximity might help 
performance but would probably negate some of the resilience this was intended 
to provide.
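
The trade-off above can be illustrated with a minimal sketch. It is not the 
actual Cassandra code: the class, the method names and the endpoint addresses 
are hypothetical, and it assumes three candidate forwarders in the remote DC, 
which is what the 2/3 figure suggests.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: names are illustrative and are not Cassandra's real API.
final class ForwarderSelectionSketch
{
    // Behaviour the ticket describes as the problem: always use the first endpoint.
    static String pickFirst(List<String> remoteDcEndpoints)
    {
        return remoteDcEndpoints.get(0);
    }

    // Behaviour the ticket asks for: every endpoint is equally likely to be chosen.
    static String pickRandom(List<String> remoteDcEndpoints, Random random)
    {
        return remoteDcEndpoints.get(random.nextInt(remoteDcEndpoints.size()));
    }

    // Simulates `writes` forwarded writes where `badForwarder` accepts messages
    // but never forwards them, and returns the fraction that still get through.
    static double deliveredFraction(List<String> endpoints, String badForwarder, int writes, Random random)
    {
        int delivered = 0;
        for (int i = 0; i < writes; i++)
        {
            if (!pickRandom(endpoints, random).equals(badForwarder))
                delivered++;
        }
        return (double) delivered / writes;
    }

    public static void main(String[] args)
    {
        // Assumed addresses, for illustration only.
        List<String> endpoints = Arrays.asList("10.1.0.1", "10.1.0.2", "10.1.0.3");
        // With first-node selection every write goes to the bad forwarder.
        System.out.println("always first: " + pickFirst(endpoints));
        // With uniform selection roughly two thirds of writes avoid it.
        System.out.println("delivered fraction: "
                           + deliveredFraction(endpoints, "10.1.0.1", 30_000, new Random(42)));
    }
}
```

Sorting by proximity would bring back a deterministic first choice, so a 
single misbehaving but "closest" node could again receive all forwarded writes.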

> sendMessagesToNonlocalDC() should shuffle targets
> -
>
> Key: CASSANDRA-15318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-alpha3, 4.0
>
>
> To better spread load and reduce the impact of a node failure before 
> detection (or other issues, such as issues with host replacement), when 
> forwarding messages to other data centers the non-local DC forwarding node 
> should be selected at random rather than always selecting the first node in 
> the list of endpoints for a token.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2021-10-08 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426367#comment-17426367
 ] 

Paulo Motta commented on CASSANDRA-15318:
-

[~jmeredithco] Hi Jon, I just came across this issue and was wondering whether 
you think it would be worth sorting nodes by proximity when the dynamic snitch 
is enabled, to avoid stragglers when picking non-local forwarders?




[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976809#comment-16976809
 ] 

Dinesh Joshi commented on CASSANDRA-15318:
--

+1, but before merging let's ensure that the test failures are unrelated.




[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-11-18 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976802#comment-16976802
 ] 

Jon Meredith commented on CASSANDRA-15318:
--

Rebased and rerunning to double-check some (thought to be) unrelated unit test 
failures.

[CircleCI|https://circleci.com/workflow-run/c6247670-e965-4260-9632-5bd3deb9ad06]






[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-10-08 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947206#comment-16947206
 ] 

Jon Meredith commented on CASSANDRA-15318:
--

Created a new branch against trunk now that CASSANDRA-15319 is merged.

[Branch|https://github.com/jonmeredith/cassandra/tree/CASSANDRA-15318] 
[GitHub PR|https://github.com/apache/cassandra/pull/363]
[CircleCI|https://circleci.com/workflow-run/cbd7e18d-e452-4a8c-aa84-29f1285d38ba]




[jira] [Commented] (CASSANDRA-15318) sendMessagesToNonlocalDC() should shuffle targets

2019-09-09 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925876#comment-16925876
 ] 

Jon Meredith commented on CASSANDRA-15318:
--

Linked is a patch to shuffle the node used for forwarding messages to remote 
DCs. The MessageForwardingTest has been updated to verify that the remote node 
is picked as the forwarder at least once (it doesn't check fairness; I didn't 
want to risk any flaky tests due to randomness).
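
A check of this kind can be sketched as follows. This is not the code from 
MessageForwardingTest: the class and method names are hypothetical, and 
reading "picked at least once" as "every candidate shows up within a bounded 
number of attempts" is an assumption. It deliberately says nothing about how 
evenly the picks are distributed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: not the real MessageForwardingTest.
final class ForwarderCoverageSketch
{
    // Picks a forwarder uniformly at random up to `attempts` times and records
    // which endpoints were chosen, stopping early once all have been seen.
    static Set<String> forwardersSeen(List<String> endpoints, int attempts, Random random)
    {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < attempts && seen.size() < endpoints.size(); i++)
            seen.add(endpoints.get(random.nextInt(endpoints.size())));
        return seen;
    }

    public static void main(String[] args)
    {
        // Assumed node names, for illustration only.
        List<String> endpoints = Arrays.asList("node2", "node3", "node4");
        Set<String> seen = forwardersSeen(endpoints, 1_000, new Random());
        // Coverage only: no assertion on how often each node was picked.
        if (!seen.containsAll(endpoints))
            throw new AssertionError("some endpoint was never picked as forwarder: " + seen);
        System.out.println("every endpoint acted as forwarder at least once");
    }
}
```

With three candidates and 1,000 attempts, the chance that one of them is never 
picked is negligible, which keeps the check stable without asserting fairness.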

trunk [changes | https://github.com/jonmeredith/cassandra/pull/1 ] [trunk 
CircleCI | 
https://circleci.com/workflow-run/c418-f747-4665-be7a-7a97ffcc]

(The PR is against my local CASSANDRA-15319 branch to make the diff clear, can 
retarget/reopen as needed)
