[
https://issues.apache.org/jira/browse/CASSANDRA-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640146#comment-17640146
]
Israel Fruchter commented on CASSANDRA-18025:
---------------------------------------------
[~smiklosovic] the idea is that we use cassandra-stress in complex test
cases, in which at any given time one of the nodes might be affected by a
ChaosMonkey-style experiment/fault. In those situations we might start
cassandra-stress with a list of nodes, one of which is down, and that's part
of what we are testing.
Other stress tools we use don't suffer from this issue, since they pass all
contact points down to their driver code.
The whole idea is to let users run the stress tool on clusters that are not
fully functioning as well. As long as you are doing CQL commands at QUORUM,
for example, you have a guarantee that it should work; isn't that what
Cassandra advertises? :)
Anyhow, randomly picking one of the nodes doesn't guarantee that all of them
are working, just that the one that was selected is up.
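The difference between the two strategies can be sketched roughly as follows.
This is an illustrative simulation only, not cassandra-stress or driver code:
the function names, the `is_up` callback and the IP addresses are all
hypothetical.

```python
import random


def connect_single_random(nodes, is_up, rng=random):
    """Pick one node at random and fail outright if that node is down."""
    chosen = rng.choice(nodes)
    if not is_up(chosen):
        raise ConnectionError(f"contact point {chosen} is unreachable")
    return chosen


def connect_all_contact_points(nodes, is_up):
    """Try every contact point in order; succeed as soon as one is up."""
    for node in nodes:
        if is_up(node):
            return node
    raise ConnectionError("no contact point is reachable")


# Hypothetical three-node cluster with one node taken down by a fault.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
down = {"10.0.0.2"}


def is_up(node):
    return node not in down
```

With one node out of three down, the first function fails whenever the random
pick lands on the downed node, while the second still connects as long as any
listed node is reachable.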
[~brandon.williams] the driver is aware that nodes go down, but I don't think
cassandra-stress is listening to those events (though it's visible if you
turn on some of the driver debug levels).
And again, all production code will be using multiple contact points (for
exactly the reason I've described). I think tooling and tests should act as
close to production behavior as possible.
> cassandra-stress: not all contact point are passed down to driver
> -----------------------------------------------------------------
>
> Key: CASSANDRA-18025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18025
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/stress
> Reporter: Israel Fruchter
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 4.2
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> It seems that c-s randomly selects a node from the nodes passed to it on
> the command line, and uses that node as the contact point for the driver.
>
> When using c-s together with other management operations (for example,
> expanding/shrinking the cluster), we can get into a situation where some of
> the nodes mentioned on the command line aren't reachable/available. Instead
> of applying the best practice of having multiple contact points, c-s passes
> down only one, which can be unavailable, and fails completely without trying
> any of the other nodes mentioned on the command line.
> We just fixed that in our fork of cassandra-stress:
> [https://github.com/scylladb/scylla-tools-java/pull/314]
>