Whenever I see timeouts, I immediately blame firewalls. Have you
triple-checked them?
Is this only happening to a subset of clients?
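
One quick way to rule the network in or out is to run a plain TCP check from a client that sees the timeouts and from one that does not. The following is only a rough sketch: the helper names `can_reach` and `unreachable_nodes` are made up for illustration, and it assumes the nodes listen on the default native transport port 9042, which your cluster may have changed.

```python
import socket


def can_reach(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Port 9042 is assumed here (the default native transport port);
    pass a different port if the cluster is configured otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def unreachable_nodes(hosts, port=9042, timeout=3.0):
    """Return the subset of `hosts` that could not be reached over TCP."""
    return [h for h in hosts if not can_reach(h, port, timeout)]
```

If `unreachable_nodes([...])` comes back non-empty only on the affected clients, that points at something between those clients and the cluster rather than at Cassandra itself. A successful TCP connect does not prove the driver handshake will succeed, so treat this as a first filter only.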

Also, 3.0.6 is pretty dated and has many bugs. You should definitely
upgrade to the latest 3.0 release (don't forget to read NEWS.txt).

On 14 Dec. 2017 19:18, "Max Campos" <mc_cassan...@core43.com> wrote:

Hi -

We’re finally putting our new application under load, and we’re starting to
get this error message from the Python driver when under heavy load:

('Unable to connect to any servers', {'x.y.z.205':
OperationTimedOut('errors=None, last_host=None',), 'x.y.z.204':
OperationTimedOut('errors=None, last_host=None',), 'x.y.z.206':
OperationTimedOut('errors=None, last_host=None',)})' (22.7s)

Our cluster is running 3.0.6, has 3 nodes and we use RF=3, CL=QUORUM
reads/writes.  We have a few thousand machines which are each making 1-10
connections to C* at once, but each of these connections only reads/writes
a few records, waits several minutes, and then writes a few records — so
while netstat reports ~5K connections per node, they’re generally idle.
Peak read/sec today was ~1500 per node, peak writes/sec was ~300 per node.
Read/write latencies peaked at 2.5ms.

Some questions:
1) Is anyone else out there making this many simultaneous connections?  Any
idea what a reasonable number of connections is, what is too many, etc?

2) Any thoughts on which JMX metrics I should look at to better understand
what exactly is exploding?  Is there a “number of active connections”
metric?  We currently look at:
- client reads/writes per sec
- read/write latency
- compaction tasks
- repair tasks
- disk used by node
- disk used by table
- avg partition size per table

3) Any other advice?

I think I’ll try doing an explicit disconnect during the waiting period of
our application’s execution; so as to get the C* connection count down.
Hopefully that will solve the timeout problem.
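
A minimal sketch of what that explicit disconnect might look like with the Python driver, assuming each burst of work can run against a short-lived session. The helper name `short_lived_session` and the `cluster_factory` parameter are hypothetical and exist only for this illustration; `Cluster`, `connect()` and `shutdown()` are the driver's own calls.

```python
from contextlib import contextmanager


def _default_cluster_factory(contact_points):
    # Imported lazily so the helper can be exercised without the driver installed.
    from cassandra.cluster import Cluster
    return Cluster(contact_points)


@contextmanager
def short_lived_session(contact_points, keyspace=None,
                        cluster_factory=_default_cluster_factory):
    """Yield a session, then close every connection when the block exits.

    `cluster_factory` is a hypothetical hook for this sketch; by default
    it builds a regular cassandra.cluster.Cluster.
    """
    cluster = cluster_factory(contact_points)
    try:
        yield cluster.connect(keyspace)
    finally:
        # shutdown() closes the cluster's sessions and connections, so
        # nothing is held open during the idle period.
        cluster.shutdown()
```

The application would then wrap each burst in `with short_lived_session([...], "my_keyspace") as session:` (the keyspace name is a placeholder) and do its waiting outside the block. The trade-off is that every reconnect repeats the connection setup, so if thousands of machines reconnect at the same moment that could itself add load; staggering the reconnects may be worth considering.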

Thanks for your help.

- Max