Hi Mike,

What about the output of tpstats? I imagine you have dropped messages there. Any blocked threads? Could you paste this output here?
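If you want a quick look, something like this should show both the blocked threads and the dropped messages (just a sketch, the exact column layout varies a bit between Cassandra versions):

# Per-pool stats: watch the "Blocked" and "All time blocked" columns
nodetool tpstats

# The dropped message counters are printed at the end of the output
# ("Message type" / "Dropped"), e.g. for MUTATION and REQUEST_RESPONSE
nodetool tpstats | grep -A 15 'Message type'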
Could this be due to some network hiccup when accessing the disks, since they are EBS? Can you think of any way of checking this?

Do you see a lot of GC activity in the logs, and how long are the pauses? (Use something like: grep -i 'GCInspector' /var/log/cassandra/system.log)

Something else you could check is the local write stats, to see whether only one table is affected or whether this is keyspace / cluster wide. You can use the metrics exposed by Cassandra, or if you have no dashboards I believe a 'nodetool cfstats <myks> | grep -e 'Table:' -e 'Local'' should give you a rough idea of local latencies. (I have put a rough sketch of these commands at the bottom of this mail.)

Those are just things I would check; I have no clue what is happening here, but I hope this helps.

C*heers,
-----------------
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-02-18 5:13 GMT+01:00 Mike Heffner <m...@librato.com>:

> Jaydeep,
>
> No, we don't use any light weight transactions.
>
> Mike
>
> On Wed, Feb 17, 2016 at 6:44 PM, Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>
>> Are you guys using light weight transactions in your write path?
>>
>> On Thu, Feb 11, 2016 at 12:36 AM, Fabrice Facorat <fabrice.faco...@gmail.com> wrote:
>>
>>> Are your commitlog and data on the same disk? If yes, you should put commitlogs on a separate disk which doesn't have a lot of IO.
>>>
>>> Other IO may have a great impact on your commitlog writing and it may even block.
>>>
>>> An example of the impact IO may have, even for async writes:
>>> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
>>>
>>> 2016-02-11 0:31 GMT+01:00 Mike Heffner <m...@librato.com>:
>>> > Jeff,
>>> >
>>> > We have both commitlog and data on a 4TB EBS with 10k IOPS.
>>> >
>>> > Mike
>>> >
>>> > On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>> >>
>>> >> What disk size are you using?
>>> >>
>>> >> From: Mike Heffner
>>> >> Reply-To: "user@cassandra.apache.org"
>>> >> Date: Wednesday, February 10, 2016 at 2:24 PM
>>> >> To: "user@cassandra.apache.org"
>>> >> Cc: Peter Norton
>>> >> Subject: Re: Debugging write timeouts on Cassandra 2.2.5
>>> >>
>>> >> Paulo,
>>> >>
>>> >> Thanks for the suggestion, we ran some tests against CMS and saw the same timeouts. On that note though, we are going to try doubling the instance sizes and testing with double the heap (even though current usage is low).
>>> >>
>>> >> Mike
>>> >>
>>> >> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta <pauloricard...@gmail.com> wrote:
>>> >>>
>>> >>> Are you using the same GC settings as the staging 2.0 cluster? If not, could you try using the default GC settings (CMS) and see if that changes anything? This is just a wild guess, but there were reports before of G1-caused instabilities with small heap sizes (< 16GB - see CASSANDRA-10403 for more context). Please ignore if you already tried reverting back to CMS.
>>> >>>
>>> >>> 2016-02-10 16:51 GMT-03:00 Mike Heffner <m...@librato.com>:
>>> >>>>
>>> >>>> Hi all,
>>> >>>>
>>> >>>> We've recently embarked on a project to update our Cassandra infrastructure running on EC2. We are long time users of 2.0.x and are testing out a move to version 2.2.5 running on VPC with EBS. Our test setup is a 3 node, RF=3 cluster supporting a small write load (mirror of our staging load).
>>> >>>>
>>> >>>> We are writing at QUORUM and, while p95s look good compared to our staging 2.0.x cluster, we are seeing frequent write operations that time out at the max write_request_timeout_in_ms (10 seconds). CPU across the cluster is < 10% and EBS write load is < 100 IOPS. Cassandra is running with the Oracle JDK 8u60, we're using G1GC, and any GC pauses are less than 500ms.
>>> >>>>
>>> >>>> We run on c4.2xl instances with GP2 EBS attached storage for the data and commitlog directories. The nodes are using EC2 enhanced networking and have the latest Intel network driver module. We are running on HVM instances using Ubuntu 14.04.2.
>>> >>>>
>>> >>>> Our schema is 5 tables, all with COMPACT STORAGE. Each table is similar to the definition here: https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a
>>> >>>>
>>> >>>> This is our cassandra.yaml: https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml
>>> >>>>
>>> >>>> Like I mentioned, we use 8u60 with G1GC and have used many of the GC settings from Al Tobey's tuning guide. This is our upstart config with JVM and other CPU settings: https://gist.github.com/mheffner/dc44613620b25c4fa46d
>>> >>>>
>>> >>>> We've used several of the sysctl settings from Al's guide as well: https://gist.github.com/mheffner/ea40d58f58a517028152
>>> >>>>
>>> >>>> Our client application can write either Thrift batches using the Astyanax driver or async CQL INSERTs using the Datastax Java driver.
>>> >>>>
>>> >>>> For testing against Thrift (our legacy infra uses this) we write batches of anywhere from 6 to 1500 rows at a time. Our p99 for batch execution is around 45ms, but our maximum (p100) sits below 150ms except when it periodically spikes to the full 10 seconds.
>>> >>>>
>>> >>>> Testing the same write path using CQL writes instead demonstrates similar behavior: low p99s except for periodic full timeouts. We enabled tracing for several operations but were unable to get a trace that completed successfully -- Cassandra started logging many messages like:
>>> >>>>
>>> >>>> INFO [ScheduledTasks:1] - MessagingService.java:946 - _TRACE messages were dropped in last 5000 ms: 52499 for internal timeout and 0 for cross node timeout
>>> >>>>
>>> >>>> And all the traces contained rows with a "null" source_elapsed row: https://gist.githubusercontent.com/mheffner/1d68a70449bd6688a010/raw/0327d7d3d94c3a93af02b64212e3b7e7d8f2911b/trace.out
>>> >>>>
>>> >>>> We've exhausted as many configuration permutations as we can think of. This cluster does not appear to be under any significant load, and latencies seem to largely fall into two bands: low normal or max timeout. This seems to imply that something is getting stuck and timing out at the max write timeout.
>>> >>>>
>>> >>>> Any suggestions on what to look for? We had debug enabled for a while but we didn't see any message that pointed to something obvious. Happy to provide any more information that may help.
>>> >>>>
>>> >>>> We are pretty much at the point of sprinkling debug around the code to track down what could be blocking.
>>> >>>>
>>> >>>> Thanks,
>>> >>>>
>>> >>>> Mike
>>> >>>>
>>> >>>> --
>>> >>>> Mike Heffner <m...@librato.com>
>>> >>>> Librato, Inc.
>>> >>
>>> >> --
>>> >> Mike Heffner <m...@librato.com>
>>> >> Librato, Inc.
>>> >
>>> > --
>>> > Mike Heffner <m...@librato.com>
>>> > Librato, Inc.
>>>
>>> --
>>> Close the World, Open the Net
>>> http://www.linux-wizard.net
>
> --
> Mike Heffner <m...@librato.com>
> Librato, Inc.
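PS: for concreteness, the checks suggested above might look roughly like this. It is only a sketch: '<myks>' and '<mytable>' are placeholders for your keyspace and table names, output formats differ between Cassandra versions, and the histogram / iostat / tracing commands are extra checks beyond the ones mentioned earlier in this mail.

# GC pauses reported by Cassandra
grep -i 'GCInspector' /var/log/cassandra/system.log

# Rough per-table local read/write latencies for one keyspace
nodetool cfstats <myks> | grep -e 'Table:' -e 'Local'

# Latency percentiles per table and as seen by the coordinator
nodetool cfhistograms <myks> <mytable>
nodetool proxyhistograms

# Look for IO stalls on the EBS volumes (await / %util spikes)
iostat -x 5

# Trace a small sample of all requests cluster-wide instead of single queries
nodetool settraceprobability 0.001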