Maybe you should try to lower your read repair probability?
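For example, something along these lines via the Java driver, if that turns out to be the issue. This is only a rough, untested sketch assuming you want to lower the table-level read_repair_chance / dclocal_read_repair_chance options; the keyspace and table names are placeholders:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class LowerReadRepairChance {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect();
            // Lowering read_repair_chance reduces how often a read triggers an
            // extra digest comparison against additional replicas.
            // my_ks.my_table is a placeholder; point it at your hot table.
            session.execute("ALTER TABLE my_ks.my_table "
                    + "WITH read_repair_chance = 0.01 "
                    + "AND dclocal_read_repair_chance = 0.01");
            cluster.close();
        }
    }

(The same ALTER TABLE can of course be run from cqlsh; the driver is only used here because that's what the rest of the thread is about.)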
— Sent from Mailbox

On Sat, Nov 15, 2014 at 9:40 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:
> Well, we are able to do the tracing under normal load, but we have not yet been
> able to turn on tracing on demand during heavy load from the client side (the
> traffic pattern is hard to predict).
>
> Under normal load, the trace shows that most of the query time (for the one
> particular row we focus on) is spent between:
>   merging data from memtables and (2-3) sstables
>   Read 10xx live cells and 2x tombstone cells
>
> Our CQL basically pulls out one row that has about 1000 columns (approx. 800k
> of data). The table is already on leveled compaction.
>
> But once we get a series of the exact same CQL (against the same row), the
> response time degrades dramatically, from the normal 300-500ms to 1 sec or
> even 4 sec.
>
> The rest of the system seems to remain fine, with no obvious latency spike in
> reads/writes within the same keyspace or other keyspaces.
>
> So I wonder what is causing the sudden increase in latency for the exact same
> CQL? What did we saturate? If we had saturated disk IO, the other tables
> would show a similar effect, but we didn't see that.
>
> Is there any table-specific factor that may contribute to the slowness?
>
> thanks
>
> On Mon, Nov 10, 2014 at 7:21 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>> As Jonathan said, it's better to activate query tracing client side. It'll
>> give you better flexibility over when to turn tracing on and off, and on
>> which table. Server-side tracing is global (all tables) and probabilistic,
>> so it may not give a satisfactory level of debugging.
>>
>> Programmatically it's pretty simple to achieve, and coupled with a good
>> logging framework (Logback for Java) you'll even have dynamic logging in
>> production without having to redeploy client code. I implemented it in
>> Achilles very easily by wrapping the Regular/Bound/Simple statements of the
>> Java driver and displaying the bound values at runtime:
>> https://github.com/doanduyhai/Achilles/wiki/Statements-Logging-and-Tracing#dynamic-statements-logging
>>
>> On Mon, Nov 10, 2014 at 3:52 PM, Johnny Miller <johnny.p.mil...@gmail.com> wrote:
>>
>>> Be cautious enabling query tracing. It is a great tool for
>>> dev/testing/diagnosing etc., but it persists data to the system_traces
>>> keyspace with a TTL of 24 hours and will, as a consequence, consume
>>> resources.
>>>
>>> http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2
>>>
>>> On 7 Nov 2014, at 20:20, Jonathan Haddad <j...@jonhaddad.com> wrote:
>>>
>>> Personally, I've found that query timing plus log aggregation on the client
>>> side is more effective than trying to mess with the tracing probability in
>>> order to find a single query that has recently become a problem. I recommend
>>> wrapping your session with something that can automatically log the
>>> statement on a slow query, then using tracing to identify exactly what
>>> happened. That way, finding your problem is not a matter of chance.
>>>
>>> On Fri Nov 07 2014 at 9:41:38 AM Chris Lohfink <clohfin...@gmail.com> wrote:
>>>
>>>> It saves a lot of information for each request that's traced, so there is
>>>> significant overhead. If you start at a low probability and move it up
>>>> based on the load impact, it will provide a lot of insight and you can
>>>> control the cost.
>>>>
>>>> ---
>>>> Chris Lohfink
>>>>
>>>> On Fri, Nov 7, 2014 at 11:35 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:
>>>>
>>>>> Is there any significant performance penalty if one turns on Cassandra
>>>>> query tracing through the DataStax Java driver (say, for every request of
>>>>> some troublesome query)?
>>>>>
>>>>> More sampling seems better, but might doing so also slow the system down
>>>>> in some other ways?
>>>>>
>>>>> thanks
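Regarding Jonathan's suggestion above and the original question about per-statement tracing: a rough, untested sketch of that "wrap your session" pattern with the DataStax Java driver (2.x API). The threshold, logger setup, and class/variable names here are only illustrative:

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Wraps a Session so every query is timed on the client; slow statements
    // get logged, and tracing can be enabled per statement for the queries
    // under investigation.
    public class SlowQueryLoggingSession {

        private static final Logger LOG = LoggerFactory.getLogger(SlowQueryLoggingSession.class);
        private static final long SLOW_THRESHOLD_MS = 500;

        private final Session session;

        public SlowQueryLoggingSession(Session session) {
            this.session = session;
        }

        public ResultSet execute(Statement statement, boolean traceThisOne) {
            if (traceThisOne) {
                // Per-statement tracing; the trace lands in system_traces with a
                // 24h TTL, so keep this targeted at the problem query only.
                statement.enableTracing();
            }
            long start = System.nanoTime();
            ResultSet rs = session.execute(statement);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMs > SLOW_THRESHOLD_MS) {
                LOG.warn("Slow query ({} ms): {}", elapsedMs, statement);
            }
            return rs;
        }
    }

Usage would be wrapping the session once and passing traceThisOne = true only for the suspect CQL; the slow-query log tells you when to go look at a trace, instead of relying on the server-side sampling probability, and the system_traces overhead Johnny mentioned stays bounded.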