hi Jens, interesting idea, but I thought read repair happens in the background, and so shouldn't affect the actual read request coming from the real client?
On Sat, Nov 15, 2014 at 1:04 AM, Jens Rantil <jens.ran...@tink.se> wrote:

> Maybe you should try to lower your read repair probability?
>
> —
> Sent from Mailbox <https://www.dropbox.com/mailbox>
>
> On Sat, Nov 15, 2014 at 9:40 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:
>
>> Well, we are able to do the tracing under normal load, but not yet able
>> to turn on tracing on demand during heavy load from the client side (the
>> traffic pattern is hard to predict).
>>
>> Under normal load we saw that most of the query time (for the one
>> particular row we focus on) was spent merging data from memtables and
>> (2-3) sstables, reading 10xx live cells and 2x tombstone cells.
>>
>> Our CQL basically pulls out one row that has about 1000 columns (approx.
>> 800k of data). This table is already on leveled compaction.
>>
>> But once we issue a series of the exact same CQL (against the same row),
>> the response time starts to degrade dramatically, from the normal
>> 300-500ms to 1 sec or even 4 sec. The rest of the system seems to remain
>> fine, with no obvious latency spike in reads/writes within the same
>> keyspace or in different keyspaces.
>>
>> So I wonder what is causing the sudden increase in latency of the exact
>> same CQL? What did we saturate? If we had saturated the disk I/O, the
>> other tables would see a similar effect, but we didn't see that. Is
>> there any table-specific factor that may contribute to the slowness?
>>
>> thanks
>>
>> On Mon, Nov 10, 2014 at 7:21 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>>> As Jonathan said, it's better to activate query tracing client side.
>>> It'll give you better flexibility over when to turn tracing on & off,
>>> and on which table. Server-side tracing is global (all tables) and
>>> probabilistic, and thus may not give a satisfactory level of debugging.
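For reference, client-side tracing of a single suspect statement looks roughly like this with the DataStax Java driver (2.x API). This is a minimal sketch, not production code: the contact point, keyspace, table, and id value are placeholders, and a real trace may take a moment to become available server side.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.QueryTrace;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class TraceOneQuery {
    public static void main(String[] args) {
        // Placeholder contact point; use your cluster's address.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Enable tracing only on this one statement, not globally.
        Statement stmt = new SimpleStatement(
                "SELECT * FROM my_keyspace.wide_table WHERE id = 42")
                .enableTracing();

        ResultSet rs = session.execute(stmt);
        QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
        System.out.printf("trace %s: %d us total%n",
                trace.getTraceId(), trace.getDurationMicros());

        // Per-step server-side events: where the time actually went
        // (memtable/sstable merges, tombstones read, etc.).
        for (QueryTrace.Event e : trace.getEvents()) {
            System.out.printf("%8d us  %s%n",
                    e.getSourceElapsedMicros(), e.getDescription());
        }
        cluster.close();
    }
}
```

Because the trace rows land in system_traces with a 24h TTL (as Johnny notes below), keep this gated to the handful of statements you are actually debugging.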
>>>
>>> Programmatically it's pretty simple to achieve, and coupled with a good
>>> logging framework (LogBack for Java), you'll even have dynamic logging
>>> in production without having to redeploy client code. I implemented it
>>> in Achilles very easily by wrapping the Regular/Bound/Simple statements
>>> of the Java driver and displaying the bound values at runtime:
>>> https://github.com/doanduyhai/Achilles/wiki/Statements-Logging-and-Tracing#dynamic-statements-logging
>>>
>>> On Mon, Nov 10, 2014 at 3:52 PM, Johnny Miller <johnny.p.mil...@gmail.com> wrote:
>>>
>>>> Be cautious enabling query tracing. It's a great tool for
>>>> dev/testing/diagnosing etc., but it does persist data to the
>>>> system_traces keyspace with a TTL of 24 hours and will, as a
>>>> consequence, consume resources.
>>>>
>>>> http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2
>>>>
>>>> On 7 Nov 2014, at 20:20, Jonathan Haddad <j...@jonhaddad.com> wrote:
>>>>
>>>> Personally I've found that using query timing + log aggregation on the
>>>> client side is more effective than trying to mess with the tracing
>>>> probability in order to find a single query which has recently become
>>>> a problem. I recommend wrapping your session with something that can
>>>> automatically log the statement on a slow query, then using tracing to
>>>> identify exactly what happened. This way finding your problem is not a
>>>> matter of chance.
>>>>
>>>> On Fri Nov 07 2014 at 9:41:38 AM Chris Lohfink <clohfin...@gmail.com> wrote:
>>>>
>>>>> It saves a lot of information for each request that's traced, so
>>>>> there is significant overhead. If you start at a low probability and
>>>>> move it up based on the load impact, it will provide a lot of insight
>>>>> and you can control the cost.
>>>>>
>>>>> ---
>>>>> Chris Lohfink
>>>>>
>>>>> On Fri, Nov 7, 2014 at 11:35 AM, Jimmy Lin <y2klyf+w...@gmail.com> wrote:
>>>>>
>>>>>> Is there any significant performance penalty if one turns on
>>>>>> Cassandra query tracing through the DataStax Java driver (say, for
>>>>>> every request of some troublesome query)?
>>>>>>
>>>>>> More sampling seems better, but then doing so may also slow down the
>>>>>> system in some other ways?
>>>>>>
>>>>>> thanks
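Jonathan's slow-query wrapper idea above can be sketched driver-agnostically: time each execution and log the statement text whenever it crosses a threshold. The `SlowQueryLogger` name and the 500 ms threshold here are illustrative, not from the thread; in real code the `Supplier` body would call `session.execute(...)` and the `System.err` line would go through your logging framework.

```java
import java.util.function.Supplier;

// Illustrative sketch of a slow-query logging wrapper. Wrap every driver
// call in execute(); statements slower than the threshold get logged so
// you can follow up with targeted tracing.
class SlowQueryLogger {
    private final long thresholdMillis;

    SlowQueryLogger(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Runs the query, returning its result; logs the statement if slow.
    <T> T execute(String statement, Supplier<T> query) {
        long start = System.nanoTime();
        try {
            return query.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (isSlow(elapsedMillis)) {
                // Real code would use a logging framework (e.g. LogBack).
                System.err.printf("SLOW QUERY (%d ms): %s%n",
                        elapsedMillis, statement);
            }
        }
    }

    boolean isSlow(long elapsedMillis) {
        return elapsedMillis >= thresholdMillis;
    }
}
```

Usage would look like `logger.execute("SELECT ...", () -> session.execute(stmt))`; since logging happens client side, nothing is persisted to system_traces and the overhead is just one clock read per query.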