Re: query tracing
Well, we are able to do the tracing under normal load, but we are not yet able to turn tracing on on demand from the client side during heavy load (the traffic pattern is hard to predict).

Under normal load we saw that most of the query time (for the one particular row we focus on) is spent merging data from memtables and 2-3 sstables, reading 10xx live cells and 2x tombstone cells. Our CQL basically pulls out one row that has about 1000 columns (approx. 800k of data). This table already uses leveled compaction.

But once we get a series of the exact same CQL (against the same row), the response time degrades dramatically, from the normal 300-500 ms to something like 1 sec or 4 sec. The other parts of the system seem to remain fine: no obvious latency spike in reads/writes within the same keyspace or in a different keyspace.

So I wonder what is causing the sudden increase in latency of the exact same CQL? What did we saturate? If we had saturated the disk IO, other tables would see a similar effect, but we didn't see that. Is there any table-specific factor that may contribute to the slowness?

thanks

On Mon, Nov 10, 2014 at 7:21 AM, DuyHai Doan doanduy...@gmail.com wrote:

As Jonathan said, it's better to activate query tracing client side. It'll give you better flexibility over when to turn tracing on and off, and on which table. Server-side tracing is global (all tables) and probabilistic, and thus may not give a satisfactory level of debugging.

Programmatically it's pretty simple to achieve, and coupled with a good logging framework (LogBack for Java) you'll even have dynamic logging in production without having to redeploy client code. I have implemented it in Achilles very easily by wrapping the Regular/Bound/Simple statements of the Java driver and displaying the bound values at runtime: https://github.com/doanduyhai/Achilles/wiki/Statements-Logging-and-Tracing#dynamic-statements-logging

On Mon, Nov 10, 2014 at 3:52 PM, Johnny Miller johnny.p.mil...@gmail.com wrote:

Be cautious enabling query tracing. It is a great tool for dev/testing/diagnosing etc., but it does persist data to the system_traces keyspace with a TTL of 24 hours and will, as a consequence, consume resources. http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2

On 7 Nov 2014, at 20:20, Jonathan Haddad j...@jonhaddad.com wrote:

Personally I've found that using query timing + log aggregation on the client side is more effective than trying to mess with tracing probability in order to find a single query which has recently become a problem. I recommend wrapping your session with something that can automatically log the statement on a slow query, then using tracing to identify exactly what happened. This way finding your problem is not a matter of chance.

On Fri Nov 07 2014 at 9:41:38 AM Chris Lohfink clohfin...@gmail.com wrote:

It saves a lot of information for each request that's traced, so there is significant overhead. If you start at a low probability and move it up based on the load impact, it will provide a lot of insight and you can control the cost.

Chris Lohfink

On Fri, Nov 7, 2014 at 11:35 AM, Jimmy Lin y2klyf+w...@gmail.com wrote:

Is there any significant performance penalty if one turns on Cassandra query tracing through the DataStax Java driver (say, for every request of some troublesome query)? More sampling seems better, but doing so may also slow down the system in some other ways?

thanks
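The "wrap your session and log the statement on a slow query" suggestion from this thread could look roughly like the sketch below. It is written in Python rather than against the Java driver, and it wraps a plain callable instead of a real session. The names `SlowQueryLogger`, `threshold_seconds` and `clock` are made up for illustration and are not part of any driver API.

```python
import logging
import time

logger = logging.getLogger("slow_queries")


class SlowQueryLogger:
    """Wraps any callable that executes a statement and records the slow ones."""

    def __init__(self, execute, threshold_seconds=0.5, clock=time.monotonic):
        self._execute = execute              # assumption: e.g. a driver session's execute method
        self._threshold = threshold_seconds  # calls at least this slow are logged
        self._clock = clock                  # injectable so the wrapper can be tested
        self.slow_statements = []            # (statement, elapsed seconds) pairs

    def execute(self, statement, *args, **kwargs):
        start = self._clock()
        try:
            return self._execute(statement, *args, **kwargs)
        finally:
            # Measured even when the wrapped call raises, so timeouts are captured too.
            elapsed = self._clock() - start
            if elapsed >= self._threshold:
                self.slow_statements.append((statement, elapsed))
                logger.warning("slow query (%.3fs): %s", elapsed, statement)
```

With a real driver one would presumably pass the session's own execute method as `execute`, ship the warnings to a log aggregator, and then turn tracing on only for the statements that show up there.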
Re: query tracing
Maybe you should try to lower your read repair probability?
Re: query tracing
Hi Jens, interesting idea, but I thought read repair happens in the background, and so won't affect the actual read request coming from the real client?

On Sat, Nov 15, 2014 at 1:04 AM, Jens Rantil jens.ran...@tink.se wrote:

Maybe you should try to lower your read repair probability?
Re: Cassandra default consistency level on multi datacenter
Yes, already found it... via the QueryOptions.

2014-11-15 1:28 GMT+01:00 Tyler Hobbs ty...@datastax.com:

Cassandra itself does not have default consistency levels. These are only configured in the driver.

On Fri, Nov 14, 2014 at 8:54 AM, Adil adil.cha...@gmail.com wrote:

Hi,
We are using two datacenters and we want to set the default consistency level to LOCAL_ONE instead of ONE, but we don't know how to configure it. We set LOCAL_QUORUM via the CQL driver for the desired queries but we won't do the same for the default one.

Thanks in advance
Adil

--
Tyler Hobbs
DataStax http://datastax.com/
writetime of individual set members, and what happens when you add a set member a second time.
So I think there are some operations in CQL WRT sets/maps that aren’t supported yet, or at least not very well documented.

For example, you can set the TTL on individual set members, but how do you read the writetime()? Normally on a column I can just SELECT writetime(foo) from my_table; but … I can’t do that for an individual set member.

And what happens to an individual set member’s writetime (and eventual gc, expiration) if I write it again with the same member? Does the write time get changed because it’s a new add, or does the write time stay the same because it’s already there?

Kevin

--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
http://spinn3r.com
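The thread does not answer this question, so the following is only a toy model of one way to reason about it. It assumes (my understanding, not something confirmed in this thread) that each member of a non-frozen set is stored as its own cell carrying its own write timestamp and TTL, and that the cell with the newest timestamp wins. Under that assumption, adding the same member again rewrites the cell, so both the write time and the expiration move forward. `SetColumnModel` and its methods are hypothetical names, and the handling of equal timestamps is simplified.

```python
class SetColumnModel:
    """Toy model: every set member is a separate cell with a timestamp and optional expiry."""

    def __init__(self):
        self._cells = {}  # member -> (write_timestamp, expires_at or None)

    def add(self, member, timestamp, ttl=None):
        expires_at = None if ttl is None else timestamp + ttl
        current = self._cells.get(member)
        # Last write wins: a newer write replaces the cell, including its timestamp and TTL.
        # An older write arriving late is ignored. Ties are simplified to "latest call wins".
        if current is None or timestamp >= current[0]:
            self._cells[member] = (timestamp, expires_at)

    def writetime(self, member):
        return self._cells[member][0]

    def live_members(self, now):
        # A member is live when it has no expiry or its expiry is still in the future.
        return {m for m, (_, exp) in self._cells.items() if exp is None or now < exp}
```

In this model, adding `"a"` at timestamp 100 with a TTL of 50 and adding it again at timestamp 120 with the same TTL leaves a single member whose write time is 120 and which expires at 170 rather than 150.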
Two writers appending to a set to see which one wins?
I have two tasks, each trying to insert into a table. The only problem is that I only want one to win, and then never perform that operation again.

So my idea was to use the set append support in Cassandra to attempt to append to the set, and if we win, then I can perform my operation.

The problem is how. I don’t think there’s a way to find out whether your INSERT successfully added to the set or failed the append.

Is there something I’m missing?

Kevin
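A plain set append is idempotent and reports nothing back, so it cannot tell a writer whether it was first. What the question needs is a conditional write that does report whether it was applied. Cassandra 2.0 and later offer this as lightweight transactions (`INSERT ... IF NOT EXISTS`, whose result includes an `[applied]` column); that pointer is mine, not from the thread. The sketch below only models those semantics locally with a lock, so `ConditionalTable`, `insert_if_not_exists` and `race` are hypothetical names and not a driver API.

```python
import threading


class ConditionalTable:
    """Toy stand-in for a table that supports insert-if-not-exists."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key, value):
        """Returns True only for the single caller whose insert was applied."""
        with self._lock:  # the existence check and the write happen atomically
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

    def get(self, key):
        return self._rows.get(key)


def race(table, key, writers):
    """Runs every writer in its own thread and returns the names of those that won."""
    winners = []

    def attempt(name):
        if table.insert_if_not_exists(key, name):
            winners.append(name)  # only the winner would go on to perform the operation

    threads = [threading.Thread(target=attempt, args=(w,)) for w in writers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

Calling `race(table, "job-1", ["task-a", "task-b"])` returns exactly one name, and any later attempt on the same key loses, which matches the "one wins, then never again" requirement.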