Re: Optimizing queries for partition keys

2018-05-08 Thread Sam Klock
Can someone please take a look at CASSANDRA-14415 when you have a chance? Getting a fix into a Cassandra release is not especially urgent for us, but in lieu of that we would like to know whether it's safe to include in our local build of Cassandra before attempting to deploy it. Thanks, SK On

Re: Optimizing queries for partition keys

2018-04-24 Thread Sam Klock
Thanks. For those interested: opened CASSANDRA-14415. SK On 2018-04-19 06:04, Benjamin Lerer wrote: > Hi Sam, > > Your finding is interesting. Effectively, if the number of bytes to skip is > larger than the remaining bytes in the buffer + the buffer size it could be > faster to use seek. >

Re: Optimizing queries for partition keys

2018-04-19 Thread Benjamin Lerer
Hi Sam, Your finding is interesting. Effectively, if the number of bytes to skip is larger than the remaining bytes in the buffer + the buffer size it could be faster to use seek. Feel free to open a JIRA ticket and attach your patch. It will be great if you could add to the ticket your table
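
The skip-versus-seek condition described in this message can be sketched as follows. This is a minimal illustration in Python, not Cassandra's actual reader code: the names `should_seek` and `skip_bytes` are hypothetical, and `buffered_remaining` is passed in explicitly as an assumption, because the sketch does not model the reader's internal buffer.

```python
import io


def should_seek(bytes_to_skip, buffered_remaining, buffer_size):
    """Return True when the skip goes past the bytes still buffered plus
    one more full buffer, the case where a seek could be faster."""
    return bytes_to_skip > buffered_remaining + buffer_size


def skip_bytes(stream, bytes_to_skip, buffered_remaining, buffer_size):
    """Skip forward in a seekable binary stream.

    buffered_remaining is a modelling assumption (hypothetical parameter):
    the number of unread bytes the reader would still hold in its buffer.
    Returns "seek" or "read" to show which strategy was used.
    """
    if should_seek(bytes_to_skip, buffered_remaining, buffer_size):
        # Jump directly instead of filling and discarding buffers.
        stream.seek(bytes_to_skip, io.SEEK_CUR)
        return "seek"
    remaining = bytes_to_skip
    while remaining > 0:
        chunk = stream.read(min(remaining, buffer_size))
        if not chunk:  # reached end of stream early
            break
        remaining -= len(chunk)
    return "read"
```

With 4 bytes left in a 16-byte buffer, a 50-byte skip would take the seek path, while a 10-byte skip would be served by reading through the buffer. Whether seeking is actually faster depends on the storage and the reader implementation, which is what the ticket was opened to evaluate.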

Re: Optimizing queries for partition keys

2018-04-17 Thread Sam Klock
Thanks (and apologies for the delayed response); that was the kind of feedback we were looking for. We backported the fix for CASSANDRA-10657 to 3.0.16, and it partially addresses our problem in the sense that it does limit the data sent on the wire. The performance is still extremely poor,

Re: Optimizing queries for partition keys

2018-03-22 Thread Benjamin Lerer
You should check the 3.x release. CASSANDRA-10657 could have fixed your problem. On Thu, Mar 22, 2018 at 9:15 PM, Benjamin Lerer wrote: > Sylvain explained the problem in CASSANDRA-4536: > " Let me note that in CQL3 a row that have no live column don't exist, so >

Re: Optimizing queries for partition keys

2018-03-22 Thread Benjamin Lerer
Sylvain explained the problem in CASSANDRA-4536: " Let me note that in CQL3 a row that have no live column don't exist, so we can't really implement this with a range slice having an empty columns list. Instead we should do a range slice with a full-row slice predicate with a count of 1, to make

Optimizing queries for partition keys

2018-03-22 Thread Sam Klock
Cassandra devs, We use workflows in some of our clusters (running 3.0.15) that involve "SELECT DISTINCT key FROM..."-style queries. For some tables, we observed extremely poor performance under light load (i.e., a small number of rows per second and frequent timeouts), which we eventually traced