Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Ok now I REALLY got it :) Thanks Sylvain! 2017-02-09 11:42 GMT+01:00 Sylvain Lebresne : > On Thu, Feb 9, 2017 at 10:52 AM, Benjamin Roth > wrote: > >> Ok got it. >> >> But it's interesting that this is supported: >> DELETE/SELECT FROM ks.cf WHERE

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
This doesn't really belong in this topic, but I also experienced what Ben says. I was migrating (and still am) tons of data from MySQL to CS. I measured several approaches (async parallel, prepared stmt, sync with unlogged batches) and it turned out that batches were really fast and produced less

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Sylvain Lebresne
On Thu, Feb 9, 2017 at 10:52 AM, Benjamin Roth wrote: > Ok got it. > > But it's interesting that this is supported: > DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3)); > > This is technically mostly the same (Token awareness, > coordination/routing, read

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
That’s a very good point from Sylvain that I forgot/missed. That said, we’ve seen plenty of scenarios where overall system throughput is improved through unlogged batches. One of my colleagues did quite a bit of benchmarking on this topic for his talk at last year’s C* summit:

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Ok got it. But it's interesting that this is supported: DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3)); This is technically mostly the same (Token awareness, coordination/routing, read performance, ...), right? 2017-02-09 10:43 GMT+01:00 Sylvain Lebresne : > This

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Sylvain Lebresne
This is a statement on multiple partitions and there is really no optimization the code internally does on that. In fact, I strongly advise you not to use a batch but rather simply do a for loop client side and send the statements individually. That way, your driver will be able to use proper
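
The client-side loop suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name `per_partition_deletes` is hypothetical, the table `ks.cf` and the columns `pk1`, `pk2` are taken from the examples in this thread, and the actual call that sends each statement through a driver is deliberately left out.

```python
from typing import Iterable, List, Sequence, Tuple


def per_partition_deletes(
    table: str,
    key_columns: Sequence[str],
    key_tuples: Iterable[Sequence[int]],
) -> List[Tuple[str, Tuple[int, ...]]]:
    """Expand a list of partition key tuples into one DELETE per tuple.

    Hypothetical helper: it returns (query, parameters) pairs and does not
    talk to a cluster. The query text uses '?' bind markers, so the same
    string can be prepared once and bound once per tuple.
    """
    where = " AND ".join(f"{col} = ?" for col in key_columns)
    query = f"DELETE FROM {table} WHERE {where}"
    statements = []
    for keys in key_tuples:
        if len(keys) != len(key_columns):
            raise ValueError(f"expected {len(key_columns)} key values, got {len(keys)}")
        statements.append((query, tuple(keys)))
    return statements


# The tuples from the (pk1, pk2) IN (...) example, one statement each.
for query, params in per_partition_deletes(
    "ks.cf", ["pk1", "pk2"], [(1, 2), (1, 3), (2, 3), (3, 4)]
):
    # A real client would bind and send each statement here, one at a time.
    pass
```

Because every statement targets exactly one partition, each one can be sent and routed on its own, which is the point of the advice above.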

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Yes, that's the workaround - I'll try that. Would you agree it would be better for internal optimizations to process this within a single statement? 2017-02-09 10:32 GMT+01:00 Ben Slater : > Yep, that makes it clear. I think an unlogged batch of prepared statements >

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
Yep, that makes it clear. I think an unlogged batch of prepared statements with one statement per PK tuple would be roughly equivalent? And probably no more complex to generate in the client? On Thu, 9 Feb 2017 at 20:22 Benjamin Roth wrote: > Maybe that makes it clear:
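
For illustration, an unlogged batch with one statement per PK tuple could be generated on the client along these lines. This is a minimal sketch under assumptions: the helper name `unlogged_batch` is hypothetical, the table `ks.cf` and columns `pk1`, `pk2` come from the examples in this thread, and the helper only builds the CQL text and the flattened bind values without sending anything. Note that elsewhere in this thread Sylvain advises against batching across partitions.

```python
from typing import Iterable, List, Sequence, Tuple


def unlogged_batch(
    table: str,
    key_columns: Sequence[str],
    key_tuples: Iterable[Sequence[int]],
) -> Tuple[str, List[int]]:
    """Build the text of an unlogged batch with one DELETE per key tuple.

    Hypothetical helper: returns the CQL text (with '?' bind markers) and
    the bind values flattened in statement order.
    """
    where = " AND ".join(f"{col} = ?" for col in key_columns)
    lines = ["BEGIN UNLOGGED BATCH"]
    params: List[int] = []
    for keys in key_tuples:
        if len(keys) != len(key_columns):
            raise ValueError(f"expected {len(key_columns)} key values, got {len(keys)}")
        lines.append(f"  DELETE FROM {table} WHERE {where};")
        params.extend(keys)
    lines.append("APPLY BATCH;")
    return "\n".join(lines), params


cql, values = unlogged_batch("ks.cf", ["pk1", "pk2"], [(1, 2), (1, 3)])
```

Generating the batch is indeed no more complex than generating an IN list; whether it performs better than individual statements is exactly what the rest of the thread debates.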

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Maybe that makes it clear: DELETE FROM ks.cf WHERE (partitionkey1, partitionkey2) IN ((1, 2), (1, 3), (2, 3), (3, 4)); I want to delete or select a bunch of records identified by their multi-partitionkey tuples. 2017-02-09 10:18 GMT+01:00 Ben Slater : > Are you

Re: DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Ben Slater
Are you looking for this to be equivalent to (PK1=1 AND PK2=2) or are you looking for (PK1 IN (1,2) AND PK2 IN (1,2)) or something else? Cheers Ben On Thu, 9 Feb 2017 at 20:09 Benjamin Roth wrote: > Hi Guys, > > CQL says this is not allowed: > > DELETE FROM ks.cf WHERE

DELETE/SELECT with multi-column PK and IN

2017-02-09 Thread Benjamin Roth
Hi Guys, CQL says this is not allowed: DELETE FROM ks.cf WHERE (pk1, pk2) IN ((1, 2)); 1. Is there a reason for it? There shouldn't be a performance penalty: it is a PK lookup, and the same thing works with a single PK column. 2. Is there a known workaround for it? It would be much of a help to