That’s a very good point from Sylvain that I forgot/missed. That said, we’ve seen plenty of scenarios where overall system throughput is improved through unlogged batches. One of my colleagues did quite a bit of benchmarking on this topic for his talk at last year’s C* summit: http://www.slideshare.net/DataStax/microbatching-highperformance-writes-adam-zegelin-instaclustr-cassandra-summit-2016
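For anyone reading this thread later, here is a minimal sketch of the client-side expansion discussed in the quoted messages below: turning a list of (partitionkey1, partitionkey2) tuples into one parameterized DELETE plus one set of bind values per tuple. The helper name `expand_key_tuples` is hypothetical and not part of any driver; the table and column names are taken from Benjamin's example. The sketch only builds the statement text and bind values. How they are sent (one request per tuple so a token-aware driver can route each, or grouped into an unlogged batch) is the trade-off debated in the thread.

```python
from typing import Iterable, List, Sequence, Tuple


def expand_key_tuples(
    table: str,
    key_columns: Sequence[str],
    key_tuples: Iterable[Sequence[object]],
) -> Tuple[str, List[Tuple[object, ...]]]:
    """Build one parameterized DELETE and one bind-value tuple per key.

    Hypothetical helper for illustration only: it stands in for the
    unsupported `WHERE (pk1, pk2) IN (...)` form by producing a
    statement with full partition key equality for each key tuple.
    """
    where = " AND ".join(f"{column} = ?" for column in key_columns)
    statement = f"DELETE FROM {table} WHERE {where}"

    bind_values: List[Tuple[object, ...]] = []
    for key in key_tuples:
        # Every tuple must supply a value for each partition key column.
        if len(key) != len(key_columns):
            raise ValueError(
                f"expected {len(key_columns)} key values, got {len(key)}"
            )
        bind_values.append(tuple(key))
    return statement, bind_values


statement, bind_values = expand_key_tuples(
    "ks.cf",
    ["partitionkey1", "partitionkey2"],
    [(1, 2), (1, 3), (2, 3), (3, 4)],
)
print(statement)
for values in bind_values:
    print(values)
```

With the DataStax Python driver, I would assume the statement is prepared once with `session.prepare(statement)` and each tuple is then sent with `session.execute_async(prepared, values)`, which matches Sylvain's "for loop client side" suggestion. Whether that or an unlogged batch gives better throughput is something to benchmark for your own workload.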

On Thu, 9 Feb 2017 at 20:52 Benjamin Roth <benjamin.r...@jaumo.com> wrote:

> Ok got it.
>
> But it's interesting that this is supported:
> DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3));
>
> This is technically mostly the same (Token awareness,
> coordination/routing, read performance, ...), right?
>
> 2017-02-09 10:43 GMT+01:00 Sylvain Lebresne <sylv...@datastax.com>:
>
> This is a statement on multiple partitions and there is really no
> optimization the code internally does on that. In fact, I strongly advise
> you to not use a batch but rather simply do a for loop client side and send
> statements individually. That way, your driver will be able to use proper
> token-awareness for each request (while if you send a batch, one
> coordinator will be picked up and will have to forward most statements,
> doing more network hops at the end of the day). The only case where using a
> batch is indeed legit is if you care about all the statements being atomic,
> but in that case it's a logged batch you want.
>
> That's btw more or less why we never bothered implementing that: it's
> totally doable technically, but it's not really such a good idea
> performance wise in practice most of the time, and you can easily work
> around it with a batch if you need atomicity.
>
> Which is not saying it will never be and shouldn't be supported btw, there
> is something to be said for the consistency of the CQL language in general.
> But it's why no-one took time to do it so far.
>
> On Thu, Feb 9, 2017 at 10:36 AM, Benjamin Roth <benjamin.r...@jaumo.com>
> wrote:
>
> Yes, that's the workaround - I'll try that.
>
> Would you agree it would be better for internal optimizations to process
> this within a single statement?
>
> 2017-02-09 10:32 GMT+01:00 Ben Slater <ben.sla...@instaclustr.com>:
>
> Yep, that makes it clear. I think an unlogged batch of prepared statements
> with one statement per PK tuple would be roughly equivalent? And probably
> no more complex to generate in the client?
>
> On Thu, 9 Feb 2017 at 20:22 Benjamin Roth <benjamin.r...@jaumo.com> wrote:
>
> Maybe that makes it clear:
>
> DELETE FROM ks.cf WHERE (partitionkey1, partitionkey2) IN ((1, 2), (1,
> 3), (2, 3), (3, 4));
>
> I want to delete or select a bunch of records identified by their
> multi-partitionkey tuples.
>
> 2017-02-09 10:18 GMT+01:00 Ben Slater <ben.sla...@instaclustr.com>:
>
> Are you looking for this to be equivalent to (PK1=1 AND PK2=2) or are you
> looking for (PK1 IN (1,2) AND PK2 IN (1,2)) or something else?
>
> Cheers
> Ben
>
> On Thu, 9 Feb 2017 at 20:09 Benjamin Roth <benjamin.r...@jaumo.com> wrote:
>
> Hi Guys,
>
> CQL says this is not allowed:
>
> DELETE FROM ks.cf WHERE (pk1, pk2) IN ((1, 2));
>
> 1. Is there a reason for it? There shouldn't be a performance penalty, it
> is a PK lookup, the same thing works with a single pk column
> 2. Is there a known workaround for it?
>
> It would be much of a help to have it for daily business, IMHO it's a
> waste of resources to run multiple queries just to fetch a bunch of records
> by a PK.
>
> Thanks in advance for any reply
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

--
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798