Someone can correct me if I'm wrong, but I believe that if you do a large IN() on a single partition's clustering keys, all the reads are going to be served from a single replica. If you instead issue many concurrent individual equality queries, you can get the performance gain of leaning on several replicas for parallelism.
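To make the comparison concrete, here is a rough sketch of the two approaches. It assumes the DataStax Python driver's `Session` interface (`prepare`, `execute`, `execute_async`); the table `ks.wide_rows` and the columns `pk` and `ck` are made-up names for illustration, not anything from your schema. Treat it as a starting point for measuring on your own cluster rather than a recommendation.

```python
def fetch_with_in(session, partition_key, clustering_values):
    """Single query: every clustering value goes into one IN() restriction."""
    # Hypothetical table/columns: ks.wide_rows(pk, ck, ...).
    placeholders = ", ".join("?" for _ in clustering_values)
    stmt = session.prepare(
        f"SELECT * FROM ks.wide_rows WHERE pk = ? AND ck IN ({placeholders})"
    )
    return list(session.execute(stmt, [partition_key, *clustering_values]))


def fetch_concurrently(session, partition_key, clustering_values):
    """One equality query per clustering value, all in flight at once."""
    stmt = session.prepare(
        "SELECT * FROM ks.wide_rows WHERE pk = ? AND ck = ?"
    )
    # Start every request before waiting on any of them, so the
    # individual reads can proceed in parallel.
    futures = [
        session.execute_async(stmt, (partition_key, ck))
        for ck in clustering_values
    ]
    rows = []
    for future in futures:
        rows.extend(future.result())  # blocks until that query completes
    return rows
```

Timing both functions against the same partition and the same list of values should show whether the concurrent version actually helps in your setup.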
On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins <gareth.o.coll...@gmail.com> wrote:

> Hello,
>
> When querying large wide rows for multiple specific values is it
> better to do separate queries for each value...or do it with one query
> and an "IN"? I am using Cassandra 2.1.14
>
> I am asking because I had changed my app to use 'IN' queries and it
> **appears** to be slower rather than faster. I had assumed that the
> "IN" query should be faster...as I assumed it only needs to go down
> the read path once (i.e. row cache -> memtable -> key cache -> bloom
> filter -> index summary -> index -> compaction -> sstable) rather than
> once for each entry? Or are there some additional caveats that I
> should be aware of for 'IN' query performance (e.g. ordering of 'IN'
> query entries, closeness of 'IN' query values in the SSTable etc.)?
>
> thanks in advance,
> Gareth Collins