Re: Performance Of IN Queries On Wide Rows

Eric Stevens Tue, 20 Feb 2018 15:48:55 -0800

Someone can correct me if I'm wrong, but I believe that if you do a large
IN() on a single partition's clustering keys, all of the reads are served
by a single replica.  With many concurrent individual equality queries, by
contrast, you can gain performance by spreading the reads across several
replicas in parallel.
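
A rough sketch of the two access patterns being compared, in Python. It is
not taken from either poster's application. The table name (wide_rows),
column names (pk, ck), placeholder style, and the execute callable are all
hypothetical stand-ins for whatever schema and driver call are actually in
use, and whether the fan-out version wins depends on the cluster and its
load balancing, so it needs to be measured.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List, Sequence, Tuple

# Hypothetical schema: partition key "pk", clustering key "ck".
IN_QUERY = "SELECT * FROM wide_rows WHERE pk = %s AND ck IN %s"
EQ_QUERY = "SELECT * FROM wide_rows WHERE pk = %s AND ck = %s"

# Stand-in for a driver call that runs a statement with bound parameters.
Execute = Callable[[str, Tuple[Any, ...]], Iterable[Any]]


def read_with_in(execute: Execute, pk: Any, cks: Sequence[Any]) -> List[Any]:
    """One statement: a single read carries every clustering key."""
    return list(execute(IN_QUERY, (pk, tuple(cks))))


def read_with_fanout(execute: Execute, pk: Any, cks: Sequence[Any],
                     max_workers: int = 8) -> List[Any]:
    """One equality statement per clustering key, issued concurrently,
    so that different requests may be routed to different replicas."""
    def read_one(ck: Any) -> List[Any]:
        return list(execute(EQ_QUERY, (pk, ck)))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_key = list(pool.map(read_one, cks))  # map keeps input order
    return [row for rows in per_key for row in rows]
```

Both functions return the rows in the order of the requested clustering
keys, so they can be swapped for each other when timing the two approaches.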


On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins <gareth.o.coll...@gmail.com>
wrote:

> Hello,
>
> When querying large wide rows for multiple specific values, is it
> better to do a separate query for each value, or to do it with one
> query and an "IN"? I am using Cassandra 2.1.14.
>
> I am asking because I changed my app to use 'IN' queries and it
> **appears** to be slower rather than faster. I had assumed that the
> "IN" query would be faster, since I assumed it only needs to go down
> the read path once (i.e. row cache -> memtable -> bloom filter -> key
> cache -> index summary -> index -> compression offsets -> sstable)
> rather than once for each entry. Or are there additional caveats that
> I should be aware of for 'IN' query performance (e.g. ordering of the
> 'IN' query entries, closeness of the 'IN' query values in the SSTable,
> etc.)?
>
> thanks in advance,
> Gareth Collins
