On Jul 24, 2017 14:19, "PT" <wmo...@potentialtech.com> wrote:

On Mon, 24 Jul 2017 13:17:56 +0300
Dmitry Lazurkin <dila...@gmail.com> wrote:

> On 07/24/2017 01:40 AM, PT wrote:
> > In this example you count approximately 40,000,000 values, which is
> > about 40% of the table.
>
> 4 000 000 (:
>
> > If you really need these queries to be faster, I would suggest
> > materializing the data, i.e. create a table like:
> >
> > CREATE TABLE id_counts (
> >     id BIGINT PRIMARY KEY,
> >     num BIGINT
> > )
> >
> > Then use a trigger or similar technique to keep id_counts in sync
> > with the id table. You can then run queries of the form:
> >
> > SELECT sum(num) FROM id_counts WHERE id IN :values:
> >
> > which I would wager will be significantly faster.
>
> I use count only for example because it uses seqscan. I want to
> optimize the IN-clause ;-).

The IN clause is not what's taking all the time. It's the processing of
millions of rows that's taking all the time.

It isn't either-or. It is the processing of millions of rows over the
large in-list which is taking the time. Processing an in-list as a hash
table would be great, but no one has gotten around to implementing it
yet. Maybe Dmitry will be the one to do that.

Cheers,

Jeff
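
[For readers following along: the "trigger or similar technique" suggested above could be sketched roughly as below. This is only an illustration; it assumes a hypothetical source table ids(id bigint) holding the values being counted, and PostgreSQL 9.5+ for ON CONFLICT.]

```sql
-- Hypothetical sketch: keep id_counts in sync with a table ids(id bigint).
-- Table and trigger names are made up for illustration.
CREATE OR REPLACE FUNCTION id_counts_sync() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Create the counter row on first sight of an id, bump it otherwise.
        INSERT INTO id_counts (id, num) VALUES (NEW.id, 1)
        ON CONFLICT (id) DO UPDATE SET num = id_counts.num + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE id_counts SET num = num - 1 WHERE id = OLD.id;
    END IF;
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER id_counts_sync_trg
AFTER INSERT OR DELETE ON ids
FOR EACH ROW EXECUTE PROCEDURE id_counts_sync();
```

[A related workaround for the in-list itself, until it is hashed natively: rewriting `WHERE id IN (...)` as a join against a VALUES list, e.g. `SELECT sum(num) FROM id_counts JOIN (VALUES (1),(2),(3)) v(id) USING (id);`, lets the planner consider a hash join instead of a linear scan of the list.]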