On Jul 24, 2017 14:19, "PT" <wmo...@potentialtech.com> wrote:

On Mon, 24 Jul 2017 13:17:56 +0300
Dmitry Lazurkin <dila...@gmail.com> wrote:

> On 07/24/2017 01:40 AM, PT wrote:
> > In this example you count approximately 40,000,000 values, which is
> > about 40% of the table.
>
> 4 000 000 (:
>
> > If you really need these queries to be faster, I would suggest
> > materializing the data, i.e. create a table like:
> >
> > CREATE TABLE id_counts (
> >     id BIGINT PRIMARY KEY,
> >     num BIGINT
> > )
> >
> > Then use a trigger or similar technique to keep id_counts in sync
> > with the id table. You can then run queries of the form:
> >
> > SELECT sum(num) FROM id_counts WHERE id IN :values:
> >
> > which I would wager will be significantly faster.
>
> I use count only for example because it uses seqscan. I want to
> optimize the IN-clause ;-).

The IN clause is not what's taking all the time. It's the processing of
millions of rows that's taking all the time.

It isn't either-or. It is the processing of millions of rows over the
large in-list which is taking the time. Processing an in-list as a hash
table would be great, but no one has gotten around to implementing it
yet. Maybe Dmitry will be the one to do that.

Cheers,

Jeff
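
[For readers following along: the "trigger or similar technique" suggested above could be sketched roughly as below. This is only an illustration; it assumes a hypothetical source table ids(id bigint) holding the values being counted, and PostgreSQL 9.5+ for ON CONFLICT.]

```sql
-- Hypothetical sketch: keep id_counts in sync with a table ids(id bigint).
-- Table and trigger names are made up for illustration.
CREATE OR REPLACE FUNCTION id_counts_sync() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Create the counter row on first sight of an id, bump it otherwise.
        INSERT INTO id_counts (id, num) VALUES (NEW.id, 1)
        ON CONFLICT (id) DO UPDATE SET num = id_counts.num + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE id_counts SET num = num - 1 WHERE id = OLD.id;
    END IF;
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER id_counts_sync_trg
AFTER INSERT OR DELETE ON ids
FOR EACH ROW EXECUTE PROCEDURE id_counts_sync();
```

[A related workaround for the in-list itself, until it is hashed natively: rewriting `WHERE id IN (...)` as a join against a VALUES list, e.g. `SELECT sum(num) FROM id_counts JOIN (VALUES (1),(2),(3)) v(id) USING (id);`, lets the planner consider a hash join instead of a linear scan of the list.]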