2017-06-28 9:28 GMT+02:00 Yevhenii Kurtov <yevhenii.kur...@gmail.com>:
> Hello Pavel, > > Can you please give a tip how to rewrite the query with UNION clause? I > didn't use it at all before actually and afraid that will not get it > properly from the first time :) > SELECT c0."id" FROM "campaign_jobs" AS c0 WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2)))) UNION SELECT c0."id" FROM "campaign_jobs" AS c0 WHERE ((c0."status" = $3) AND (c0."failed_at" > $4)) UNION SELECT c0."id" FROM "campaign_jobs" AS c0 WHERE ((c0."status" = $5) AND (c0."started_at" < $6)) ORDER BY c0."priority" DESC, c0."times_failed" LIMIT $7 FOR UPDATE SKIP LOCKED Something like this Pavel > > On Wed, Jun 28, 2017 at 2:12 PM, Pavel Stehule <pavel.steh...@gmail.com> > wrote: > >> >> >> 2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii.kur...@gmail.com>: >> >>> Hello, >>> >>> We have a query that is run almost each second and it's very important >>> to squeeze every other ms out of it. The query is: >>> >>> SELECT c0."id" FROM "campaign_jobs" AS c0 >>> WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2)))) >>> OR ((c0."status" = $3) AND (c0."failed_at" > $4)) >>> OR ((c0."status" = $5) AND (c0."started_at" < $6)) >>> ORDER BY c0."priority" DESC, c0."times_failed" >>> LIMIT $7 >>> FOR UPDATE SKIP LOCKED >>> >>> I added following index: >>> >>> CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at, >>> priority DESC, times_failed); >>> >>> And it didn't help at all, even opposite - the planning phase time grew >>> up from ~2ms up to ~40 ms leaving execution time intact: >>> >>> Limit (cost=29780.02..29781.27 rows=100 width=18) (actual >>> time=827.753..828.113 rows=100 loops=1) >>> -> LockRows (cost=29780.02..32279.42 rows=199952 width=18) (actual >>> time=827.752..828.096 rows=100 loops=1) >>> -> Sort (cost=29780.02..30279.90 rows=199952 width=18) >>> (actual time=827.623..827.653 rows=100 loops=1) >>> Sort Key: priority DESC, times_failed >>> Sort Method: external sort Disk: 5472kB >>> -> Seq Scan on campaign_jobs c0 (cost=0.00..22138.00 >>> rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1) >>> Filter: (((status = 0) AND (id <> ALL >>> ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, >>> 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42, >>> 43,44,45,46,47,48}'::integer[]))) OR ((status = 2) AND (failed_at > >>> '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND >>> (started_at < '2017-06-23 03:11:09'::timestamp without time zone))) >>> Planning time: 40.734 ms >>> Execution time: 913.638 ms >>> (9 rows) >>> >>> >>> I see that query still went through the Seq Scan instead of Index Scan. >>> Is it due to poorly crafted index or because of query structure? Is it >>> possible to make this query faster? >>> >> >> There are few issues >> >> a) parametrized LIMIT >> b) complex predicate with lot of OR >> c) slow external sort >> >> b) signalize maybe some strange in design .. try to replace "OR" by >> "UNION" query >> c) if you can and you have good enough memory .. try to increase work_mem >> .. maybe 20MB >> >> if you change query to union queries, then you can use conditional indexes >> >> create index(id) where status = 0; >> create index(failed_at) where status = 2; >> create index(started_at) where status = 1; >> >> Regards >> >> Pavel >> >> >>> >>> Thanks >>> >> >> >