2017-06-28 9:28 GMT+02:00 Yevhenii Kurtov <yevhenii.kur...@gmail.com>:

> Hello Pavel,
>
> Can you please give a tip how to rewrite the query with UNION clause? I
> didn't use it at all before actually and afraid that will not get it
> properly from the first time :)
>

SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $3) AND (c0."failed_at" > $4))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $5) AND (c0."started_at" < $6))
ORDER BY c0."priority" DESC, c0."times_failed"
LIMIT $7
FOR UPDATE SKIP LOCKED

Something like this

Pavel


>
> On Wed, Jun 28, 2017 at 2:12 PM, Pavel Stehule <pavel.steh...@gmail.com>
> wrote:
>
>>
>>
>> 2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii.kur...@gmail.com>:
>>
>>> Hello,
>>>
>>> We have a query that is run almost each second and it's very important
>>> to squeeze every other ms out of it. The query is:
>>>
>>> SELECT c0."id" FROM "campaign_jobs" AS c0
>>> WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
>>> OR ((c0."status" = $3) AND (c0."failed_at" > $4))
>>> OR ((c0."status" = $5) AND (c0."started_at" < $6))
>>> ORDER BY c0."priority" DESC, c0."times_failed"
>>> LIMIT $7
>>> FOR UPDATE SKIP LOCKED
>>>
>>> I added following index:
>>>
>>> CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at,
>>> priority DESC, times_failed);
>>>
>>> And it didn't help at all, even opposite - the planning phase time grew
>>> up from ~2ms  up to ~40 ms leaving execution time intact:
>>>
>>>  Limit  (cost=29780.02..29781.27 rows=100 width=18) (actual
>>> time=827.753..828.113 rows=100 loops=1)
>>>    ->  LockRows  (cost=29780.02..32279.42 rows=199952 width=18) (actual
>>> time=827.752..828.096 rows=100 loops=1)
>>>          ->  Sort  (cost=29780.02..30279.90 rows=199952 width=18)
>>> (actual time=827.623..827.653 rows=100 loops=1)
>>>                Sort Key: priority DESC, times_failed
>>>                Sort Method: external sort  Disk: 5472kB
>>>                ->  Seq Scan on campaign_jobs c0  (cost=0.00..22138.00
>>> rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1)
>>>                      Filter: (((status = 0) AND (id <> ALL
>>> ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>>> 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,
>>> 43,44,45,46,47,48}'::integer[]))) OR ((status = 2) AND (failed_at >
>>> '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND
>>> (started_at < '2017-06-23 03:11:09'::timestamp without time zone)))
>>>  Planning time: 40.734 ms
>>>  Execution time: 913.638 ms
>>> (9 rows)
>>>
>>>
>>> I see that query still went through the Seq Scan instead of Index Scan.
>>> Is it due to poorly crafted index or because of query structure? Is it
>>> possible to make this query faster?
>>>
>>
>> There are few issues
>>
>> a) parametrized LIMIT
>> b) complex predicate with lot of OR
>> c)  slow external sort
>>
>> b) signalize maybe some strange in design .. try to replace "OR" by
>> "UNION" query
>> c) if you can and you have good enough memory .. try to increase work_mem
>> .. maybe 20MB
>>
>> if you change query to union queries, then you can use conditional indexes
>>
>> create index(id) where status = 0;
>> create index(failed_at) where status = 2;
>> create index(started_at) where status = 1;
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> Thanks
>>>
>>
>>
>

Reply via email to