2017-06-29 7:17 GMT+02:00 Yevhenii Kurtov :
> Hello folks,
>
> Thank you very much for analysis and suggested - there is a lot to learn
> here. I just tried UNION queries and got following error:
>
> ERROR: FOR UPDATE is not allowed with UNION/INTERSECT/EXCEPT
>
it is sad :(
maybe bitmap index scan can work
postgres=# create table test(id int, started date, failed date, status int);
CREATE TABLE
postgres=# create index on test(id) where status = 0;
CREATE INDEX
postgres=# create index on test(started) where status = 1;
CREATE INDEX
postgres=# create index on test(failed ) where status = 2;
CREATE INDEX
postgres=# explain select id from test where (status = 0 and id in
(1,2,3,4,5)) or (status = 1 and started < current_date) or (status = 2 and
failed > current_date);
┌
│
QUERY PLAN
╞
│ Bitmap Heap Scan on test (cost=12.93..22.50 rows=6 width=4)
│ Recheck Cond: (((id = ANY ('{1,2,3,4,5}'::integer[])) AND (status = 0))
OR ((started < CURRENT_DATE) AND (status = 1)) OR ((faile
│ Filter: (((status = 0) AND (id = ANY ('{1,2,3,4,5}'::integer[]))) OR
((status = 1) AND (started < CURRENT_DATE)) OR ((status = 2)
│ -> BitmapOr (cost=12.93..12.93 rows=6 width=0)
│ -> Bitmap Index Scan on test_id_idx (cost=0.00..4.66 rows=1
width=0)
│ Index Cond: (id = ANY ('{1,2,3,4,5}'::integer[]))
│ -> Bitmap Index Scan on test_started_idx (cost=0.00..4.13
rows=3 width=0)
│ Index Cond: (started < CURRENT_DATE)
│ -> Bitmap Index Scan on test_failed_idx (cost=0.00..4.13 rows=3
width=0)
│ Index Cond: (failed > CURRENT_DATE)
└
(10 rows)
>
> I made a table dump for anyone who wants to give it a spin
> https://app.box.com/s/464b12glmlk5o4gvzz7krc4c8s2fxlwr
> and here is the gist for the original commands https://gist.github.
> com/lessless/33215d0c147645db721e74e07498ac53
>
> On Wed, Jun 28, 2017 at 8:10 PM, Brad DeJong
> wrote:
>
>>
>>
>> On 2017-06-28, Pavel Stehule wrote ...
>> > On 2017-06-28, Yevhenii Kurtov wrote ...
>> >> On 2017-06-28, Pavel Stehule wrote ...
>> >>> On 2017-06-28, Yevhenii Kurtov wrote ...
>> We have a query that is run almost each second and it's very
>> important to squeeze every other ms out of it. The query is:
>> ...
>> I added following index: CREATE INDEX ON campaign_jobs(id, status,
>> failed_at, started_at, priority DESC, times_failed);
>> ...
>> >>> There are few issues
>> >>> a) parametrized LIMIT
>> >>> b) complex predicate with lot of OR
>> >>> c) slow external sort
>> >>>
>> >>> b) signalize maybe some strange in design .. try to replace "OR" by
>> "UNION" query
>> >>> c) if you can and you have good enough memory .. try to increase
>> work_mem .. maybe 20MB
>> >>>
>> >>> if you change query to union queries, then you can use conditional
>> indexes
>> >>>
>> >>> create index(id) where status = 0;
>> >>> create index(failed_at) where status = 2;
>> >>> create index(started_at) where status = 1;
>> >>
>> >> Can you please give a tip how to rewrite the query with UNION clause?
>> >
>> > SELECT c0."id" FROM "campaign_jobs" AS c0
>> > WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2
>> > UNION SELECT c0."id" FROM "campaign_jobs" AS c0
>> > WHERE ((c0."status" = $3) AND (c0."failed_at" > $4))
>> > UNION SELECT c0."id" FROM "campaign_jobs" AS c0
>> > WHERE ((c0."status" = $5) AND (c0."started_at" < $6))
>> > ORDER BY c0."priority" DESC, c0."times_failed"
>> > LIMIT $7
>> > FOR UPDATE SKIP LOCKED
>>
>>
>> Normally (at least for developers I've worked with), that kind of query
>> structure is used when the "status" values don't overlap and don't change
>> from query to query. Judging from Pavel's suggested conditional indexes
>> (i.e. "where status = "), he also thinks that is likely.
>>
>> Give the optimizer that information so that it can use it. Assuming $1 =
>> 0 and $3 = 2 and $5 = 1, substitute literals. Substitute literal for $7 in
>> limit. Push order by and limit to each branch of the union all (or does
>> Postgres figure that out automatically?) Replace union with union all (not
>> sure about Postgres, but allows other dbms to avoid sorting and merging
>> result sets to eliminate duplicates). (Use of UNION ALL assumes that "id"
>> is unique across rows as implied by only "id" being selected with FOR
>> UPDATE. If multiple rows can have the same "id", then use UNION to
>> eliminate the duplicates.)
>>
>> SELECT "id" FROM "campaign_jobs" WHERE "status" = 0 AND NOT "id" = ANY($1)
>> UNION ALL
>> SELECT "id" FROM "campaign_jobs" WHERE