That's an odd index choice by the optimizer given what is available. If I
remember correctly, the bitmap going lossy means more work_mem is needed.
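As a rough sketch, you could bump work_mem for just one session and re-run the
plan to see whether the lossy blocks disappear (the 256MB figure below is only
an illustrative guess, not a tuned recommendation):

```sql
-- Raise work_mem for this session only; 256MB is an illustrative
-- value -- size it to your actual memory budget.
SET work_mem = '256MB';

-- Re-run the query under EXPLAIN and check whether the
-- "Heap Blocks" line still reports a "lossy=..." count.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM subscriptions
WHERE project_id = 123
  AND trashed_at IS NULL
  AND NOT (tags @> '{en}');
```

If the lossy count drops to zero, the recheck work (and the 2.2 million rows
removed by it in the plan below) should shrink accordingly.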

It is interesting that skipping the where condition on the array only saves
about half a second. Is the array being toasted, or is it small enough to be
stored inline in the same file as the primary table?
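The catalogs can answer that directly; assuming the table and column are named
subscriptions and tags as in the plan below, something like:

```sql
-- Storage strategy for the array column: 'x' = extended (TOASTable,
-- compressible), 'e' = external, 'm' = main, 'p' = plain.
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'subscriptions'::regclass
  AND attname = 'tags';

-- Size of the table's TOAST relation; a size near zero means the
-- arrays are being stored inline in the main heap.
SELECT pg_size_pretty(pg_relation_size(reltoastrelid))
FROM pg_class
WHERE relname = 'subscriptions'
  AND reltoastrelid <> 0;
```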

What is the result for this count query? Is it roughly 4 million?
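For reference, reconstructing the count from the conditions in the plan below
(the names come from the plan, so treat this as an approximation of the
original query):

```sql
SELECT COUNT(*)
FROM subscriptions
WHERE project_id = 123
  AND trashed_at IS NULL;
```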


On Tue, Nov 12, 2019, 1:06 PM Marco Colli <collimarc...@gmail.com> wrote:

> 1) It is running on a DigitalOcean CPU-optimized droplet with dedicated
> hyperthreads (16 cores) and SSD.
> SHOW random_page_cost; => 2
>
> 2) What config names should I check exactly? I used some suggestions from
> the online PGTune, when I first configured the db some months ago:
> max_worker_processes = 16
> max_parallel_workers_per_gather = 8
> max_parallel_workers = 16
>
> 3) Here's the query plan that I get after disabling the seq scan:
>
>
>               QUERY PLAN
>
>
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual
> time=94972.253..94972.254 rows=1 loops=1)
>
>    ->  Gather  (cost=2183938.16..2183938.87 rows=7 width=8) (actual
> time=94952.895..95132.626 rows=8 loops=1)
>
>          Workers Planned: 7
>
>          Workers Launched: 7
>
>          ->  Partial Aggregate  (cost=2182938.16..2182938.17 rows=1
> width=8) (actual time=94950.958..94950.958 rows=1 loops=8)
>
>                ->  Parallel Bitmap Heap Scan on subscriptions
> (cost=50294.50..2180801.47 rows=854677 width=0) (actual
> time=1831.342..94895.208 rows=611828 loops=8)
>
>                      Recheck Cond: ((project_id = 123) AND (trashed_at IS
> NULL))
>
>                      Rows Removed by Index Recheck: 2217924
>
>                      Filter: (NOT (tags @> '{en}'::character varying[]))
>
>                      Rows Removed by Filter: 288545
>
>                      Heap Blocks: exact=120301 lossy=134269
>
>                      ->  Bitmap Index Scan on
> index_subscriptions_on_project_id_and_tags  (cost=0.00..48798.81
> rows=6518094 width=0) (actual time=1493.823..1493.823 rows=7203173 loops=1)
>
>                            Index Cond: (project_id = 123)
>
>  Planning Time: 1.273 ms
>
>  Execution Time: 95132.766 ms
>
> (15 rows)
>
>
> On Tue, Nov 12, 2019 at 8:20 PM Michael Lewis <mle...@entrata.com> wrote:
>
>> It is very interesting to me that the optimizer chose a parallel
>> sequential scan rather than an index scan on either of your indexes that
>> start with project_id and also reference trashed_at.
>>
>> 1) Are you running on SSD type storage? Has random_page_cost been lowered
>> to 1-1.5 or so (close to 1 assumes good cache hits)?
>> 2) It seems you have increased parallel workers. Have you also changed
>> the startup or other cost configs related to how inclined the system is to
>> use sequential scans?
>> 3) If you disable sequential scan, what does the plan look like for this
>> query? (SET ENABLE_SEQSCAN TO OFF;)
>>