Re: Erratically behaving query needs optimization

2019-09-03 Thread Barbu Paul - Gheorghe
On Tue, Sep 3, 2019 at 12:57 AM Jeff Janes  wrote:
>
> On Mon, Aug 26, 2019 at 4:26 AM Barbu Paul - Gheorghe 
>  wrote:
>>
>> On Sun, Aug 25, 2019 at 5:51 PM Jeff Janes  wrote:
>> >
>> > Yes, it certainly looks like it is due to cold caches.  But you say it is 
>> > slow at first, and then say it varies greatly during a run.  Is being slow 
>> > at first the only way it varies greatly, or is there large variation even 
>> > beyond that?
>>
>> There is great variation in run times (hundreds of ms to several
>> seconds) even beyond server startup.
>> The query runs several times, each with a different device_id, object_id
>> and list of attribute_ids, and run times vary from one parameterization
>> to another.
>
>
> If you run the exact same query (with the same parameters) once the cache is 
> hot, is the performance then pretty consistent within a given 
> parameterization?  Or is it still variable even within one parameterization?
>
> If they are consistent, could you capture a fast parameterization and a slow 
> parameterization and show them and the plans for them?

I cannot test right now, but I think I had both cases.
Within the same parameterization I had both fast and slow runs, and of
course it varied when I changed the parameterization.

>>
>> EXPLAIN (ANALYZE,BUFFERS)
>>  SELECT DISTINCT ON (results.attribute_id) results.timestamp,
>> results.data FROM results
>>  WHERE
>>  results.data <> ''
>>  AND results.data IS NOT NULL
>>  AND results.object_id = 1955
>>  AND results.attribute_id IN (4, 5) -- possibly a longer list here
>>  AND results.data_access_result = 'SUCCESS'
>>  ORDER BY results.attribute_id, results.timestamp DESC
>>  LIMIT 2 -- limit by the length of the attributes list
>>
>> Limit  (cost=166793.28..167335.52 rows=2 width=54) (actual
>> time=134783.510..134816.941 rows=2 loops=1)
>>   Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
>>   ->  Unique  (cost=166793.28..168420.01 rows=6 width=54) (actual
>> time=134783.507..134816.850 rows=2 loops=1)
>> Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
>> ->  Sort  (cost=166793.28..167606.64 rows=325346 width=54)
>> (actual time=134783.505..134802.602 rows=205380 loops=1)
>>   Sort Key: attribute_id, "timestamp" DESC
>
>
> Do you have an index on (attribute_id, "timestamp" DESC)?  That might really 
> help if it can step through the rows already sorted, filter out the ones that 
> need filtering out (building the partial index might help here), hit the 
> other two tables for each of those rows using a nested loop, and stop after 2 
> rows which meet those conditions.  The problem is if you have to step through 
> an enormous number of rows before finding 2 of them with device_id=97.

I tried that index but it wasn't used; the planner still chose an
in-memory quicksort of ~600 kB. I wonder why?
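
One way I could test whether the planner is even able to use such an
index (a diagnostic sketch only; the setting is session-local and
undone by the ROLLBACK):

BEGIN;
SET LOCAL enable_sort = off;  -- discourage the quicksort so index paths show up
EXPLAIN (ANALYZE, BUFFERS)
 SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
 FROM results
 WHERE results.object_id = 1955
   AND results.attribute_id IN (4, 5)
 ORDER BY results.attribute_id, results.timestamp DESC
 LIMIT 2;
ROLLBACK;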

>>
>> So maybe I should de-normalize and place the device_id column into the
>> "results" table and add it to the index in your suggestion above?
>
>
> Yes, if nothing else works, that should.  How hard would it be to maintain 
> that column in the correct state?

In the end I used this solution. It works ... fine: I still see slow
response times when the caches are cold, but afterwards things seem to
be fine (for now at least).
I had this in mind for a while, but wasn't convinced it was "good
design" since it meant denormalizing the DB; seeing the erratic
behaviour of the query, I finally gave up on trying to satisfy the
planner with clever indexes alone.
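
Roughly, the change looked like this (a sketch; the exact column and
index names here are approximations, not copied from the schema):

ALTER TABLE results ADD COLUMN device_id integer;

UPDATE results r
   SET device_id = ste.device_id
  FROM scheduler_operation_executions soe
  JOIN scheduler_task_executions ste ON ste.id = soe.task_execution_id
 WHERE soe.id = r.operation_execution_id;

CREATE INDEX ON results (device_id, attribute_id, "timestamp" DESC);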

It's also the first time I've done this outside of a controlled
learning environment, so there could be things I missed.

Thanks for the help, all of you!

> Cheers,
>
> Jeff



-- 

Barbu Paul - Gheorghe




Re: Erratically behaving query needs optimization

2019-09-02 Thread Jeff Janes
On Mon, Aug 26, 2019 at 4:26 AM Barbu Paul - Gheorghe <
barbu.paul.gheor...@gmail.com> wrote:

> On Sun, Aug 25, 2019 at 5:51 PM Jeff Janes  wrote:
> >
> > Yes, it certainly looks like it is due to cold caches.  But you say it
> is slow at first, and then say it varies greatly during a run.  Is being
> slow at first the only way it varies greatly, or is there large variation
> even beyond that?
>
> There is great variation in run times (hundreds of ms to several
> seconds) even beyond server startup.
> The query runs several times, each with a different device_id, object_id
> and list of attribute_ids, and run times vary from one parameterization
> to another.
>

If you run the exact same query (with the same parameters) once the cache
is hot, is the performance then pretty consistent within a given
parameterization?  Or is it still variable even within one parameterization?

If they are consistent, could you capture a fast parameterization and a slow
parameterization and show them and the plans for them?


> > You can use pg_rewarm to overcome the cold cache issue when you first
> start up the server.
>
> I cannot find anything related to pg_rewarm other than some dead ends
> from 2013, from which I gather it only works on Linux.
> Anyway, I have problems even beyond the start of the database; it's
> just easier to reproduce the problem at startup, since otherwise I have
> to leave the application running for a while (to invalidate part of the
> cache, I think).
>

Sorry, should have been pg_prewarm, not pg_rewarm. Unfortunately, you
probably have two different problems.  Reproducing it one way is unlikely
to help you solve the other one.
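
For example, something along these lines would warm the big relations
after a restart (assuming the contrib extension can be installed; the
index name is taken from your plans):

CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- load the heap and the hot index into shared_buffers
SELECT pg_prewarm('results');
SELECT pg_prewarm('index_operation_execution_id_asc');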


> > If you query only on "results" with only the conditions that apply to
> "results", what is the expected number of rows, and what is the actual
> number of rows?
>
> Explain for the query on results only: https://explain.depesz.com/s/Csau


>
> EXPLAIN (ANALYZE,BUFFERS)
>  SELECT DISTINCT ON (results.attribute_id) results.timestamp,
> results.data FROM results
>  WHERE
>  results.data <> ''
>  AND results.data IS NOT NULL
>  AND results.object_id = 1955
>  AND results.attribute_id IN (4, 5) -- possibly a longer list here
>  AND results.data_access_result = 'SUCCESS'
>  ORDER BY results.attribute_id, results.timestamp DESC
>  LIMIT 2 -- limit by the length of the attributes list
>
> Limit  (cost=166793.28..167335.52 rows=2 width=54) (actual
> time=134783.510..134816.941 rows=2 loops=1)
>   Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
>   ->  Unique  (cost=166793.28..168420.01 rows=6 width=54) (actual
> time=134783.507..134816.850 rows=2 loops=1)
> Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
> ->  Sort  (cost=166793.28..167606.64 rows=325346 width=54)
> (actual time=134783.505..134802.602 rows=205380 loops=1)
>   Sort Key: attribute_id, "timestamp" DESC
>

Do you have an index on (attribute_id, "timestamp" DESC)?  That might
really help if it can step through the rows already sorted, filter out the
ones that need filtering out (building the partial index might help here),
hit the other two tables for each of those rows using a nested loop, and
stop after 2 rows which meet those conditions.  The problem is if you have
to step through an enormous number of rows before finding 2 of them with
device_id=97.
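
Something along these lines, perhaps with a partial predicate mirroring
the stable parts of your WHERE clause (a sketch; adjust to your schema):

CREATE INDEX ON results (attribute_id, "timestamp" DESC)
 WHERE data IS NOT NULL
   AND data <> ''
   AND data_access_result = 'SUCCESS';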


> So maybe I should de-normalize and place the device_id column into the
> "results" table and add it to the index in your suggestion above?
>

Yes, if nothing else works, that should.  How hard would it be to maintain
that column in the correct state?
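
If the linkage never changes after insert, a trigger along these lines
might be all it takes (an untested sketch; table and column names taken
from your plans):

CREATE OR REPLACE FUNCTION results_set_device_id() RETURNS trigger AS $$
BEGIN
    -- copy the device_id from the task execution this result belongs to
    SELECT ste.device_id INTO NEW.device_id
      FROM scheduler_operation_executions soe
      JOIN scheduler_task_executions ste ON ste.id = soe.task_execution_id
     WHERE soe.id = NEW.operation_execution_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER results_device_id_fill
    BEFORE INSERT ON results
    FOR EACH ROW EXECUTE FUNCTION results_set_device_id();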

Cheers,

Jeff


Re: Erratically behaving query needs optimization

2019-08-26 Thread Barbu Paul - Gheorghe
On Sun, Aug 25, 2019 at 5:51 PM Jeff Janes  wrote:
>
> Yes, it certainly looks like it is due to cold caches.  But you say it is 
> slow at first, and then say it varies greatly during a run.  Is being slow at 
> first the only way it varies greatly, or is there large variation even beyond 
> that?

There is great variation in run times (hundreds of ms to several
seconds) even beyond server startup.
The query runs several times, each with a different device_id, object_id
and list of attribute_ids, and run times vary from one parameterization
to another.

> You can use pg_rewarm to overcome the cold cache issue when you first start 
> up the server.

I cannot find anything related to pg_rewarm other than some dead ends
from 2013, from which I gather it only works on Linux.
Anyway, I have problems even beyond the start of the database; it's
just easier to reproduce the problem at startup, since otherwise I have
to leave the application running for a while (to invalidate part of the
cache, I think).

> If you query only on "results" with only the conditions that apply to 
> "results", what is the expected number of rows, and what is the actual number 
> of rows?

Explain for the query on results only: https://explain.depesz.com/s/Csau

EXPLAIN (ANALYZE,BUFFERS)
 SELECT DISTINCT ON (results.attribute_id) results.timestamp,
results.data FROM results
 WHERE
 results.data <> ''
 AND results.data IS NOT NULL
 AND results.object_id = 1955
 AND results.attribute_id IN (4, 5) -- possibly a longer list here
 AND results.data_access_result = 'SUCCESS'
 ORDER BY results.attribute_id, results.timestamp DESC
 LIMIT 2 -- limit by the length of the attributes list

Limit  (cost=166793.28..167335.52 rows=2 width=54) (actual
time=134783.510..134816.941 rows=2 loops=1)
  Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
  ->  Unique  (cost=166793.28..168420.01 rows=6 width=54) (actual
time=134783.507..134816.850 rows=2 loops=1)
Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
->  Sort  (cost=166793.28..167606.64 rows=325346 width=54)
(actual time=134783.505..134802.602 rows=205380 loops=1)
  Sort Key: attribute_id, "timestamp" DESC
  Sort Method: external merge  Disk: 26456kB
  Buffers: shared hit=19086 read=46836, temp read=1522 written=3311
  ->  Gather  (cost=1000.00..125882.23 rows=325346
width=54) (actual time=32.325..133815.793 rows=410749 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=19086 read=46836
->  Parallel Seq Scan on results
(cost=0.00..92347.63 rows=135561 width=54) (actual
time=18.496..133871.888 rows=136916 loops=3)
  Filter: ((data IS NOT NULL) AND (data <>
''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
(object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
  Rows Removed by Filter: 920123
  Buffers: shared hit=19086 read=46836
Planning Time: 5.071 ms
Execution Time: 134874.687 ms

As far as I can see, the estimates are close to the real returned rows
in the "Parallel Seq Scan on results".
The numbers are similar (of course) if I turn off query parallelism.
Or should I VACUUM ANALYZE again?
I'm sure I ran it often enough.
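
One way to double-check when statistics were last refreshed (using the
standard statistics view):

SELECT relname, last_analyze, last_autoanalyze, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
 WHERE relname = 'results';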

>> How can I improve it to be consistently fast (is it possible to get to
>> several milliseconds?)?
>
>
> Maybe.  Depends on the answer to my previous question.
>
>>
>> What I don't really understand is why the nested loop has 3 loops
>> (three joined tables)?
>
>
> Each parallel execution counts as a loop.  There are 2 parallel workers, plus 
> the leader also participates, making three.
>
>>
>> And why does the first index scan indicate ~60k loops? And does it
>> really work? It doesn't seem to filter out any rows.
>
>
> The parallel hash join returns about 20,000 rows, but I think that that is 
> just for one of the three parallel executions, making about 60,000 in total.  
> I don't know why one of the nodes reports combined execution and the other 
> just a single worker.  Parallel queries are hard to understand.  When I want 
> to optimize a query that does parallel execution, I just turn off parallelism 
> ("set max_parallel_workers_per_gather TO 0;") at first to make it simpler to 
> understand.
>
> Apparently everything with device_id = 97 just happens to pass all the rest 
> of your filters.  If you need those filters to make sure you get the right 
> answer in all cases, then you need them.  A lifeboat isn't useless just 
> because your ship didn't happen to sink today.

The part with the workers makes sense, thanks.
As for the condition, I thought there was something more contrived going
on in the planner, but I failed to see it was that simple: there is
nothing to remove, since every row matches the condition.

>> Should I add an index only on (attribute_id, object_id)? And maybe
>> 

Re: Erratically behaving query needs optimization

2019-08-22 Thread legrand legrand
Hello,

1/ Accessing scheduler_task_executions by index with device_id = 97
seems OK.

2/ I don't understand why the join on
scheduler_task_executions.id=scheduler_operation_executions.task_execution_id
is done using a parallel hash join when a nested loop would be better
(given the number of rows involved).

Maybe the index on scheduler_operation_executions.task_execution_id,

"index_task_execution_id_desc" btree (task_execution_id DESC NULLS LAST)

is not usable, or bloated, or the DESC NULLS LAST ordering gets in the
way?
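
If the DESC NULLS LAST index is the culprit, a plain ascending index
could be worth testing (just a guess, not verified):

CREATE INDEX ON scheduler_operation_executions (task_execution_id);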


3/ The join with results.operation_execution_id by index seems OK.

Regards
PAscal







Re: Erratically behaving query needs optimization

2019-08-22 Thread Barbu Paul - Gheorghe
If I restart the PostgreSQL server, the performance is bad: several
seconds, up to one or two hundred seconds per query.
This is reflected in the "buffers read" indicator, which is >0 when
performance is bad, for the first "Index Scan using
index_operation_execution_id_asc on results".

Probably this explains the oscillations in running time as well?
The cache gets filled after the first run, hence the performance
improves; then, as the system runs, the cache gets dirty and performance
for this particular query degrades again.
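
That hypothesis could be checked with the pg_buffercache contrib
extension, e.g. (assuming it can be installed):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- how many shared buffers each relation currently occupies
SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 WHERE b.reldatabase = (SELECT oid FROM pg_database
                        WHERE datname = current_database())
 GROUP BY c.relname
 ORDER BY buffers DESC
 LIMIT 10;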

On Thu, Aug 22, 2019 at 3:22 PM Barbu Paul - Gheorghe
 wrote:
>
> with ORDER BY so I get the correct results (163 seconds):
> https://explain.depesz.com/s/j3o1
>
> Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
> time=163953.091..163954.621 rows=2 loops=1)
>   Buffers: shared hit=183080 read=103411
>   ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
> time=163953.081..163953.570 rows=4102 loops=1)
> Sort Key: results.attribute_id, results."timestamp" DESC
> Sort Method: quicksort  Memory: 641kB
> Buffers: shared hit=183080 read=103411
> ->  Nested Loop  (cost=132172.41..164243.74 rows=5999
> width=54) (actual time=3054.965..163928.686 rows=4102 loops=1)
>   Buffers: shared hit=183074 read=103411
>   ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
> width=4) (actual time=2484.449..2581.582 rows=59269 loops=1)
> Group Key: scheduler_operation_executions.id
> Buffers: shared hit=87 read=76529
> ->  Gather  (cost=44474.33..132020.01 rows=60790
> width=4) (actual time=312.503..2463.254 rows=59269 loops=1)
>   Workers Planned: 2
>   Workers Launched: 2
>   Buffers: shared hit=87 read=76529
>   ->  Parallel Hash Join
> (cost=43474.33..124941.01 rows=25329 width=4) (actual
> time=124.733..2279.986 rows=19756 loops=3)
> Hash Cond:
> (scheduler_operation_executions.task_execution_id =
> scheduler_task_executions.id)
> Buffers: shared hit=87 read=76529
> ->  Parallel Seq Scan on
> scheduler_operation_executions  (cost=0.00..74948.21 rows=2483221
> width=8) (actual time=0.126..1828.461 rows=1986887 loops=3)
>   Buffers: shared hit=2 read=50114
> ->  Parallel Hash
> (cost=43263.67..43263.67 rows=16853 width=4) (actual
> time=123.631..123.631 rows=13558 loops=3)
>   Buckets: 65536  Batches: 1
> Memory Usage: 2144kB
>   Buffers: shared hit=1 read=26415
>   ->  Parallel Bitmap Heap Scan on
> scheduler_task_executions  (cost=757.90..43263.67 rows=16853 width=4)
> (actual time=6.944..120.405 rows=13558 loops=3)
> Recheck Cond: (device_id = 97)
> Heap Blocks: exact=24124
> Buffers: shared hit=1 read=26415
> ->  Bitmap Index Scan on
> scheduler_task_executions_device_id_idx  (cost=0.00..747.79 rows=40448
> width=0) (actual time=13.378..13.378 rows=40675 loops=1)
>   Index Cond: (device_id = 97)
>   Buffers: shared read=114
>   ->  Index Scan using index_operation_execution_id_asc on
> results  (cost=0.43..0.51 rows=1 width=58) (actual time=2.720..2.720
> rows=0 loops=59269)
> Index Cond: (operation_execution_id =
> scheduler_operation_executions.id)
> Filter: ((data IS NOT NULL) AND (data <>
> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
> (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
> Rows Removed by Filter: 0
> Buffers: shared hit=182987 read=26882
> Planning Time: 349.908 ms
> Execution Time: 163962.314 ms
>
>
> With ORDER BY (on the second run, 0.6 seconds):
> https://explain.depesz.com/s/QZ1Z
> Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
> time=621.057..621.527 rows=2 loops=1)
>   Buffers: shared hit=236659 read=49826
>   ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
> time=621.056..621.188 rows=4102 loops=1)
> Sort Key: results.attribute_id, results."timestamp" DESC
> Sort Method: quicksort  Memory: 641kB
> Buffers: shared hit=236659 read=49826
> ->  Nested Loop  (cost=132172.41..164243.74 rows=5999
> width=54) (actual time=503.577..619.250 rows=4102 loops=1)
>   Buffers: shared hit=236659 read=49826
>   ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
> width=4) (actual time=503.498..513.551 rows=59269 

Re: Erratically behaving query needs optimization

2019-08-22 Thread Barbu Paul - Gheorghe
with ORDER BY so I get the correct results (163 seconds):
https://explain.depesz.com/s/j3o1

Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
time=163953.091..163954.621 rows=2 loops=1)
  Buffers: shared hit=183080 read=103411
  ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
time=163953.081..163953.570 rows=4102 loops=1)
Sort Key: results.attribute_id, results."timestamp" DESC
Sort Method: quicksort  Memory: 641kB
Buffers: shared hit=183080 read=103411
->  Nested Loop  (cost=132172.41..164243.74 rows=5999
width=54) (actual time=3054.965..163928.686 rows=4102 loops=1)
  Buffers: shared hit=183074 read=103411
  ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
width=4) (actual time=2484.449..2581.582 rows=59269 loops=1)
Group Key: scheduler_operation_executions.id
Buffers: shared hit=87 read=76529
->  Gather  (cost=44474.33..132020.01 rows=60790
width=4) (actual time=312.503..2463.254 rows=59269 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=87 read=76529
  ->  Parallel Hash Join
(cost=43474.33..124941.01 rows=25329 width=4) (actual
time=124.733..2279.986 rows=19756 loops=3)
Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
Buffers: shared hit=87 read=76529
->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74948.21 rows=2483221
width=8) (actual time=0.126..1828.461 rows=1986887 loops=3)
  Buffers: shared hit=2 read=50114
->  Parallel Hash
(cost=43263.67..43263.67 rows=16853 width=4) (actual
time=123.631..123.631 rows=13558 loops=3)
  Buckets: 65536  Batches: 1
Memory Usage: 2144kB
  Buffers: shared hit=1 read=26415
  ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=757.90..43263.67 rows=16853 width=4)
(actual time=6.944..120.405 rows=13558 loops=3)
Recheck Cond: (device_id = 97)
Heap Blocks: exact=24124
Buffers: shared hit=1 read=26415
->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..747.79 rows=40448
width=0) (actual time=13.378..13.378 rows=40675 loops=1)
  Index Cond: (device_id = 97)
  Buffers: shared read=114
  ->  Index Scan using index_operation_execution_id_asc on
results  (cost=0.43..0.51 rows=1 width=58) (actual time=2.720..2.720
rows=0 loops=59269)
Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
Filter: ((data IS NOT NULL) AND (data <>
''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND
(object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
Rows Removed by Filter: 0
Buffers: shared hit=182987 read=26882
Planning Time: 349.908 ms
Execution Time: 163962.314 ms


With ORDER BY (on the second run, 0.6 seconds):
https://explain.depesz.com/s/QZ1Z
Unique  (cost=164620.19..164650.19 rows=4 width=54) (actual
time=621.057..621.527 rows=2 loops=1)
  Buffers: shared hit=236659 read=49826
  ->  Sort  (cost=164620.19..164635.19 rows=5999 width=54) (actual
time=621.056..621.188 rows=4102 loops=1)
Sort Key: results.attribute_id, results."timestamp" DESC
Sort Method: quicksort  Memory: 641kB
Buffers: shared hit=236659 read=49826
->  Nested Loop  (cost=132172.41..164243.74 rows=5999
width=54) (actual time=503.577..619.250 rows=4102 loops=1)
  Buffers: shared hit=236659 read=49826
  ->  HashAggregate  (cost=132171.98..132779.88 rows=60790
width=4) (actual time=503.498..513.551 rows=59269 loops=1)
Group Key: scheduler_operation_executions.id
Buffers: shared hit=26790 read=49826
->  Gather  (cost=44474.33..132020.01 rows=60790
width=4) (actual time=65.499..489.396 rows=59269 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=26790 read=49826
  ->  Parallel Hash Join
(cost=43474.33..124941.01 rows=25329 width=4) (actual
time=22.059..441.847 rows=19756 loops=3)
Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
Buffers: shared hit=26790 

Re: Erratically behaving query needs optimization

2019-08-22 Thread Barbu Paul - Gheorghe
That query, if I add the ORDER BY and LIMIT, returns the same results.

The problem is that its speed behaves the same way as the original
query with the index you suggested.
Sometimes it takes 800 ms, sometimes 6 s to run; how the hell can I get
it to behave the same every time?
After I added the index you suggested, it was fine for a while; the
next morning the run time exploded back to several seconds per query...
and it keeps oscillating.

On Wed, Aug 21, 2019 at 2:25 PM Luís Roberto Weck
 wrote:
>
> On 21/08/2019 04:30, Barbu Paul - Gheorghe wrote:
> > I wonder how I missed that... probably because of the "WHERE" clause
> > in what I already had.
> >
> > I indexed by scheduler_task_executions.device_id and the new plan is
> > as follows: https://explain.depesz.com/s/cQRq
> >
> > Can it be further improved?
> >
> > Limit  (cost=138511.45..138519.36 rows=2 width=54) (actual
> > time=598.703..618.524 rows=2 loops=1)
> >Buffers: shared hit=242389 read=44098
> >->  Unique  (cost=138511.45..138527.26 rows=4 width=54) (actual
> > time=598.701..598.878 rows=2 loops=1)
> >  Buffers: shared hit=242389 read=44098
> >  ->  Sort  (cost=138511.45..138519.36 rows=3162 width=54)
> > (actual time=598.699..598.767 rows=2052 loops=1)
> >Sort Key: results.attribute_id, results."timestamp" DESC
> >Sort Method: quicksort  Memory: 641kB
> >Buffers: shared hit=242389 read=44098
> >->  Gather  (cost=44082.11..138327.64 rows=3162
> > width=54) (actual time=117.548..616.456 rows=4102 loops=1)
> >  Workers Planned: 2
> >  Workers Launched: 2
> >  Buffers: shared hit=242389 read=44098
> >  ->  Nested Loop  (cost=43082.11..137011.44
> > rows=1318 width=54) (actual time=47.436..525.664 rows=1367 loops=3)
> >Buffers: shared hit=242389 read=44098
> >->  Parallel Hash Join
> > (cost=43081.68..124545.34 rows=24085 width=4) (actual
> > time=33.099..469.958 rows=19756 loops=3)
> >  Hash Cond:
> > (scheduler_operation_executions.task_execution_id =
> > scheduler_task_executions.id)
> >  Buffers: shared hit=32518 read=44098
> >  ->  Parallel Seq Scan on
> > scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
> > width=8) (actual time=8.493..245.190 rows=1986887 loops=3)
> >Buffers: shared hit=6018 read=44098
> >  ->  Parallel Hash
> > (cost=42881.33..42881.33 rows=16028 width=4) (actual
> > time=23.272..23.272 rows=13558 loops=3)
> >Buckets: 65536  Batches: 1
> > Memory Usage: 2112kB
> >Buffers: shared hit=26416
> >->  Parallel Bitmap Heap Scan on
> > scheduler_task_executions  (cost=722.55..42881.33 rows=16028 width=4)
> > (actual time=27.290..61.563 rows=40675 loops=1)
> >  Recheck Cond: (device_id = 97)
> >  Heap Blocks: exact=26302
> >  Buffers: shared hit=26416
> >  ->  Bitmap Index Scan on
> > scheduler_task_executions_device_id_idx  (cost=0.00..712.93 rows=38467
> > width=0) (actual time=17.087..17.087 rows=40675 loops=1)
> >Index Cond: (device_id = 
> > 97)
> >Buffers: shared hit=114
> >->  Index Scan using
> > index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
> > width=58) (actual time=0.003..0.003 rows=0 loops=59269)
> >  Index Cond: (operation_execution_id =
> > scheduler_operation_executions.id)
> >  Filter: ((data IS NOT NULL) AND (data
> > <> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
> > AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
> >  Rows Removed by Filter: 0
> >  Buffers: shared hit=209871
> > Planning Time: 2.327 ms
> > Execution Time: 618.935 ms
> >
> > On Tue, Aug 20, 2019 at 5:54 PM Luís Roberto Weck
> >  wrote:
> >> On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:
> >>> Hello,
> >>> I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
> >>> 64-bit" and I have a query that runs several times per user action
> >>> (9-10 times).
> >>> The query takes a long time to execute, especially at first due to
> >>> cold caches I think, but the performance varies greatly during a run
> >>> of the application (while the user applies the said action several
> >>> times).
> >>>
> >>> My tables 

Re: Erratically behaving query needs optimization

2019-08-22 Thread Luís Roberto Weck

On 22/08/2019 08:51, Barbu Paul - Gheorghe wrote:

That query, if I add the ORDER BY and LIMIT, returns the same results.

The problem is that its speed behaves the same way as the original
query with the index you suggested.
Sometimes it takes 800 ms, sometimes 6 s to run; how the hell can I get
it to behave the same every time?
After I added the index you suggested, it was fine for a while; the
next morning the run time exploded back to several seconds per query...
and it keeps oscillating.

On Wed, Aug 21, 2019 at 2:25 PM Luís Roberto Weck
 wrote:

On 21/08/2019 04:30, Barbu Paul - Gheorghe wrote:

I wonder how I missed that... probably because of the "WHERE" clause
in what I already had.

I indexed by scheduler_task_executions.device_id and the new plan is
as follows: https://explain.depesz.com/s/cQRq

Can it be further improved?

Limit  (cost=138511.45..138519.36 rows=2 width=54) (actual
time=598.703..618.524 rows=2 loops=1)
Buffers: shared hit=242389 read=44098
->  Unique  (cost=138511.45..138527.26 rows=4 width=54) (actual
time=598.701..598.878 rows=2 loops=1)
  Buffers: shared hit=242389 read=44098
  ->  Sort  (cost=138511.45..138519.36 rows=3162 width=54)
(actual time=598.699..598.767 rows=2052 loops=1)
Sort Key: results.attribute_id, results."timestamp" DESC
Sort Method: quicksort  Memory: 641kB
Buffers: shared hit=242389 read=44098
->  Gather  (cost=44082.11..138327.64 rows=3162
width=54) (actual time=117.548..616.456 rows=4102 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=242389 read=44098
  ->  Nested Loop  (cost=43082.11..137011.44
rows=1318 width=54) (actual time=47.436..525.664 rows=1367 loops=3)
Buffers: shared hit=242389 read=44098
->  Parallel Hash Join
(cost=43081.68..124545.34 rows=24085 width=4) (actual
time=33.099..469.958 rows=19756 loops=3)
  Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
  Buffers: shared hit=32518 read=44098
  ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
width=8) (actual time=8.493..245.190 rows=1986887 loops=3)
Buffers: shared hit=6018 read=44098
  ->  Parallel Hash
(cost=42881.33..42881.33 rows=16028 width=4) (actual
time=23.272..23.272 rows=13558 loops=3)
Buckets: 65536  Batches: 1
Memory Usage: 2112kB
Buffers: shared hit=26416
->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=722.55..42881.33 rows=16028 width=4)
(actual time=27.290..61.563 rows=40675 loops=1)
  Recheck Cond: (device_id = 97)
  Heap Blocks: exact=26302
  Buffers: shared hit=26416
  ->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..712.93 rows=38467
width=0) (actual time=17.087..17.087 rows=40675 loops=1)
Index Cond: (device_id = 97)
Buffers: shared hit=114
->  Index Scan using
index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
width=58) (actual time=0.003..0.003 rows=0 loops=59269)
  Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
  Filter: ((data IS NOT NULL) AND (data
<> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
  Rows Removed by Filter: 0
  Buffers: shared hit=209871
Planning Time: 2.327 ms
Execution Time: 618.935 ms

On Tue, Aug 20, 2019 at 5:54 PM Luís Roberto Weck
 wrote:

On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,
I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).
The query takes a long time to execute, especially at first due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, not much DELETEs and even
less UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp,
results.data FROM results
   JOIN 

Re: Erratically behaving query needs optimization

2019-08-21 Thread Luís Roberto Weck

On 21/08/2019 04:30, Barbu Paul - Gheorghe wrote:

I wonder how I missed that... probably because of the "WHERE" clause
in what I already had.

I indexed by scheduler_task_executions.device_id and the new plan is
as follows: https://explain.depesz.com/s/cQRq

Can it be further improved?

Limit  (cost=138511.45..138519.36 rows=2 width=54) (actual
time=598.703..618.524 rows=2 loops=1)
   Buffers: shared hit=242389 read=44098
   ->  Unique  (cost=138511.45..138527.26 rows=4 width=54) (actual
time=598.701..598.878 rows=2 loops=1)
 Buffers: shared hit=242389 read=44098
 ->  Sort  (cost=138511.45..138519.36 rows=3162 width=54)
(actual time=598.699..598.767 rows=2052 loops=1)
   Sort Key: results.attribute_id, results."timestamp" DESC
   Sort Method: quicksort  Memory: 641kB
   Buffers: shared hit=242389 read=44098
   ->  Gather  (cost=44082.11..138327.64 rows=3162
width=54) (actual time=117.548..616.456 rows=4102 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 Buffers: shared hit=242389 read=44098
 ->  Nested Loop  (cost=43082.11..137011.44
rows=1318 width=54) (actual time=47.436..525.664 rows=1367 loops=3)
   Buffers: shared hit=242389 read=44098
   ->  Parallel Hash Join
(cost=43081.68..124545.34 rows=24085 width=4) (actual
time=33.099..469.958 rows=19756 loops=3)
 Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
 Buffers: shared hit=32518 read=44098
 ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
width=8) (actual time=8.493..245.190 rows=1986887 loops=3)
   Buffers: shared hit=6018 read=44098
 ->  Parallel Hash
(cost=42881.33..42881.33 rows=16028 width=4) (actual
time=23.272..23.272 rows=13558 loops=3)
   Buckets: 65536  Batches: 1
Memory Usage: 2112kB
   Buffers: shared hit=26416
   ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=722.55..42881.33 rows=16028 width=4)
(actual time=27.290..61.563 rows=40675 loops=1)
 Recheck Cond: (device_id = 97)
 Heap Blocks: exact=26302
 Buffers: shared hit=26416
 ->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..712.93 rows=38467
width=0) (actual time=17.087..17.087 rows=40675 loops=1)
   Index Cond: (device_id = 97)
   Buffers: shared hit=114
   ->  Index Scan using
index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
width=58) (actual time=0.003..0.003 rows=0 loops=59269)
 Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
 Filter: ((data IS NOT NULL) AND (data
<> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
 Rows Removed by Filter: 0
 Buffers: shared hit=209871
Planning Time: 2.327 ms
Execution Time: 618.935 ms

On Tue, Aug 20, 2019 at 5:54 PM Luís Roberto Weck
 wrote:

On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,
I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).
The query takes a long time to execute, especially at first due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, not much DELETEs and even
less UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp,
results.data FROM results
  JOIN scheduler_operation_executions ON
scheduler_operation_executions.id = results.operation_execution_id
  JOIN scheduler_task_executions ON scheduler_task_executions.id =
scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
  AND results.data <> ''
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5) -- possibly a longer list here
  AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest 

Re: Erratically behaving query needs optimization

2019-08-21 Thread Barbu Paul - Gheorghe
I wonder how I missed that... probably because of the "WHERE" clause
in what I already had.

I indexed by scheduler_task_executions.device_id and the new plan is
as follows: https://explain.depesz.com/s/cQRq

Can it be further improved?

Limit  (cost=138511.45..138519.36 rows=2 width=54) (actual
time=598.703..618.524 rows=2 loops=1)
  Buffers: shared hit=242389 read=44098
  ->  Unique  (cost=138511.45..138527.26 rows=4 width=54) (actual
time=598.701..598.878 rows=2 loops=1)
Buffers: shared hit=242389 read=44098
->  Sort  (cost=138511.45..138519.36 rows=3162 width=54)
(actual time=598.699..598.767 rows=2052 loops=1)
  Sort Key: results.attribute_id, results."timestamp" DESC
  Sort Method: quicksort  Memory: 641kB
  Buffers: shared hit=242389 read=44098
  ->  Gather  (cost=44082.11..138327.64 rows=3162
width=54) (actual time=117.548..616.456 rows=4102 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=242389 read=44098
->  Nested Loop  (cost=43082.11..137011.44
rows=1318 width=54) (actual time=47.436..525.664 rows=1367 loops=3)
  Buffers: shared hit=242389 read=44098
  ->  Parallel Hash Join
(cost=43081.68..124545.34 rows=24085 width=4) (actual
time=33.099..469.958 rows=19756 loops=3)
Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
Buffers: shared hit=32518 read=44098
->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
width=8) (actual time=8.493..245.190 rows=1986887 loops=3)
  Buffers: shared hit=6018 read=44098
->  Parallel Hash
(cost=42881.33..42881.33 rows=16028 width=4) (actual
time=23.272..23.272 rows=13558 loops=3)
  Buckets: 65536  Batches: 1
Memory Usage: 2112kB
  Buffers: shared hit=26416
  ->  Parallel Bitmap Heap Scan on
scheduler_task_executions  (cost=722.55..42881.33 rows=16028 width=4)
(actual time=27.290..61.563 rows=40675 loops=1)
Recheck Cond: (device_id = 97)
Heap Blocks: exact=26302
Buffers: shared hit=26416
->  Bitmap Index Scan on
scheduler_task_executions_device_id_idx  (cost=0.00..712.93 rows=38467
width=0) (actual time=17.087..17.087 rows=40675 loops=1)
  Index Cond: (device_id = 97)
  Buffers: shared hit=114
  ->  Index Scan using
index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
width=58) (actual time=0.003..0.003 rows=0 loops=59269)
Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
Filter: ((data IS NOT NULL) AND (data
<> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
Rows Removed by Filter: 0
Buffers: shared hit=209871
Planning Time: 2.327 ms
Execution Time: 618.935 ms

On Tue, Aug 20, 2019 at 5:54 PM Luís Roberto Weck
 wrote:
>
> > On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:
> > Hello,
> > I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
> > 64-bit" and I have a query that runs several times per user action
> > (9-10 times).
> > The query takes a long time to execute, especially at first due to
> > cold caches I think, but the performance varies greatly during a run
> > of the application (while the user applies the said action several
> > times).
> >
> > My tables are only getting bigger with time, not much DELETEs and even
> > less UPDATEs as far as I can tell.
> >
> > Problematic query:
> >
> > EXPLAIN (ANALYZE,BUFFERS)
> > SELECT DISTINCT ON (results.attribute_id) results.timestamp,
> > results.data FROM results
> >  JOIN scheduler_operation_executions ON
> > scheduler_operation_executions.id = results.operation_execution_id
> >  JOIN scheduler_task_executions ON scheduler_task_executions.id =
> > scheduler_operation_executions.task_execution_id
> > WHERE scheduler_task_executions.device_id = 97
> >  AND results.data <> ''
> >  AND results.data IS NOT NULL
> >  AND results.object_id = 1955
> >  AND results.attribute_id IN (4, 5) -- possibly a longer list here
> >  AND results.data_access_result = 'SUCCESS'
> > ORDER BY results.attribute_id, results.timestamp DESC
> > LIMIT 2 -- limit by the length of the attributes list
> >
> > In 

Re: Erratically behaving query needs optimization

2019-08-20 Thread MichaelDBA
Yes, adding another index might help reduce the number of rows filtered:
--> Rows Removed by Filter: 1308337


Also, make sure you run VACUUM ANALYZE on the tables involved in this query.
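
For instance (the index column matches the filter shown in the plan
above; a suggestion only, adjust to your schema):

CREATE INDEX ON scheduler_task_executions (device_id);
VACUUM ANALYZE scheduler_task_executions;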

Regards,
Michael Vitale

Luís Roberto Weck wrote on 8/20/2019 10:58 AM:

On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,
I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).
The query takes a long time to execute, especially at first due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, not much DELETEs and even
less UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp,
results.data FROM results
 JOIN scheduler_operation_executions ON
scheduler_operation_executions.id = results.operation_execution_id
 JOIN scheduler_task_executions ON scheduler_task_executions.id =
scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
 AND results.data <> ''
 AND results.data IS NOT NULL
 AND results.object_id = 1955
 AND results.attribute_id IN (4, 5) -- possibly a longer list here
 AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest (ORDER BY results.timestamp DESC) results
of a device (scheduler_task_executions.device_id = 97 - hence the
joins results -> scheduler_operation_executions ->
scheduler_task_executions)
for a given object and attributes with some additional constraints on
the data column. But I only want the latest attributes for which we
have results, hence the DISTINCT ON (results.attribute_id) and LIMIT.

First run: https://explain.depesz.com/s/qh4C
Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
time=44068.166..44086.970 rows=2 loops=1)
   Buffers: shared hit=215928 read=85139
   ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
time=44068.164..44069.301 rows=2 loops=1)
 Buffers: shared hit=215928 read=85139
 ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
(actual time=44068.161..44068.464 rows=2052 loops=1)
   Sort Key: results.attribute_id, results."timestamp" DESC
   Sort Method: quicksort  Memory: 641kB
   Buffers: shared hit=215928 read=85139
   ->  Gather  (cost=62853.04..157098.57 rows=3162
width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 Buffers: shared hit=215928 read=85139
 ->  Nested Loop  (cost=61853.04..155782.37
rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367
loops=3)
   Buffers: shared hit=215928 read=85139
   ->  Parallel Hash Join
(cost=61852.61..143316.27 rows=24085 width=4) (actual
time=23271.275..40018.451 rows=19756 loops=3)
 Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
 Buffers: shared hit=6057 read=85139
 ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
   Buffers: shared hit=2996 read=47120
 ->  Parallel Hash
(cost=61652.25..61652.25 rows=16029 width=4) (actual
time=23253.337..23253.337 rows=13558 loops=3)
   Buckets: 65536  Batches: 1
Memory Usage: 2144kB
   Buffers: shared hit=2977 read=38019
   ->  Parallel Seq Scan on
scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
(actual time=25.939..23222.174 rows=13558 loops=3)
 Filter: (device_id = 97)
 Rows Removed by Filter: 1308337
 Buffers: shared hit=2977 read=38019
   ->  Index Scan using
index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
width=58) (actual time=0.191..0.191 rows=0 loops=59269)
 Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
 Filter: ((data IS NOT NULL) AND (data
<> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
 Rows Removed by Filter: 0
 Buffers: shared hit=209871
Planning Time: 29.295 ms
Execution 

Re: Erratically behaving query needs optimization

2019-08-20 Thread Luís Roberto Weck

On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,
I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).
The query takes a long time to execute, especially at first due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, not much DELETEs and even
less UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp,
results.data FROM results
 JOIN scheduler_operation_executions ON
scheduler_operation_executions.id = results.operation_execution_id
 JOIN scheduler_task_executions ON scheduler_task_executions.id =
scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
 AND results.data <> ''
 AND results.data IS NOT NULL
 AND results.object_id = 1955
 AND results.attribute_id IN (4, 5) -- possibly a longer list here
 AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest (ORDER BY results.timestamp DESC) results
of a device (scheduler_task_executions.device_id = 97 - hence the
joins results -> scheduler_operation_executions ->
scheduler_task_executions)
for a given object and attributes with some additional constraints on
the data column. But I only want the latest attributes for which we
have results, hence the DISTINCT ON (results.attribute_id) and LIMIT.

First run: https://explain.depesz.com/s/qh4C
Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
time=44068.166..44086.970 rows=2 loops=1)
   Buffers: shared hit=215928 read=85139
   ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual
time=44068.164..44069.301 rows=2 loops=1)
 Buffers: shared hit=215928 read=85139
 ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54)
(actual time=44068.161..44068.464 rows=2052 loops=1)
   Sort Key: results.attribute_id, results."timestamp" DESC
   Sort Method: quicksort  Memory: 641kB
   Buffers: shared hit=215928 read=85139
   ->  Gather  (cost=62853.04..157098.57 rows=3162
width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 Buffers: shared hit=215928 read=85139
 ->  Nested Loop  (cost=61853.04..155782.37
rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367
loops=3)
   Buffers: shared hit=215928 read=85139
   ->  Parallel Hash Join
(cost=61852.61..143316.27 rows=24085 width=4) (actual
time=23271.275..40018.451 rows=19756 loops=3)
 Hash Cond:
(scheduler_operation_executions.task_execution_id =
scheduler_task_executions.id)
 Buffers: shared hit=6057 read=85139
 ->  Parallel Seq Scan on
scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982
width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
   Buffers: shared hit=2996 read=47120
 ->  Parallel Hash
(cost=61652.25..61652.25 rows=16029 width=4) (actual
time=23253.337..23253.337 rows=13558 loops=3)
   Buckets: 65536  Batches: 1
Memory Usage: 2144kB
   Buffers: shared hit=2977 read=38019
   ->  Parallel Seq Scan on
scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4)
(actual time=25.939..23222.174 rows=13558 loops=3)
 Filter: (device_id = 97)
 Rows Removed by Filter: 1308337
 Buffers: shared hit=2977 read=38019
   ->  Index Scan using
index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1
width=58) (actual time=0.191..0.191 rows=0 loops=59269)
 Index Cond: (operation_execution_id =
scheduler_operation_executions.id)
 Filter: ((data IS NOT NULL) AND (data
<> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[]))
AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
 Rows Removed by Filter: 0
 Buffers: shared hit=209871
Planning Time: 29.295 ms
Execution Time: 44087.365 ms


Second run: https://explain.depesz.com/s/uy9f
Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual
time=789.363..810.440 rows=2 loops=1)
   Buffers: shared hit=216312 read=84755
   ->  Unique