Re: [HACKERS] Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

Masahiko Sawada Sun, 19 Jun 2016 23:55:15 -0700

On Mon, Jun 20, 2016 at 3:42 PM, Amit Kapila <[email protected]> wrote:
> On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <[email protected]>
> wrote:
>>
>> Hi all,
>>
>> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
>> fine for parallel seq scan.
>>
>> postgres(1)=# explain analyze verbose select count(*) from
>> pgbench_accounts ;
>>                                                     QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
>> (actual time=2640.015..2640.015 rows=1 loops=1)
>>    Output: count(*)
>>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
>> time=2639.064..2640.002 rows=3 loops=1)
>>          Output: (PARTIAL count(*))
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
>> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>>                Output: PARTIAL count(*)
>>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>>                ->  Parallel Seq Scan on public.pgbench_accounts
>> (cost=0.00..205601.67 rows=4166667 width=0) (actual
>> time=0.042..1685.542 rows=3333333 loops=3)
>>                      Worker 0: actual time=0.033..1657.486 rows=3457968
>> loops=1
>>                      Worker 1: actual time=0.039..1702.979 rows=3741069
>> loops=1
>>  Planning time: 1.026 ms
>>  Execution time: 2640.225 ms
>> (15 rows)
>>
>> For example, the above result shows,
>> Parallel Seq Scan : actual rows = 3333333
>> worker 0               : actual rows = 3457968
>> worker 1               : actual rows = 3741069
>> Summation of these is 10532370, but actual total rows is 10000000.
>> I think that Parallel Seq Scan should show actual rows =
>> 10000000(total rows) or actual rows = 2800963(rows collected by
>> itself). (10000000 maybe better)
>>
>
> You have to read the rows at Parallel Seq Scan nodes as total count of rows,
> but you have to consider the loops parameter as well.
>


In following case, it look to me that no one collect the tuple.
But it's obviously incorrect, this query collects a tuple(aid = 10) actually.

postgres(1)=# explain analyze verbose select * from pgbench_accounts
where aid = 10;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97) (actual
time=0.541..2094.773 rows=1 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..216018.34 rows=0 width=97) (actual time=1390.109..2088.103
rows=0 loops=3)
         Output: aid, bid, abalance, filler
         Filter: (pgbench_accounts.aid = 10)
         Rows Removed by Filter: 3333333
         Worker 0: actual time=2082.681..2082.681 rows=0 loops=1
         Worker 1: actual time=2087.532..2087.532 rows=0 loops=1
 Planning time: 0.126 ms
 Execution time: 2095.564 ms
(12 rows)

How can we consider actual rows and nloops?

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

Reply via email to