On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote: > > Hi all, > > My colleague noticed that the output of EXPLAIN ANALYZE doesn't work > fine for parallel seq scan. > > postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------- > Finalize Aggregate (cost=217018.55..217018.56 rows=1 width=8) > (actual time=2640.015..2640.015 rows=1 loops=1) > Output: count(*) > -> Gather (cost=217018.33..217018.54 rows=2 width=8) (actual > time=2639.064..2640.002 rows=3 loops=1) > Output: (PARTIAL count(*)) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate (cost=216018.33..216018.34 rows=1 > width=8) (actual time=2632.714..2632.715 rows=1 loops=3) > Output: PARTIAL count(*) > Worker 0: actual time=2632.583..2632.584 rows=1 loops=1 > Worker 1: actual time=2627.517..2627.517 rows=1 loops=1 > -> Parallel Seq Scan on public.pgbench_accounts > (cost=0.00..205601.67 rows=4166667 width=0) (actual > time=0.042..1685.542 rows=3333333 loops=3) > Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1 > Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1 > Planning time: 1.026 ms > Execution time: 2640.225 ms > (15 rows) > > For example, the above result shows, > Parallel Seq Scan : actual rows = 3333333 > worker 0 : actual rows = 3457968 > worker 1 : actual rows = 3741069 > Summation of these is 10532370, but actual total rows is 10000000. > I think that Parallel Seq Scan should show actual rows = > 10000000(total rows) or actual rows = 2800963(rows collected by > itself). (10000000 maybe better) >
You have to read the rows at Parallel Seq Scan nodes as total count of rows, but you have to consider the loops parameter as well. > > After spent time to investigate this behaviour, ISTM that the problem > is nloops of Parallel Seq Scan. > Parallel Seq Scan is done only once, but nloops is incremented to 3. > nloops here indicates, that it is done for 2 workers and a master backend. > So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223. > Thats how it should be considered. You might want to compare the behaviour with other cases where value of nloops is used. > Is it a bug? > I don't think so. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com