On 11 November 2015 at 19:26, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Nov 11, 2015 at 12:59 PM, Pavel Stehule <pavel.steh...@gmail.com> 
> wrote:
>> I have a first query
>>
>> I looked at the EXPLAIN ANALYZE output and the numbers of filtered rows are
>> different
>
> Hmm, I see I was right about people finding more bugs once this was
> committed.  That didn't take long.
>
> There's supposed to be code to handle this - see the
> SharedPlanStateInstrumentation stuff in execParallel.c - but it's
> evidently a few bricks shy of a load.
> ExecParallelReportInstrumentation is supposed to transfer the counts
> from each worker to the DSM:
>
>         ps_instrument = &instrumentation->ps_instrument[i];
>         SpinLockAcquire(&ps_instrument->mutex);
>         InstrAggNode(&ps_instrument->instr, planstate->instrument);
>         SpinLockRelease(&ps_instrument->mutex);
>
> And ExecParallelRetrieveInstrumentation is supposed to slurp those
> counts back into the leader's PlanState objects:
>
>         /* No need to acquire the spinlock here; workers have exited already. */
>         ps_instrument = &instrumentation->ps_instrument[i];
>         InstrAggNode(planstate->instrument, &ps_instrument->instr);
>
> This might be a race condition, or it might be just wrong logic.
> Could you test what happens if you insert something like a 1-second
> sleep in ExecParallelFinish just after the call to
> WaitForParallelWorkersToFinish()?  If that makes the results
> consistent, this is a race.  If it doesn't, something else is wrong:
> then it would be useful to know whether the workers are actually
> calling ExecParallelReportInstrumentation, and whether the leader is
> actually calling ExecParallelRetrieveInstrumentation, and if so
> whether they are doing it for the correct set of nodes.

Hmm.. I made the change, but clearly it's not sleeping properly
(I'm expecting a total runtime in excess of 1 second).
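For reference, this is roughly what I applied (sketched from memory, so
the exact hunk position in execParallel.c may differ):

--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -?,? +?,? @@ ExecParallelFinish(ParallelExecutorInfo *pei)
     /* First, wait for the workers to finish. */
     WaitForParallelWorkersToFinish(pei->pcxt);
+
+    /* XXX diagnostic only: pause for 1 second (pg_usleep takes
+     * microseconds) so a racing worker would have time to report. */
+    pg_usleep(1000000L);

The timings: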

max_parallel_degree = 4:

# explain (analyse, buffers, timing, verbose, costs) select count(*)
from js where content->'tags'->0->>'term' like 'design%' or
content->'tags'->0->>'term' like 'web%';

                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=49578.18..49578.19 rows=1 width=0) (actual time=797.518..797.518 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=174883 read=540
   ->  Gather  (cost=1000.00..49546.93 rows=12500 width=0) (actual time=0.245..784.959 rows=55151 loops=1)
         Output: content
         Number of Workers: 4
         Buffers: shared hit=174883 read=540
         ->  Parallel Seq Scan on public.js  (cost=0.00..47296.93 rows=12500 width=0) (actual time=0.019..6153.679 rows=94503 loops=1)
               Output: content
               Filter: (((((js.content -> 'tags'::text) -> 0) ->> 'term'::text) ~~ 'design%'::text) OR ((((js.content -> 'tags'::text) -> 0) ->> 'term'::text) ~~ 'web%'::text))
               Rows Removed by Filter: 2051330
               Buffers: shared hit=299224 read=907
 Planning time: 0.086 ms
 Execution time: 803.026 ms


max_parallel_degree = 0:

# explain (analyse, buffers, timing, verbose, costs) select count(*)
from js where content->'tags'->0->>'term' like 'design%' or
content->'tags'->0->>'term' like 'web%';

                                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=212867.43..212867.44 rows=1 width=0) (actual time=1278.717..1278.717 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=174671 read=572
   ->  Seq Scan on public.js  (cost=0.00..212836.18 rows=12500 width=0) (actual time=0.018..1272.030 rows=55151 loops=1)
         Output: content
         Filter: (((((js.content -> 'tags'::text) -> 0) ->> 'term'::text) ~~ 'design%'::text) OR ((((js.content -> 'tags'::text) -> 0) ->> 'term'::text) ~~ 'web%'::text))
         Rows Removed by Filter: 1197822
         Buffers: shared hit=174671 read=572
 Planning time: 0.064 ms
 Execution time: 1278.741 ms
(10 rows)

Time: 1279.145 ms


I did, however, notice that repeated runs of the query with
max_parallel_degree = 4 yield different counts of rows removed by
filter:

Run 1: 2051330
Run 2: 2081252
Run 3: 2065112
Run 4: 2022045
Run 5: 2025384
Run 6: 2059360
Run 7: 2079620
Run 8: 2058541
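
Every run scans the same table exactly once across the leader and the
workers, so the aggregated filter counter ought to come out identical
each time (the serial plan reports 1197822). As I understand it,
InstrAggNode() just sums the per-node counters into the destination;
conceptually something like this simplified sketch (my reading only --
the real function in src/backend/executor/instrument.c also merges
timings and buffer usage):

    #include "executor/instrument.h"

    /* Simplified sketch of InstrAggNode(): fold "add" into "dst". */
    void
    InstrAggNode(Instrumentation *dst, Instrumentation *add)
    {
        dst->ntuples += add->ntuples;       /* rows emitted by the node */
        dst->nloops += add->nloops;
        dst->nfiltered1 += add->nfiltered1; /* "Rows Removed by Filter" */
        dst->nfiltered2 += add->nfiltered2; /* rows removed by join filter */
    }

If that's right, a drifting nfiltered1 would mean the worker counts are
being merged inconsistently, rather than the scan itself misbehaving,
which would fit the race theory.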

-- 
Thom

