On 2017-05-05 14:20:48 +1200, David Rowley wrote:
> Yeah, I did get some time to look over the contention in Parallel Seq
> Scan a while back and I discovered that on the machine that I was
> testing on, the lock obtained in heap_parallelscan_nextpage() was
> causing workers to have to wait for other workers to fetch their next
> task to work on.

Oh, if it's "just" that, it should be easy enough to address.  Two options:
1) use atomic ops for increment, modulo afterwards to deal with
   wraparound in the synchronous scan
2) batching
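
To illustrate option 1, here is a minimal standalone sketch using C11
atomics rather than PostgreSQL's own atomics API. The struct, its field
names, and sketch_nextpage() are hypothetical stand-ins and not the
actual heap_parallelscan_nextpage() code. Each worker claims a page
with a single fetch-and-add, and the modulo is applied afterwards so a
scan that starts mid-relation wraps back to block 0:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared parallel scan state. */
typedef struct
{
    uint32_t         nblocks;    /* relation size in pages */
    uint32_t         startblock; /* block where the synchronized scan starts */
    _Atomic uint64_t nallocated; /* number of pages handed out so far */
} SketchScanState;

#define SKETCH_INVALID_BLOCK UINT32_MAX

/*
 * Claim the next page without taking a lock.  The 64-bit counter only
 * ever grows, so workers that keep calling this after the scan is
 * exhausted cannot make it wrap in practice.
 */
static uint32_t
sketch_nextpage(SketchScanState *st)
{
    uint64_t n;

    assert(st->nblocks > 0);

    n = atomic_fetch_add(&st->nallocated, 1);
    if (n >= st->nblocks)
        return SKETCH_INVALID_BLOCK;    /* every page already claimed */

    /* Modulo after the increment handles the wraparound. */
    return (uint32_t) ((st->startblock + n) % st->nblocks);
}
```

Whether this removes enough of the contention on its own would still
need to be measured.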

> I ended up writing the attached (which I'd not intended to post until
> some time closer to when the doors open for PG11). At the moment it's
> basically just a test patch to see how it affects things when we give
> workers a bit more to do before they come back to look for more work.
> In this case, I've just given them 10 pages to work on, instead of the
> 1 that's allocated in 9.6 and v10.


> A quick test on a pretty large table on a large machine shows:
> Unpatched:
> postgres=# select count(*) from a;
>    count
> ------------
>  1874000000
> (1 row)
> Time: 5211.485 ms (00:05.211)
> Patched:
> postgres=# select count(*) from a;
>    count
> ------------
>  1874000000
> (1 row)
> Time: 2523.983 ms (00:02.524)


> I'd had thoughts that the 10 pages wouldn't be constant, but the
> batching size would depend on the size of the relation to be scanned.
> I'd rough ideas to just try to make about 1 million batches. Something
> like batch_pages = Max(parallel_scan->phs_nblocks / 1000000, 1); so
> that we only take more than 1 page if there's some decent amount to
> process. We don't want to make the batches too big as we might end up
> having to wait on slow workers at the end of a scan.
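
For reference, the sizing rule quoted above could be sketched in
standalone C roughly as follows. The function name and the
SKETCH_TARGET_BATCHES constant are made up for illustration; only the
formula itself comes from the quoted text:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed target from the quoted text: roughly 1 million batches. */
#define SKETCH_TARGET_BATCHES 1000000

/*
 * Pages per batch for a relation of nblocks pages, i.e.
 * Max(nblocks / 1000000, 1).  Relations under 2 million pages keep
 * the one-page-at-a-time behaviour.
 */
static uint32_t
sketch_batch_pages(uint32_t nblocks)
{
    uint32_t batch = nblocks / SKETCH_TARGET_BATCHES;

    return batch > 1 ? batch : 1;
}
```

With this rule a 25,000,000-page relation would get 25 pages per
batch, while anything below 2,000,000 pages stays at 1.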

I wonder how much the atomic ops approach alone would help; it doesn't
have the issue that the work might be unevenly distributed between
pages.

> Anyway. I don't want to hi-jack this thread with discussions on this.
> I just wanted to mark that I plan to work on this in order to avoid
> any repeat developments or analysis. I'll probably start a new thread
> for this sometime nearer PG11's dev cycle.

Cool.  I think it might make sense to post about this soon, just to give
it some more visibility and reduce the potential for duplication.

- andres
