On 2017-05-05 15:29:40 -0400, Robert Haas wrote:
> On Thu, May 4, 2017 at 9:37 PM, Andres Freund <and...@anarazel.de> wrote:
> It's pretty easy (but IMHO not very interesting) to measure internal
> contention in the Parallel Seq Scan node.  As David points out
> downthread, that problem isn't trivial to fix, but it's not that hard,
> either.

Well, I think it's important that we do some basic optimization before
we start building a more elaborate costing model, because we'll
otherwise codify assumptions that won't be true for long.  Changing an
established costing model once it's out there is really hard, because
you always will hurt someone.  I think some of the contention is easy
enough to remove, and some of the IO concurrency issues look tractable
as well.

> I do believe that there is a problem with too much concurrent
> I/O on things like:
> Gather
> -> Parallel Seq Scan on lineitem
> -> Hash Join
>    -> Seq Scan on lineitem
> If that goes to multiple batches, you're probably wrenching the disk
> head all over the place - multiple processes are now reading and
> writing batch files at exactly the same time.

At least on rotating media.  On decent SSDs you can usually have so many
concurrent IOs in flight that it's not actually easy to overwhelm the
disk that way.  While we obviously still need to pay some attention to
spinning disks, I also think that being good on decent (not great) SSDs
is more important.  We shouldn't optimize for things to run fastest on
hydra with its slow storage system ;)

We probably should take something like effective_io_concurrency into
account, but that'd require a smarter default than we currently have.
It's not that hard to estimate an upper bound of parallelism with a
meaningful effective_io_concurrency - we'd probably have to move the
effective_io_concurrency handling in bitmapscans to be a plan parameter
though, so it's divided by the number of workers.

> I also strongly suspect
> that contention on individual buffers can turn into a problem on
> queries like this:
> Gather (Merge)
> -> Merge Join
>   -> Parallel Index Scan
>   -> Index Scan
> The parallel index scan surely has some upper limit on concurrency,
> but if that is exceeded, what will tend to happen is that processes
> will sleep.  On the inner side, though, every process is doing a full
> index scan and chances are good that they are doing it more or less in
> lock step, hitting the same buffers at the same time.

Hm. Not sure how big that problem is in practice.  But I wonder if it'd
be worthwhile to sleep more when you hit contention, or something like
that...

> I think there are two separate questions here:
> 1. How do we reduce or eliminate contention during parallel query execution?
> 2. How do we make the query planner smarter about picking the optimal
> number of workers?
> I think the second problem is both more difficult and more
> interesting.  I think that no matter how much work we do on #1, there
> are always going to be cases where the amount of effective parallelism
> has some finite limit - and I think the limit will vary substantially
> from query to query.  So, without #2, we'll either leave a lot of
> money on the table for queries that can benefit from using a large
> number of workers, or we'll throw extra workers uselessly at queries
> where they don't help (or even make things worse).

Yea, I agree that 2) is more important in the long run, but I do think
that my point that we shouldn't put too much effort into modelling
concurrency before doing basic optimization still stands.


Andres Freund

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)