On Thu, Mar 9, 2017 at 6:28 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Mar 9, 2017 at 7:42 AM, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
>>> +        if (rel->partial_pathlist != NIL &&
>>> +            (Path *) linitial(rel->partial_pathlist) == subpath)
>>> +            partial_subplans_set = bms_add_member(partial_subplans_set, i);
>>> This seems like a scary way to figure this out.  What if we wanted to
>>> build a parallel append subpath with some path other than the
>>> cheapest, for some reason?  I think you ought to record the decision
>>> that set_append_rel_pathlist makes about whether to use a partial path
>>> or a parallel-safe path, and then just copy it over here.
>> I agree that assuming that a subpath is non-partial path if it's not
>> cheapest of the partial paths is risky. In fact, we can not assume
>> that even when it's not one of the partial_paths since it could have
>> been kicked out or was never added to the partial path list like
>> reparameterized path. But if we have to save the information about
>> which of the subpaths are partial paths and which are not in
>> AppendPath, it would take some memory, noticeable for thousands of
>> partitions, which will leak if the path doesn't make into the
>> rel->pathlist.
> True, but that's no different from the situation for any other Path
> node that has substructure.  For example, an IndexPath has no fewer
> than 5 list pointers in it.  Generally we assume that the number of
> paths won't be large enough for the memory used to really matter, and
> I think that will also be true here.  And an AppendPath has a list of
> subpaths, and if I'm not mistaken, those list nodes consume more
> memory than the tracking information we're thinking about here will.

What I have observed is that we try to keep the memory usage to a
minimum, trying to avoid memory consumption as much as possible. Most
of that substructure gets absorbed by the planner or is shared across
paths. Append path lists are an exception to that, but we need
something to hold all subpaths together and list is PostgreSQL's way
of doing it. So, that's kind of unavoidable. And may be we will find
some reason for almost every substructure in paths.

> I think you're thinking about this issue because you've been working
> on partitionwise join where memory consumption is a big issue, but
> there are a lot of cases where that isn't really a big deal.


>> The purpose of that information is to make sure that we
>> allocate only one worker to that plan. I suggested that we use
>> path->parallel_workers for the same, but it seems that's not
>> guaranteed to be reliable. The reasons were discussed upthread. Is
>> there any way to infer whether we can allocate more than one workers
>> to a plan by looking at the corresponding path?
> I think it would be smarter to track it some other way.  Either keep
> two lists of paths, one of which is the partial paths and the other of
> which is the parallel-safe paths, or keep a bitmapset indicating which
> paths fall into which category.

I like two lists: it consumes almost no memory (two list headers
instead of one) compared to non-parallel-append when there are
non-partial paths and what more, it consumes no extra memory when all
paths are partial.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to