Re: [HACKERS] FDW and parallel execution

Kyotaro HORIGUCHI Thu, 13 Apr 2017 00:51:19 -0700

Sorry for the too-brief reply.

At Tue, 11 Apr 2017 20:08:46 +0300, Konstantin Knizhnik 
<[email protected]> wrote in 
<[email protected]>
> 
> On 04.04.2017 13:29, Kyotaro HORIGUCHI wrote:
> > Hi,
> >
> > At Sun, 02 Apr 2017 16:30:24 +0300, Konstantin Knizhnik
> > <[email protected]> wrote in <[email protected]>
> >> My FDW provides implementation for IsForeignScanParallelSafe which
> >> returns true.
> >> I wonder what can prevent optimizer from using parallel plan in this
> >> case?
> > Parallel execution requires partial paths. It's the work for
> > GetForeignPaths of your FDW.
> 
> Thank you very much for explanation.
> But unfortunately I still do not completely understand what kind of
> queries allow parallel execution with FDW.


At Tue, 11 Apr 2017 19:20:04 +0200, PostgreSQL - Hans-Jürgen Schönig 
<[email protected]> wrote in 
<[email protected]>
> did you check out antonin houska's patches?
> we basically got code, which can do that.

Parallel aggregation is already available. Antonin's patch is
partition-wise aggregation, which boosts the case where the partition
key is the aggregation key, I suppose. Parallel aggregation seems to
be considered whenever an appropriate partial path is available. (I
haven't tried anything, though.)

set_plain_rel_pathlist() does the work for plain relations, so
what we should do in GetForeignPaths would be as follows.

- Check rel->consider_parallel (this may not be required, since the
  FDW knows that) and rel->lateral_relids.

- If parallel is OK, create a path with create_foreignscan_path
  in the ordinary way, then change some parallel-related members as
  necessary.

- Like create_plain_partial_paths(), check certain conditions and
  finally add_partial_path() the created partial foreign scan path.

I haven't really done this, so I might be wrong.
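
The three steps above can be sketched as a toy model. This is only an
illustration of the decision flow in plain C: ToyRel, ToyPath and the
toy_* functions are made-up stand-ins, not PostgreSQL's RelOptInfo,
Path, create_foreignscan_path() or add_partial_path(), and the
worker-count rule (one worker per 250000 rows) is an arbitrary
assumption. A real FDW would do the equivalent inside its
GetForeignPaths callback against the planner's own structures.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PATHS 4

/* Simplified stand-in for a planner path; NOT PostgreSQL's Path struct. */
typedef struct ToyPath {
    bool   parallel_aware;    /* scan coordinates with other workers */
    bool   parallel_safe;     /* scan may run inside a worker */
    int    parallel_workers;  /* planned number of workers, 0 = none */
    double rows;              /* estimated rows returned by this path */
} ToyPath;

/* Simplified stand-in for a base relation; NOT PostgreSQL's RelOptInfo. */
typedef struct ToyRel {
    bool    consider_parallel;  /* models rel->consider_parallel */
    bool    has_lateral_refs;   /* models rel->lateral_relids != NULL */
    double  rows;
    ToyPath pathlist[TOY_MAX_PATHS];
    int     npaths;
    ToyPath partial_pathlist[TOY_MAX_PATHS];
    int     npartial;
} ToyRel;

/* Models building an ordinary (non-partial) foreign scan path. */
static ToyPath toy_create_foreignscan_path(const ToyRel *rel)
{
    ToyPath p = { false, rel->consider_parallel, 0, rel->rows };
    return p;
}

/* Hypothetical policy: one worker per 250000 rows, capped at max_workers. */
static int toy_choose_workers(const ToyRel *rel, int max_workers)
{
    int workers = (int) (rel->rows / 250000.0);
    return workers > max_workers ? max_workers : workers;
}

/* Models the flow suggested for GetForeignPaths. */
void toy_get_foreign_paths(ToyRel *rel, int max_workers)
{
    ToyPath path;
    int     workers;

    /* The ordinary path is always added. */
    assert(rel->npaths < TOY_MAX_PATHS);
    rel->pathlist[rel->npaths++] = toy_create_foreignscan_path(rel);

    /* Step 1: check consider_parallel and the lateral references. */
    if (!rel->consider_parallel || rel->has_lateral_refs)
        return;

    /* Step 3 precondition: skip the partial path if no workers are wanted. */
    workers = toy_choose_workers(rel, max_workers);
    if (workers <= 0)
        return;

    /* Step 2: build the path as usual, then adjust the parallel members. */
    path = toy_create_foreignscan_path(rel);
    path.parallel_aware = true;
    path.parallel_safe = true;
    path.parallel_workers = workers;
    path.rows = rel->rows / workers;  /* rough per-worker estimate (assumption) */

    /* Step 3: register it as a partial path. */
    assert(rel->npartial < TOY_MAX_PATHS);
    rel->partial_pathlist[rel->npartial++] = path;
}
```

With consider_parallel set and rows = 1000000, calling
toy_get_foreign_paths(&rel, 4) leaves one entry in pathlist and one in
partial_pathlist with parallel_workers == 4. With lateral references,
or with a relation too small to earn a worker, only the ordinary path
is added.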

> Section "FDW Routines for Parallel Execution" of FDW specification
> says:
> > A ForeignScan node can, optionally, support parallel execution. A
> > parallel ForeignScan will be executed in multiple processes and should
> > return each row only once across all cooperating processes. To do
> > this, processes can coordinate through fixed size chunks of dynamic
> > shared memory. This shared memory is not guaranteed to be mapped at
> > the same address in every process, so pointers may not be used. The
> > following callbacks are all optional in general, but required if
> > parallel execution is to be supported.
> 
> I provided IsForeignScanParallelSafe, EstimateDSMForeignScan,
> InitializeDSMForeignScan and InitializeWorkerForeignScan in my FDW.
> IsForeignScanParallelSafe returns true.
> Also in GetForeignPaths function I created path with
> baserel->consider_parallel == true.
> Is it enough or I should do something else?

You also need to create partial paths, I think. create_grouping_paths()
requires a non-empty partial_pathlist in input_rel.

The section explains the FDW routines provided specifically for
parallel execution. But it doesn't seem to mention "how to run a
parallel execution" as a whole.

> But unfortunately I failed to find any query: sequential scan, grand
> aggregation, aggregation with group by, joins... when parallel
> execution plan is used for this FDW.
> Also there are no examples of using this functions in Postgres
> distributive and I failed to find any such examples in Internet.

Maybe you're the pioneer in this area.

> Can somebody please clarify my situation with parallel execution and
> FDW and may be point at some examples?
> Thank in advance.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



