Re: add_partial_path() may remove dominated path but still in use

2019-02-09 Thread Tom Lane
Amit Kapila  writes:
> It seems you would also like to see this back-patched.  I am not sure
> if that is a good idea as there is some risk of breaking existing
> usage.  Tom, do you have any opinion on this patch?  It seems to me
> you were thinking to have a separate hook for partial paths, but the
> patch has solved the problem by moving the hook location.

I was expecting Haas to take point on this, but since he doesn't seem
to be doing so, I'll push it.  I don't think there's any material
risk of breaking things --- the only functionality lost is the ability to
remove or modify baserel Gather paths, which I doubt anybody is interested
in doing.  Certainly that's way less useful than the ability to add
partial paths and have them be included in Gather-building.

In a green field I'd rather have had a separate hook for adding partial
paths, but it's not clear that that really buys much of anything except
logical cleanliness ... against which it adds cost since the using
extension(s) have to figure out what's going on twice.

Also this way does have the advantage that it retroactively fixes things
for extensions that may be trying to make partial paths today.

regards, tom lane



Re: add_partial_path() may remove dominated path but still in use

2019-02-09 Thread Amit Kapila
On Wed, Feb 6, 2019 at 10:35 AM Kohei KaiGai  wrote:
>
> Hello,
> Let me remind the thread again.
> I'm waiting for the fix getting committed for a month...
>

It seems you would also like to see this back-patched.  I am not sure
if that is a good idea as there is some risk of breaking existing
usage.  Tom, do you have any opinion on this patch?  It seems to me
you were thinking to have a separate hook for partial paths, but the
patch has solved the problem by moving the hook location.  I think
whatever is the case we should try to reach some consensus and move
forward with this patch as KaiGai-San is waiting from quite some time.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: add_partial_path() may remove dominated path but still in use

2019-02-05 Thread Kohei KaiGai
Hello,

Let me remind the thread again.
I'm waiting for the fix getting committed for a month...

2019年1月22日(火) 20:50 Kohei KaiGai :
>
> Let me remind the thread.
> If no more comments, objections, or better ideas, please commit this fix.
>
> Thanks,
>
> 2019年1月17日(木) 18:29 Kyotaro HORIGUCHI :
> >
> > Hello, sorry for the absence.
> >
> > At Fri, 11 Jan 2019 11:36:43 -0500, Robert Haas  
> > wrote in 
> > 
> > > On Thu, Jan 10, 2019 at 9:10 PM Kohei KaiGai  wrote:
> > > > 2019年1月11日(金) 5:52 Robert Haas :
> > > > > On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  
> > > > > wrote:
> > > > > > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > > > > > front of the generate_gather_paths?
> > > > > > If we have no use case for the second hook, here is little necessity
> > > > > > to have the post_rel_pathlist_hook() here.
> > > > > > (At least, PG-Strom will use the first hook only.)
> > > > >
> > > > > +1.  That seems like the best way to be consistent with the principle
> > > > > that we need to have all the partial paths before generating any
> > > > > Gather paths.
> > > > >
> > > > Patch was updated, just for relocation of the set_rel_pathlist_hook.
> > > > Please check it.
> > >
> > > Seems reasonable to me.
> >
> > Also seems reasonable to me.  The extension can call
> > generate_gather_paths redundantly as is but it almost doesn't
> > harm, so it is acceptable even in a minor release.
> >
> > regards.
> >
> > --
> > Kyotaro Horiguchi
> > NTT Open Source Software Center
> >
>
>
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei 

-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 


pgsql-bugfix-relocation-path-set_rel_pathlist_hook.v11.patch
Description: Binary data


pgsql-bugfix-relocation-path-set_rel_pathlist_hook.v10.patch
Description: Binary data


Re: add_partial_path() may remove dominated path but still in use

2019-01-22 Thread Kohei KaiGai
Let me remind the thread.
If no more comments, objections, or better ideas, please commit this fix.

Thanks,

2019年1月17日(木) 18:29 Kyotaro HORIGUCHI :
>
> Hello, sorry for the absence.
>
> At Fri, 11 Jan 2019 11:36:43 -0500, Robert Haas  wrote 
> in 
> > On Thu, Jan 10, 2019 at 9:10 PM Kohei KaiGai  wrote:
> > > 2019年1月11日(金) 5:52 Robert Haas :
> > > > On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  
> > > > wrote:
> > > > > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > > > > front of the generate_gather_paths?
> > > > > If we have no use case for the second hook, here is little necessity
> > > > > to have the post_rel_pathlist_hook() here.
> > > > > (At least, PG-Strom will use the first hook only.)
> > > >
> > > > +1.  That seems like the best way to be consistent with the principle
> > > > that we need to have all the partial paths before generating any
> > > > Gather paths.
> > > >
> > > Patch was updated, just for relocation of the set_rel_pathlist_hook.
> > > Please check it.
> >
> > Seems reasonable to me.
>
> Also seems reasonable to me.  The extension can call
> generate_gather_paths redundantly as is but it almost doesn't
> harm, so it is acceptable even in a minor release.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: add_partial_path() may remove dominated path but still in use

2019-01-17 Thread Kyotaro HORIGUCHI
Hello, sorry for the absence.

At Fri, 11 Jan 2019 11:36:43 -0500, Robert Haas  wrote 
in 
> On Thu, Jan 10, 2019 at 9:10 PM Kohei KaiGai  wrote:
> > 2019年1月11日(金) 5:52 Robert Haas :
> > > On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  wrote:
> > > > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > > > front of the generate_gather_paths?
> > > > If we have no use case for the second hook, here is little necessity
> > > > to have the post_rel_pathlist_hook() here.
> > > > (At least, PG-Strom will use the first hook only.)
> > >
> > > +1.  That seems like the best way to be consistent with the principle
> > > that we need to have all the partial paths before generating any
> > > Gather paths.
> > >
> > Patch was updated, just for relocation of the set_rel_pathlist_hook.
> > Please check it.
> 
> Seems reasonable to me.

Also seems reasonable to me.  The extension can call
generate_gather_paths redundantly as is but it almost doesn't
harm, so it is acceptable even in a minor release.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: add_partial_path() may remove dominated path but still in use

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 9:10 PM Kohei KaiGai  wrote:
> 2019年1月11日(金) 5:52 Robert Haas :
> > On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  wrote:
> > > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > > front of the generate_gather_paths?
> > > If we have no use case for the second hook, here is little necessity
> > > to have the post_rel_pathlist_hook() here.
> > > (At least, PG-Strom will use the first hook only.)
> >
> > +1.  That seems like the best way to be consistent with the principle
> > that we need to have all the partial paths before generating any
> > Gather paths.
> >
> Patch was updated, just for relocation of the set_rel_pathlist_hook.
> Please check it.

Seems reasonable to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add_partial_path() may remove dominated path but still in use

2019-01-10 Thread Kohei KaiGai
2019年1月11日(金) 5:52 Robert Haas :
>
> On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  wrote:
> > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > front of the generate_gather_paths?
> > If we have no use case for the second hook, here is little necessity
> > to have the post_rel_pathlist_hook() here.
> > (At least, PG-Strom will use the first hook only.)
>
> +1.  That seems like the best way to be consistent with the principle
> that we need to have all the partial paths before generating any
> Gather paths.
>
Patch was updated, just for relocation of the set_rel_pathlist_hook.
Please check it.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 


pgsql-bugfix-relocation-path-set_rel_pathlist_hook.v10.patch
Description: Binary data


pgsql-bugfix-relocation-path-set_rel_pathlist_hook.v11.patch
Description: Binary data


Re: add_partial_path() may remove dominated path but still in use

2019-01-10 Thread Robert Haas
On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  wrote:
> So, is it sufficient if set_rel_pathlist_hook is just relocated in
> front of the generate_gather_paths?
> If we have no use case for the second hook, here is little necessity
> to have the post_rel_pathlist_hook() here.
> (At least, PG-Strom will use the first hook only.)

+1.  That seems like the best way to be consistent with the principle
that we need to have all the partial paths before generating any
Gather paths.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add_partial_path() may remove dominated path but still in use

2019-01-08 Thread Kohei KaiGai
2019年1月9日(水) 13:18 Kyotaro HORIGUCHI :
>
> At Sun, 30 Dec 2018 12:31:22 +0900, Kohei KaiGai  wrote 
> in 
> > 2018年12月30日(日) 4:12 Tom Lane :
> > On the other hands, the later hook must be dedicated to add regular paths,
> > and also provides a chance for extensions to manipulate pre-built path-list
> > including Gather-path.
> > As long as I know, pg_hint_plan uses the set_rel_pathlist_hook to enforce
> > a particular path-node, including Gather-node, by manipulation of the cost
> > value. Horiguchi-san, is it right?
>
> Mmm. I haven't expected that it is mentioned here.
>
> Actually in the hook, it changes enable_* planner variables, or
> directory manipuraltes path costs or even can clear and
> regenerate the path list and gather paths for the parallel
> case. It will be happy if we had a chance to manpurate partial
> paths before genrating gahter paths.
>
So, is it sufficient if set_rel_pathlist_hook is just relocated in
front of the generate_gather_paths?
If we have no use case for the second hook, here is little necessity
to have the post_rel_pathlist_hook() here.
(At least, PG-Strom will use the first hook only.)

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: add_partial_path() may remove dominated path but still in use

2019-01-08 Thread Kyotaro HORIGUCHI
At Sun, 30 Dec 2018 12:31:22 +0900, Kohei KaiGai  wrote in 

> 2018年12月30日(日) 4:12 Tom Lane :
> On the other hands, the later hook must be dedicated to add regular paths,
> and also provides a chance for extensions to manipulate pre-built path-list
> including Gather-path.
> As long as I know, pg_hint_plan uses the set_rel_pathlist_hook to enforce
> a particular path-node, including Gather-node, by manipulation of the cost
> value. Horiguchi-san, is it right?

Mmm. I haven't expected that it is mentioned here.

Actually in the hook, it changes enable_* planner variables, or
directory manipuraltes path costs or even can clear and
regenerate the path list and gather paths for the parallel
case. It will be happy if we had a chance to manpurate partial
paths before genrating gahter paths.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: add_partial_path() may remove dominated path but still in use

2019-01-03 Thread Kohei KaiGai
I tried to make a patch to have dual hooks at set_rel_pathlist(), and
adjusted PG-Strom for the new design. It stopped to create GatherPath
by itself, just added a partial path for the base relation.
It successfully made a plan using parallel custom-scan node, without
system crash.

As I mentioned above, it does not use the new "post_rel_pathlist_hook"
because we can add both of partial/regular path-node at the first hook
with no particular problems.

Thanks,

dbt3=# explain select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date '1994-01-01'
and l_shipdate < cast(date '1994-01-01' + interval '1 year' as date)
and l_discount between 0.08 - 0.01 and 0.08 + 0.01
and l_quantity < 24 limit 1;
  QUERY PLAN

 Limit  (cost=144332.62..144332.63 rows=1 width=4)
   ->  Aggregate  (cost=144332.62..144332.63 rows=1 width=4)
 ->  Gather  (cost=144285.70..144329.56 rows=408 width=4)
   Workers Planned: 2
   ->  Parallel Custom Scan (GpuPreAgg) on lineitem
(cost=143285.70..143288.76 rows=204 width=4)
 Reduction: NoGroup
 Outer Scan: lineitem  (cost=1666.67..143246.16
rows=63254 width=8)
 Outer Scan Filter: ((l_discount >= '0.07'::double
precision) AND
   (l_discount <=
'0.09'::double precision) AND
   (l_quantity <
'24'::double precision) AND
   (l_shipdate <
'1995-01-01'::date) AND
   (l_shipdate >=
'1994-01-01'::date))
(8 rows)

Thanks,

2019年1月2日(水) 22:34 Kohei KaiGai :
>
> 2018年12月31日(月) 22:25 Amit Kapila :
> >
> > On Mon, Dec 31, 2018 at 5:48 PM Kohei KaiGai  wrote:
> > >
> > > 2018年12月31日(月) 13:10 Amit Kapila :
> > > >
> > > > On Sun, Dec 30, 2018 at 9:01 AM Kohei KaiGai  
> > > > wrote:
> > > > > 2018年12月30日(日) 4:12 Tom Lane :
> > > > > >
> > > > > > Kohei KaiGai  writes:
> > > > > > > 2018年12月29日(土) 1:44 Tom Lane :
> > > > > > >> However, first I'd like to know why this situation is arising in 
> > > > > > >> the first
> > > > > > >> place.  To have the situation you're describing, we'd have to 
> > > > > > >> have
> > > > > > >> attempted to make some Gather paths before we have all the 
> > > > > > >> partial paths
> > > > > > >> for the relation they're for.  Why is that a good thing to do?  
> > > > > > >> It seems
> > > > > > >> like such Gathers are necessarily being made with incomplete 
> > > > > > >> information,
> > > > > > >> and we'd be better off to fix things so that none are made till 
> > > > > > >> later.
> > > > > >
> > > > > > > Because of the hook location, Gather-node shall be constructed 
> > > > > > > with built-in
> > > > > > > and foreign partial scan node first, then extension gets a chance 
> > > > > > > to add its
> > > > > > > custom paths (partial and full).
> > > > > > > At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked 
> > > > > > > next to the
> > > > > > > generate_gather_paths().
> > > > > >
> > > > > > Hmm.  I'm inclined to think that we should have a separate hook
> > > > > > in which extensions are allowed to add partial paths, and that
> > > > > > set_rel_pathlist_hook should only be allowed to add regular paths.
> > > >
> > > > +1.  This idea sounds sensible to me.
> > > >
> > > > > >
> > > > > I have almost same opinion, but the first hook does not need to be
> > > > > dedicated for partial paths. As like set_foreign_pathlist() doing, we 
> > > > > can
> > > > > add both of partial and regular paths here, then 
> > > > > generate_gather_paths()
> > > > > may generate a Gather-path on top of the best partial-path.
> > > > >
> > > >
> > > > Won't it be confusing for users if we allow both partial and full
> > > > paths in first hook and only full paths in the second hook?
> > > > Basically, in many cases, the second hook won't be of much use.  What
> > > > advantage you are seeing in allowing both partial and full paths in
> > > > the first hook?
> > > >
> > > Two advantages. The first one is, it follows same manner of
> > > set_foreign_pathlist()
> > > which allows to add both of full and partial path if FDW supports 
> > > parallel-scan.
> > > The second one is practical. During the path construction, extension 
> > > needs to
> > > check availability to run (e.g, whether operators in WHERE-clause is 
> > > supported
> > > on GPU device...), calculate its estimated cost and so on. Not a small
> > > portion of
> > > them are common for both of full and partial path. So, if the first
> > > hook accepts to
> > > add both kind of paths at once, extension can share the common properties.
> > >
> >
> > You have a point, though I am not sure how much difference it can

Re: add_partial_path() may remove dominated path but still in use

2019-01-02 Thread Kohei KaiGai
2018年12月31日(月) 22:25 Amit Kapila :
>
> On Mon, Dec 31, 2018 at 5:48 PM Kohei KaiGai  wrote:
> >
> > 2018年12月31日(月) 13:10 Amit Kapila :
> > >
> > > On Sun, Dec 30, 2018 at 9:01 AM Kohei KaiGai  wrote:
> > > > 2018年12月30日(日) 4:12 Tom Lane :
> > > > >
> > > > > Kohei KaiGai  writes:
> > > > > > 2018年12月29日(土) 1:44 Tom Lane :
> > > > > >> However, first I'd like to know why this situation is arising in 
> > > > > >> the first
> > > > > >> place.  To have the situation you're describing, we'd have to have
> > > > > >> attempted to make some Gather paths before we have all the partial 
> > > > > >> paths
> > > > > >> for the relation they're for.  Why is that a good thing to do?  It 
> > > > > >> seems
> > > > > >> like such Gathers are necessarily being made with incomplete 
> > > > > >> information,
> > > > > >> and we'd be better off to fix things so that none are made till 
> > > > > >> later.
> > > > >
> > > > > > Because of the hook location, Gather-node shall be constructed with 
> > > > > > built-in
> > > > > > and foreign partial scan node first, then extension gets a chance 
> > > > > > to add its
> > > > > > custom paths (partial and full).
> > > > > > At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked next 
> > > > > > to the
> > > > > > generate_gather_paths().
> > > > >
> > > > > Hmm.  I'm inclined to think that we should have a separate hook
> > > > > in which extensions are allowed to add partial paths, and that
> > > > > set_rel_pathlist_hook should only be allowed to add regular paths.
> > >
> > > +1.  This idea sounds sensible to me.
> > >
> > > > >
> > > > I have almost same opinion, but the first hook does not need to be
> > > > dedicated for partial paths. As like set_foreign_pathlist() doing, we 
> > > > can
> > > > add both of partial and regular paths here, then generate_gather_paths()
> > > > may generate a Gather-path on top of the best partial-path.
> > > >
> > >
> > > Won't it be confusing for users if we allow both partial and full
> > > paths in first hook and only full paths in the second hook?
> > > Basically, in many cases, the second hook won't be of much use.  What
> > > advantage you are seeing in allowing both partial and full paths in
> > > the first hook?
> > >
> > Two advantages. The first one is, it follows same manner of
> > set_foreign_pathlist()
> > which allows to add both of full and partial path if FDW supports 
> > parallel-scan.
> > The second one is practical. During the path construction, extension needs 
> > to
> > check availability to run (e.g, whether operators in WHERE-clause is 
> > supported
> > on GPU device...), calculate its estimated cost and so on. Not a small
> > portion of
> > them are common for both of full and partial path. So, if the first
> > hook accepts to
> > add both kind of paths at once, extension can share the common properties.
> >
>
> You have a point, though I am not sure how much difference it can
> create for cost computation as ideally, both will have different
> costing model.  I understand there are some savings by avoiding some
> common work, is there any way to cache the required information?
>
I have no idea for the clean way.
We may be able to have an opaque pointer for extension usage, however,
it may be problematic if multiple extension uses the hook.

> > Probably, the second hook is only used for a few corner case where an 
> > extension
> > wants to manipulate path-list already built, like pg_hint_plan.
> >
>
> Okay, but it could be some work for extension authors who are using
> the current hook, not sure they would like to divide the work between
> first and second hook.
>
I guess they don't divide their code, but choose either of them.
In case of PG-Strom, even if there are two hooks around the point, it will use
the first hook only, unless it does not prohibit to call add_path() here.
However, some adjustments are required. Its current implementation makes
GatherPath node with partial CustomScanPath because set_rel_pathlist_hook()
is called after the generate_gather_paths().
Once we could choose the first hook, no need to make a GatherPath by itself,
because PostgreSQL-core will make the path if partial custom-path is enough
reasonable cost. Likely, this adjustment is more preferable one.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: add_partial_path() may remove dominated path but still in use

2018-12-31 Thread Amit Kapila
On Mon, Dec 31, 2018 at 5:48 PM Kohei KaiGai  wrote:
>
> 2018年12月31日(月) 13:10 Amit Kapila :
> >
> > On Sun, Dec 30, 2018 at 9:01 AM Kohei KaiGai  wrote:
> > > 2018年12月30日(日) 4:12 Tom Lane :
> > > >
> > > > Kohei KaiGai  writes:
> > > > > 2018年12月29日(土) 1:44 Tom Lane :
> > > > >> However, first I'd like to know why this situation is arising in the 
> > > > >> first
> > > > >> place.  To have the situation you're describing, we'd have to have
> > > > >> attempted to make some Gather paths before we have all the partial 
> > > > >> paths
> > > > >> for the relation they're for.  Why is that a good thing to do?  It 
> > > > >> seems
> > > > >> like such Gathers are necessarily being made with incomplete 
> > > > >> information,
> > > > >> and we'd be better off to fix things so that none are made till 
> > > > >> later.
> > > >
> > > > > Because of the hook location, Gather-node shall be constructed with 
> > > > > built-in
> > > > > and foreign partial scan node first, then extension gets a chance to 
> > > > > add its
> > > > > custom paths (partial and full).
> > > > > At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked next to 
> > > > > the
> > > > > generate_gather_paths().
> > > >
> > > > Hmm.  I'm inclined to think that we should have a separate hook
> > > > in which extensions are allowed to add partial paths, and that
> > > > set_rel_pathlist_hook should only be allowed to add regular paths.
> >
> > +1.  This idea sounds sensible to me.
> >
> > > >
> > > I have almost same opinion, but the first hook does not need to be
> > > dedicated for partial paths. As like set_foreign_pathlist() doing, we can
> > > add both of partial and regular paths here, then generate_gather_paths()
> > > may generate a Gather-path on top of the best partial-path.
> > >
> >
> > Won't it be confusing for users if we allow both partial and full
> > paths in first hook and only full paths in the second hook?
> > Basically, in many cases, the second hook won't be of much use.  What
> > advantage you are seeing in allowing both partial and full paths in
> > the first hook?
> >
> Two advantages. The first one is, it follows same manner of
> set_foreign_pathlist()
> which allows to add both of full and partial path if FDW supports 
> parallel-scan.
> The second one is practical. During the path construction, extension needs to
> check availability to run (e.g, whether operators in WHERE-clause is supported
> on GPU device...), calculate its estimated cost and so on. Not a small
> portion of
> them are common for both of full and partial path. So, if the first
> hook accepts to
> add both kind of paths at once, extension can share the common properties.
>

You have a point, though I am not sure how much difference it can
create for cost computation as ideally, both will have different
costing model.  I understand there are some savings by avoiding some
common work, is there any way to cache the required information?

> Probably, the second hook is only used for a few corner case where an 
> extension
> wants to manipulate path-list already built, like pg_hint_plan.
>

Okay, but it could be some work for extension authors who are using
the current hook, not sure they would like to divide the work between
first and second hook.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: add_partial_path() may remove dominated path but still in use

2018-12-30 Thread Amit Kapila
On Sun, Dec 30, 2018 at 9:01 AM Kohei KaiGai  wrote:
> 2018年12月30日(日) 4:12 Tom Lane :
> >
> > Kohei KaiGai  writes:
> > > 2018年12月29日(土) 1:44 Tom Lane :
> > >> However, first I'd like to know why this situation is arising in the 
> > >> first
> > >> place.  To have the situation you're describing, we'd have to have
> > >> attempted to make some Gather paths before we have all the partial paths
> > >> for the relation they're for.  Why is that a good thing to do?  It seems
> > >> like such Gathers are necessarily being made with incomplete information,
> > >> and we'd be better off to fix things so that none are made till later.
> >
> > > Because of the hook location, Gather-node shall be constructed with 
> > > built-in
> > > and foreign partial scan node first, then extension gets a chance to add 
> > > its
> > > custom paths (partial and full).
> > > At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked next to the
> > > generate_gather_paths().
> >
> > Hmm.  I'm inclined to think that we should have a separate hook
> > in which extensions are allowed to add partial paths, and that
> > set_rel_pathlist_hook should only be allowed to add regular paths.

+1.  This idea sounds sensible to me.

> >
> I have almost same opinion, but the first hook does not need to be
> dedicated for partial paths. As like set_foreign_pathlist() doing, we can
> add both of partial and regular paths here, then generate_gather_paths()
> may generate a Gather-path on top of the best partial-path.
>

Won't it be confusing for users if we allow both partial and full
paths in first hook and only full paths in the second hook?
Basically, in many cases, the second hook won't be of much use.  What
advantage you are seeing in allowing both partial and full paths in
the first hook?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: add_partial_path() may remove dominated path but still in use

2018-12-29 Thread Kohei KaiGai
2018年12月30日(日) 4:12 Tom Lane :
>
> Kohei KaiGai  writes:
> > 2018年12月29日(土) 1:44 Tom Lane :
> >> However, first I'd like to know why this situation is arising in the first
> >> place.  To have the situation you're describing, we'd have to have
> >> attempted to make some Gather paths before we have all the partial paths
> >> for the relation they're for.  Why is that a good thing to do?  It seems
> >> like such Gathers are necessarily being made with incomplete information,
> >> and we'd be better off to fix things so that none are made till later.
>
> > Because of the hook location, Gather-node shall be constructed with built-in
> > and foreign partial scan node first, then extension gets a chance to add its
> > custom paths (partial and full).
> > At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked next to the
> > generate_gather_paths().
>
> Hmm.  I'm inclined to think that we should have a separate hook
> in which extensions are allowed to add partial paths, and that
> set_rel_pathlist_hook should only be allowed to add regular paths.
>
I have almost same opinion, but the first hook does not need to be
dedicated for partial paths. As like set_foreign_pathlist() doing, we can
add both of partial and regular paths here, then generate_gather_paths()
may generate a Gather-path on top of the best partial-path.

On the other hands, the later hook must be dedicated to add regular paths,
and also provides a chance for extensions to manipulate pre-built path-list
including Gather-path.
As long as I know, pg_hint_plan uses the set_rel_pathlist_hook to enforce
a particular path-node, including Gather-node, by manipulation of the cost
value. Horiguchi-san, is it right?
Likely, this kind of extension needs to use the later hook.

I expect these hooks are located as follows:

set_rel_pathlist(...)
{
:

:
/* for partial / regular paths */
if (set_rel_pathlist_hook)
  (*set_rel_pathlist_hook) (root, rel, rti, rte);
/* generate Gather-node */
if (rel->reloptkind == RELOPT_BASEREL)
generate_gather_paths(root, rel);
/* for regular paths and manipulation */
if (post_rel_pathlist_hook)
  (*post_rel_pathlist_hook) (root, rel, rti, rte);

set_cheapest();
}

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: add_partial_path() may remove dominated path but still in use

2018-12-28 Thread Kohei KaiGai
2018年12月29日(土) 1:44 Tom Lane :
>
> Kohei KaiGai  writes:
> > I've investigated a crash report of PG-Strom for a few days, then I doubt
> > add_partial_path() can unexpectedly release dominated old partial path
> > but still referenced by other Gather node, and it leads unexpected system
> > crash.
>
> Hm.  This seems comparable to the special case in plain add_path, where it
> doesn't attempt to free IndexPaths because of the risk that they're still
> referenced.  So maybe we should just drop the pfree here.
>
> However, first I'd like to know why this situation is arising in the first
> place.  To have the situation you're describing, we'd have to have
> attempted to make some Gather paths before we have all the partial paths
> for the relation they're for.  Why is that a good thing to do?  It seems
> like such Gathers are necessarily being made with incomplete information,
> and we'd be better off to fix things so that none are made till later.
>
Because of the hook location, Gather-node shall be constructed with built-in
and foreign partial scan node first, then extension gets a chance to add its
custom paths (partial and full).
At the set_rel_pathlist(), set_rel_pathlist_hook() is invoked next to the
generate_gather_paths(). Even if extension adds some partial paths later,
the first generate_gather_paths() has to generate Gather node based on
incomplete information.
If we could ensure Gather node shall be made after all the partial nodes
are added, it may be a solution for the problem.

Of course, relocation of the hook may have a side-effect. Anyone may
expect the pathlist contains all the path-node including Gather node, for
editorialization of the pathlist.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 



Re: add_partial_path() may remove dominated path but still in use

2018-12-28 Thread Tom Lane
Kohei KaiGai  writes:
> I've investigated a crash report of PG-Strom for a few days, then I doubt
> add_partial_path() can unexpectedly release dominated old partial path
> but still referenced by other Gather node, and it leads unexpected system
> crash.

Hm.  This seems comparable to the special case in plain add_path, where it
doesn't attempt to free IndexPaths because of the risk that they're still
referenced.  So maybe we should just drop the pfree here.

However, first I'd like to know why this situation is arising in the first
place.  To have the situation you're describing, we'd have to have
attempted to make some Gather paths before we have all the partial paths
for the relation they're for.  Why is that a good thing to do?  It seems
like such Gathers are necessarily being made with incomplete information,
and we'd be better off to fix things so that none are made till later.

regards, tom lane