Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-28 Thread Kohei KaiGai
 Yeah.  I'm still not exactly convinced that custom-scan will ever allow
 independent development of new plan types (which, with all due respect to
 Robert, is what it was being sold as last year in Ottawa).  But I'm not
 opposed in principle to committing it, if we can find a way to have a cleaner
 API for things like setrefs.c.  It seems like late-stage planner processing
 in general is an issue for this patch (createplan.c and subselect.c are
 also looking messy).  EXPLAIN isn't too great either.

 I'm not sure exactly what to do about those cases, but I wonder whether
 things would get better if we had the equivalent of
 expression_tree_walker/mutator capability for plan nodes.  The state of
 affairs in setrefs and subselect, at least, is a bit reminiscent of the
 bad old days when we had lots of different bespoke code for traversing
 expression trees.

 Hmm. If we had something like expression_tree_walker/mutator for plan nodes,
 we could pass a walker/mutator function pointer instead of exposing the
 static functions that do the recursive work.
 If a custom-plan provider (one that has sub-plans) received a callback with
 a walker/mutator pointer, all it would have to do for its sub-plans is call
 this new plan-tree walking support routine with the supplied walker/mutator.
 That seems to me a simpler design than what I did.

I tried to write similar walker/mutator functions over the plan-node tree;
however, it was not possible to implement these routines simply, because
the jobs of the walker/mutator functions are not uniform, so the caller
side must also carry large switch-case branches.

I picked setrefs.c for my investigation.
set_plan_refs() applies fix_scan_list() to the expression trees appearing
in a plan node if it is derived from Scan; however, it applies
set_join_references() to subclasses of Join, and
set_dummy_tlist_references() to some other plan nodes.
This implies that walker/mutator functions for Plan nodes have to apply
different operations according to the type of plan node, and I'm not
certain how many different forms are needed.
(In addition, set_plan_refs() usually behaves like a walker, but acts as
a mutator when it strips out a trivial subquery.)

I'm expecting a function like the one below. It calls the plan_walker
function for each plan node, and also calls the expr_walker function for
each expression node on the plan node.

bool
plan_tree_walker(Plan *plan,
 bool (*plan_walker) (),
 bool (*expr_walker) (),
 void *context)

I'd like to hear whether there is some other form in which to implement
this routine.
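
To flesh that out, here is a minimal sketch of how the body might look.
It is illustrative only: the coverage shown (targetlist, qual, left/right
subtrees) is just the uniform part, and the per-node-type cases are exactly
where the non-uniformity described above bites.

bool
plan_tree_walker(Plan *plan,
                 bool (*plan_walker) (Plan *plan, void *context),
                 bool (*expr_walker) (Node *node, void *context),
                 void *context)
{
    if (plan == NULL)
        return false;

    /* let the caller inspect this plan node first */
    if (plan_walker && plan_walker(plan, context))
        return true;

    /* walk the expression trees common to every plan node */
    if (expr_walker)
    {
        if (expression_tree_walker((Node *) plan->targetlist,
                                   expr_walker, context))
            return true;
        if (expression_tree_walker((Node *) plan->qual,
                                   expr_walker, context))
            return true;
    }

    /* recurse into the child plan nodes */
    if (plan_tree_walker(plan->lefttree, plan_walker, expr_walker, context))
        return true;
    if (plan_tree_walker(plan->righttree, plan_walker, expr_walker, context))
        return true;

    /*
     * Node types with extra substructure (Append's appendplans,
     * SubqueryScan's subplan, initPlans, ...) would each need their own
     * case here -- the large switch this message worries about.
     */
    return false;
}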


One alternative idea for giving a custom-plan provider a chance to
handle its subplans is to hand it function pointers (1) to handle
recursion over the plan tree and (2) to set up the backend's internal
state.
In the case of setrefs.c, set_plan_refs() and fix_expr_common() are the
minimum necessity for extensions. This also removes the need to export
static functions.
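
As a rough illustration of that shape (all the names below are hypothetical,
invented for this sketch; only set_plan_refs() and fix_expr_common() refer
to real setrefs.c functions):

/* recursion helper: would wrap set_plan_refs() */
typedef Plan *(*PlanRefsRecurse_function) (PlannerInfo *root,
                                           Plan *plan, int rtoffset);
/* state-setup helper: would wrap fix_expr_common() */
typedef void (*ExprFixup_function) (PlannerInfo *root, Node *node);

/*
 * Handed to the provider's set-plan-references callback, so that it can
 * process its sub-plans and expressions without setrefs.c exporting its
 * static functions as-is.
 */
typedef struct SetPlanRefsContext
{
    PlannerInfo             *root;
    int                      rtoffset;
    PlanRefsRecurse_function recurse;
    ExprFixup_function       fix_expr;
} SetPlanRefsContext;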

What are your thoughts?
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Robert Haas
On Mon, Apr 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 [ assorted comments about custom-scan patch, but particularly ]

 * The prune hook makes me feel very uneasy. It seems a weirdly specific
 implementation detail, made stranger by the otherwise complete lack of data
 maintenance API calls. Calling that for every dirty page sounds like
 an issue and my patch rejection indicator is flashing red around that.

 Yeah.  After a fast review of the custom-scan and cache-scan patches, it
 seems to me that my original fears are largely confirmed: the custom scan
 patch is not going to be sufficient to allow development of any truly new
 plan type.  Yeah, you can plug in some new execution node types, but
 actually doing anything interesting is going to require patching other
 parts of the system.  Are we going to say to all comers, "sure, we'll put
 a hook call anywhere you like, just ask"?  I can't see this as being the
 way to go.

Without prejudice to the rest of what you said, this argument doesn't
hold much water with me.  I mean, anything that our extensibility
mechanism doesn't support today will require new hooks, but does that
mean we're never going to add any more hooks?  I sure hope not.  When
hooks are proposed here, we evaluate them on their merits and
attempt to judge the likelihood that a hook in a particular place will
be useful, but generally we're not averse to adding them, and as long
as the paths aren't too performance-critical, I don't think we should
be averse to adding them.

We have a great system today for letting people add new data types and
things of that sort, but anything that penetrates more deeply into the
heart of the system pretty much can't be done; this is why various
companies, such as our respective employers, have developed and
maintained forks of the PostgreSQL code base instead of just hooking
in to the existing code.  We probably can't solve that problem
completely, but that doesn't mean we should throw in the towel.

And in particular, I think it's pretty normal that a new facility like
custom scans might create additional demand for new hooks.  If
something was completely impossible before, and the new facility makes
it almost-possible, then why shouldn't someone ask for a hook there?
A prune hook probably has no business in the custom scan patch proper,
but whether it's a good idea or a bad one should be decided on the
merits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Apr 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah.  After a fast review of the custom-scan and cache-scan patches, it
 seems to me that my original fears are largely confirmed: the custom scan
 patch is not going to be sufficient to allow development of any truly new
 plan type.  Yeah, you can plug in some new execution node types, but
 actually doing anything interesting is going to require patching other
 parts of the system.

 Without prejudice to the rest of what you said, this argument doesn't
 hold much water with me.  I mean, anything that our extensibility
 mechanism doesn't support today will require new hooks, but does that
 mean we're never going to add any more hooks?  I sure hope not.

No, that's not what I said.  ISTM that the argument for the custom-scan
API is that it allows interesting new things to be done *without further
modifying the core code*.  But the example application (cache_scan) fails
to demonstrate that, and indeed seems to be a counterexample.  Whether
we'd accept cache_scan on its own merits is a separate question.  The
problem for me is that custom-scan isn't showing that it can support what
was claimed without doing serious damage to modularity and maintainability
of the core code.

What this may mean is that we need more attention to refactoring of the
core code.  But just removing static from any function that looks like
it might be handy isn't my idea of well-considered refactoring.  More the
opposite in fact: if those things turn into APIs that we have to support,
it's going to kill any ability to do such refactoring.

A concrete example here is setrefs.c, whose responsibilities tend to
change from release to release.  I think if we committed custom-scan
as is, we'd have great difficulty changing setrefs.c's transformations
ever again, at least if we hoped to not break users of the custom-scan
API.  I'm not sure what the solution is --- but turning setrefs into
a white box instead of a black box isn't it.

regards, tom lane




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 A concrete example here is setrefs.c, whose responsibilities tend to
 change from release to release.  I think if we committed custom-scan
 as is, we'd have great difficulty changing setrefs.c's transformations
 ever again, at least if we hoped to not break users of the custom-scan
 API.  I'm not sure what the solution is --- but turning setrefs into
 a white box instead of a black box isn't it.

Yeah, this was my (general) complaint as well, and the answer that I kept
getting back is "well, it's ok, you can still break it between major
releases and the custom scan users will just have to deal with it."

I'm a bit on the fence about that, itself, but the other half of that
coin is that we could end up with parts of the *core* code that think
it's ok to go pulling in these functions, once they're exposed, and that
could end up making things quite ugly and difficult to maintain going
forward.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Robert Haas
On Tue, Apr 15, 2014 at 10:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Apr 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah.  After a fast review of the custom-scan and cache-scan patches, it
 seems to me that my original fears are largely confirmed: the custom scan
 patch is not going to be sufficient to allow development of any truly new
 plan type.  Yeah, you can plug in some new execution node types, but
 actually doing anything interesting is going to require patching other
 parts of the system.

 Without prejudice to the rest of what you said, this argument doesn't
 hold much water with me.  I mean, anything that our extensibility
 mechanism doesn't support today will require new hooks, but does that
 mean we're never going to add any more hooks?  I sure hope not.

 No, that's not what I said.  ISTM that the argument for the custom-scan
 API is that it allows interesting new things to be done *without further
 modifying the core code*.  But the example application (cache_scan) fails
 to demonstrate that, and indeed seems to be a counterexample.  Whether
 we'd accept cache_scan on its own merits is a separate question.  The
 problem for me is that custom-scan isn't showing that it can support what
 was claimed without doing serious damage to modularity and maintainability
 of the core code.

I think there's two separate things in there, one of which I agree
with and one of which I disagree with.  I agree that we must avoid
damaging the modularity and maintainability of the core code; I don't
agree that custom-scan needs to be able to do interesting things with
zero additional changes to the core code.  If we come up with three
interesting applications for custom scan that require 5 new hooks
between them, I'll consider that a major success - assuming those
hooks don't unduly limit future changes we may wish to make in the
core code.  I think your concern about exposing APIs that may not be
terribly stable is well-founded, but I don't think that means we
shouldn't expose *anything*.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Andres Freund
Hi,

On 2014-04-15 11:07:11 -0400, Robert Haas wrote:
 On Tue, Apr 15, 2014 at 10:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
 [ discussion ]

What I think this discussion shows is that this patch isn't ready for
9.4. The first iteration of the patch came in 2013-11-06. Imo that's
pretty damn late for a relatively complex patch. And obviously we don't
have agreement on the course forward.
I don't think we need to stop discussing, but I think it's pretty clear
that this isn't 9.4 material. And that it's far from Ready for Committer.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 What I think this discussion shows is that this patch isn't ready for
 9.4. The first iteration of the patch came in 2013-11-06. Imo that's
 pretty damn late for a relatively complex patch. And obviously we don't
 have agreement on the course forward.
 I don't think we need to stop discussing, but I think it's pretty clear
 that this isn't 9.4 material. And that it's far from Ready for Committer.

Yeah.  I'm still not exactly convinced that custom-scan will ever allow
independent development of new plan types (which, with all due respect to
Robert, is what it was being sold as last year in Ottawa).  But I'm not
opposed in principle to committing it, if we can find a way to have a
cleaner API for things like setrefs.c.  It seems like late-stage planner
processing in general is an issue for this patch (createplan.c and
subselect.c are also looking messy).  EXPLAIN isn't too great either.

I'm not sure exactly what to do about those cases, but I wonder
whether things would get better if we had the equivalent of
expression_tree_walker/mutator capability for plan nodes.  The state
of affairs in setrefs and subselect, at least, is a bit reminiscent
of the bad old days when we had lots of different bespoke code for 
traversing expression trees.

regards, tom lane




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Kouhei Kaigai
 Andres Freund and...@2ndquadrant.com writes:
  What I think this discussion shows is that this patch isn't ready for
  9.4. The first iteration of the patch came in 2013-11-06. Imo that's
  pretty damn late for a relatively complex patch. And obviously we
  don't have agreement on the course forward.
  I don't think we need to stop discussing, but I think it's pretty
  clear that this isn't 9.4 material. And that it's far from Ready for
 Committer.
 
Yep, today is the expected feature-freeze date for v9.4.
It is a little bit late to get new features into v9.4, unfortunately.

 Yeah.  I'm still not exactly convinced that custom-scan will ever allow
 independent development of new plan types (which, with all due respect to
 Robert, is what it was being sold as last year in Ottawa).  But I'm not
 opposed in principle to committing it, if we can find a way to have a cleaner
 API for things like setrefs.c.  It seems like late-stage planner processing
 in general is an issue for this patch (createplan.c and subselect.c are
 also looking messy).  EXPLAIN isn't too great either.
 
 I'm not sure exactly what to do about those cases, but I wonder whether
 things would get better if we had the equivalent of
 expression_tree_walker/mutator capability for plan nodes.  The state of
 affairs in setrefs and subselect, at least, is a bit reminiscent of the
 bad old days when we had lots of different bespoke code for traversing
 expression trees.

Hmm. If we had something like expression_tree_walker/mutator for plan nodes,
we could pass a walker/mutator function pointer instead of exposing the
static functions that do the recursive work.
If a custom-plan provider (one that has sub-plans) received a callback with
a walker/mutator pointer, all it would have to do for its sub-plans is call
this new plan-tree walking support routine with the supplied walker/mutator.
That seems to me a simpler design than what I did.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com





Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-15 Thread Kouhei Kaigai
 On Tue, Apr 15, 2014 at 10:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  On Mon, Apr 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Yeah.  After a fast review of the custom-scan and cache-scan
  patches, it seems to me that my original fears are largely
  confirmed: the custom scan patch is not going to be sufficient to
  allow development of any truly new plan type.  Yeah, you can plug in
  some new execution node types, but actually doing anything
  interesting is going to require patching other parts of the system.
 
  Without prejudice to the rest of what you said, this argument doesn't
  hold much water with me.  I mean, anything that our extensibility
  mechanism doesn't support today will require new hooks, but does that
  mean we're never going to add any more hooks?  I sure hope not.
 
  No, that's not what I said.  ISTM that the argument for the
  custom-scan API is that it allows interesting new things to be done
  *without further modifying the core code*.  But the example
  application (cache_scan) fails to demonstrate that, and indeed seems
  to be a counterexample.  Whether we'd accept cache_scan on its own
  merits is a separate question.  The problem for me is that custom-scan
  isn't showing that it can support what was claimed without doing
  serious damage to modularity and maintainability of the core code.
 
 I think there's two separate things in there, one of which I agree with
 and one of which I disagree with.  I agree that we must avoid damaging the
 modularity and maintainability of the core code; I don't agree that
 custom-scan needs to be able to do interesting things with zero additional
 changes to the core code.  If we come up with three interesting applications
 for custom scan that require 5 new hooks between them, I'll consider that
 a major success - assuming those hooks don't unduly limit future changes
 we may wish to make in the core code.  I think your concern about exposing
 APIs that may not be terribly stable is well-founded, but I don't think
 that means we shouldn't expose *anything*.
 
I agree 100%.

We usually change hook definitions release by release, and it is the
author's responsibility to follow the newer interface if he continues to
maintain his extension on the newer release as well.
Probably it is a gray area, neither black nor white. If we could design
a perfect interface, that might be good, but it would leave no room for
further evolution.
Of course, that does not justify a poorly designed interface; the
important thing is to find the best way available at this moment. It may
take core refactoring, not just exposing static functions. What I tried
to implement is not the only way to implement it.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-14 Thread Simon Riggs
On 24 March 2014 10:25, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 Brief summary of the current approach that has been revised from my
 original submission through the discussion on pgsql-hackers:

 The plan node was renamed to CustomPlan, instead of CustomScan, because
 it dropped all the hardcoded portions that assume the custom node will
 act as an alternative scan or join method; those assumptions would
 potentially prevent the custom node from acting as other things, such as
 a sort or append.
 Following Tom's suggestion, I put a structure that contains several
 function pointers on the new CustomPlan node, and an extension will
 allocate an object that extends the CustomPlan node.
 It looks like polymorphism in an object-oriented programming language.
 The core backend knows only the abstracted set of methods defined in the
 table of function pointers, and the extension can implement its own
 logic in the callbacks, using private state on the extended object.

I just wanted to add some review comments here. I also apologise for
not reviewing this earlier; I had misunderstood the maturity of the
patch and had assumed it was a request for comments/WIP.

Overall, I very much support the concept of providing for alternate
scans. I like the placement of calls in the optimizer and we'll be
able to do much with that. Other comments follow in the order I
consider them important.

* There is no declarative structure for this at all. I was expecting
to see a way to declare that a specific table might have an alternate
scan path, but we just call the plugin always and it has to separately
make a cache lookup to see if anything extra is needed. The Index AM
allows us to perform scans, yet indexes are very clearly declared and
easily and clearly identifiable. We need the same thing for alternate
plans.

* There is no mechanism at all for maintaining other data structures.
Are we supposed to use the Index AM? Triggers? Or? The lack of clarity
there is disturbing, though I could be simply missing something big
and obvious.

* There is no catalog support. Complex APIs in Postgres typically have
a structure like pg_am which allows the features to be clearly
identified. I'd be worried about our ability to keep track of so many
calls in such pivotal places without that. I want to be able to know
what a plugin is doing, especially when it will likely come in binary
form. I don't see an easy way to have plugins partially override each
other or work together. What happens when I want to use Mr.X's clever
new join plugin at the same time as Mr.Y's GPU accelerator?

* How do we control security? What stops the Custom Scan API from
overriding privileges? Shouldn't the alternate data structures be
recognised as objects so we can grant privileges? Or do we simply say
that if an alternate data structure is linked to a heap, then it has
implied privileges? It would be a shame to implement better security in one
patch and then ignore it in another (from the same author).

All of the above I can let pass in this release, but in the longer
term we need to look for more structure around these ideas so we can
manage and control what happens. The way this is now is quite raw -
suitable for R&D but not for longer term production usage by a wider
audience, IMHO. I wouldn't like to make commitments about the
longevity of this API either; if we accept it, it should have a big
"may change radically" sign hanging on it. Having said that, I am
interested in progress here and I accept that things will look like
this at this stage of the ideas process, so these are not things to
cause delay.

Some things I would like to see change on in this release are...

* It's not clear to me when we load/create the alternate data
structures. That can't happen at _init time. I was expecting this to
look like an infrastructure for unlogged indexes, but it doesn't look
like that either.

* The name Custom makes me nervous. It sounds too generic, as if the
design or objectives for this are a little unclear. AlternateScan
sounds like a better name, since it's clearer that we are scanning an
alternate data structure rather than the main heap.

* The prune hook makes me feel very uneasy. It seems a weirdly specific
implementation detail, made stranger by the otherwise complete lack of data
maintenance API calls. Calling that for every dirty page sounds like
an issue and my patch rejection indicator is flashing red around that.



Two additional use cases I will be looking to explore will be

* Code to make Mat Views recognised as alternate scan targets
* Code to allow queries access to sampled data rather than the fully
detailed data, if the result would be within acceptable tolerance for
the user

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-14 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 [ assorted comments about custom-scan patch, but particularly ]

 * The prune hook makes me feel very uneasy. It seems a weirdly specific
 implementation detail, made stranger by the otherwise complete lack of data
 maintenance API calls. Calling that for every dirty page sounds like
 an issue and my patch rejection indicator is flashing red around that.

Yeah.  After a fast review of the custom-scan and cache-scan patches, it
seems to me that my original fears are largely confirmed: the custom scan
patch is not going to be sufficient to allow development of any truly new
plan type.  Yeah, you can plug in some new execution node types, but
actually doing anything interesting is going to require patching other
parts of the system.  Are we going to say to all comers, "sure, we'll put
a hook call anywhere you like, just ask"?  I can't see this as being the
way to go.

Another way of describing the problem is that it's not clear where the API
boundaries are for potential users of a custom-scan feature.  (Simon said
several things that are closely related to this point.)  One thing I don't
like at all about the patch is its willingness to turn anything whatsoever
into a publicly exported function, which basically says that the design
attitude is there *are* no boundaries.  But that's not going to lead to
anything maintainable.  We're certainly not going to want to guarantee
that these suddenly-exported functions will all now have stable APIs
forevermore.

Overall I concur with Simon's conclusion that this might be of interest
for R&D purposes, but it's hard to see anyone wanting to support a
production feature built on this.  It would be only marginally less
painful than supporting a patch that just adds the equivalent code
to the backend in the traditional way.

So I'm feeling that this was kind of a dead end.  It was worth doing
the legwork to see if this sort of approach could be useful, but the
answer seems like no.

regards, tom lane




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-14 Thread Kouhei Kaigai
 On 24 March 2014 10:25, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 
  Brief summary of the current approach that has been revised from my
  original submission through the discussion on pgsql-hackers:
 
  The plan node was renamed to CustomPlan, instead of CustomScan, because
  it dropped all the hardcoded portions that assume the custom node
  will act as an alternative scan or join method; those assumptions would
  potentially prevent the custom node from acting as other things, such
  as a sort or append.
  Following Tom's suggestion, I put a structure that contains several
  function pointers on the new CustomPlan node, and an extension
  will allocate an object that extends the CustomPlan node.
  It looks like polymorphism in an object-oriented programming language.
  The core backend knows only the abstracted set of methods defined in
  the table of function pointers, and the extension can implement its own
  logic in the callbacks, using private state on the extended object.
 
 I just wanted to add some review comments here. I also apologise for not
 reviewing this earlier; I had misunderstood the maturity of the patch and
 had assumed it was a request for comments/WIP.
 
Thanks for your interest and many comments.

 Overall, I very much support the concept of providing for alternate scans.
 I like the placement of calls in the optimizer and we'll be able to do much
 with that. Other comments follow in the order I consider them important.
 
 * There is no declarative structure for this at all. I was expecting to
 see a way to declare that a specific table might have an alternate scan
 path, but we just call the plugin always and it has to separately make a
 cache lookup to see if anything extra is needed. The Index AM allows us
 to perform scans, yet indexes are very clearly declared and easily and
 clearly identifiable. We need the same thing for alternate plans.
 
 * There is no mechanism at all for maintaining other data structures.
 Are we supposed to use the Index AM? Triggers? Or? The lack of clarity there
 is disturbing, though I could be simply missing something big and obvious.
 
 * There is no catalog support. Complex APIs in Postgres typically have a
 structure like pg_am which allows the features to be clearly identified.
 I'd be worried about our ability to keep track of so many calls in such
 pivotal places without that. I want to be able to know what a plugin is
 doing, especially when it will likely come in binary form. I don't see an
 easy way to have plugins partially override each other or work together.
 What happens when I want to use Mr.X's clever new join plugin at the same
 time as Mr.Y's GPU accelerator?
 
It was an implementation choice. I just followed the usual PG hook manner,
which expects a loaded extension to save the original function pointer
and to call it in turn when its own hook is invoked. Thus, it walks the
chain of extensions if multiple custom providers are loaded.

Even though I initially chose this design, it is an option to have catalog
support that tracks registered custom-scan providers and their metadata:
which function generates paths, which flags are turned on, what kinds of
relations are supported, etc. The optimizer could then skip extensions
that obviously don't support the target relation.
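
For reference, the usual hook-chaining manner looks roughly like this
(add_scan_path_hook and the my_* names are placeholders for this sketch,
not committed symbols):

static add_scan_path_hook_type prev_add_scan_path_hook = NULL;

static void
my_add_scan_path(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
    /* keep the chain alive: call whichever provider was installed before us */
    if (prev_add_scan_path_hook)
        prev_add_scan_path_hook(root, rel, rte);

    /* then add our own custom path, if this relation is supported */
    if (my_provider_supports(rte))
        add_path(rel, (Path *) my_create_custom_path(root, rel));
}

void
_PG_init(void)
{
    /* save the original pointer, install ourselves at the head */
    prev_add_scan_path_hook = add_scan_path_hook;
    add_scan_path_hook = my_add_scan_path;
}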


 * How do we control security? What stops the Custom Scan API from overriding
 privileges? Shouldn't the alternate data structures be recognised as
 objects so we can grant privileges? Or do we simply say that if an
 alternate data structure is linked to a heap, then it has implied
 privileges? It would be a shame to implement better security in one
 patch and then ignore it in another (from the same author).
 
In general, we have no mechanism to prevent C-binary extensions from
overriding the privilege mechanism. An extension can override (existing)
hooks and modify the requiredPerms bits of a RangeTblEntry, which would
eventually cause a privilege bypass. But that is neutral with respect to
the custom-scan API itself. Even if we implement alternative scan logic
on the API, the core backend still checks the required privileges in
ExecCheckRTPerms, called at the start of the executor (unless the
extension author does something strange).


 All of the above I can let pass in this release, but in the longer term
 we need to look for more structure around these ideas so we can manage and
 control what happens. The way this is now is quite raw - suitable for R&D
 but not for longer term production usage by a wider audience, IMHO. I
 wouldn't like to make commitments about the longevity of this API either;
 if we accept it, it should have a big "may change radically" sign hanging
 on it. Having said that, I am interested in progress here and I accept that
 things will look like this at this stage of the ideas process, so these
 are not things to cause delay.
 
 Some things I would like to see change on in this release are...
 
 * It's not clear to me when we load/create the alternate data structures.
 That can't happen at _init time. I was expecting this to look like an
 infrastructure for unlogged indexes, but it doesn't look like that either.

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-04-14 Thread Kouhei Kaigai
 Simon Riggs si...@2ndquadrant.com writes:
  [ assorted comments about custom-scan patch, but particularly ]
 
  * The prune hook makes me feel very uneasy. It seems a weirdly specific
  implementation detail, made stranger by the otherwise complete lack of data
  maintenance API calls. Calling that for every dirty page sounds like
  an issue and my patch rejection indicator is flashing red around that.
 
 Yeah.  After a fast review of the custom-scan and cache-scan patches, it
 seems to me that my original fears are largely confirmed: the custom scan
 patch is not going to be sufficient to allow development of any truly new
 plan type.  Yeah, you can plug in some new execution node types, but actually
 doing anything interesting is going to require patching other parts of the
  system.  Are we going to say to all comers, "sure, we'll put a hook call
  anywhere you like, just ask"?  I can't see this as being the way to go.
 
Here are two different points to be discussed: one is generic to the
custom-plan API, and the other is specific to my cache-only scan
implementation.

Because the existing plan/exec nodes are all built-in, and certain
functionality is consolidated into particular source files (like
createplan.c and setrefs.c), it causes no problem that commonly called
functions are declared static.
The custom-plan API changes this assumption; in other words, it allows
some portion of the jobs of createplan.c or setrefs.c to be done
externally, so it needs the commonly used functions to be external.
Because development proceeded by trial and error, I could not list all
the functions to be made public at once. However, this is not a
fundamental problem; it should be solved during the discussion on
pgsql-hackers.

Regarding the portion specific to the cache-only scan, the same thing
could happen if we wanted to create an extension that tracks vacuuming,
independent of custom-scan.
Usually, an extension utilizes multiple hooks and interfaces to implement
the feature it wants. In the case of the cache-only scan, unfortunately,
PG lacks a way to track heap vacuuming, even though that is needed to
invalidate the cached data. This is an issue unrelated to the custom-scan
API; we would see the same problem if I tried to create an extension to
count the number of records being vacuumed.

 Another way of describing the problem is that it's not clear where the API
 boundaries are for potential users of a custom-scan feature.  (Simon said
 several things that are closely related to this point.)  One thing I don't
 like at all about the patch is its willingness to turn anything whatsoever
 into a publicly exported function, which basically says that the design
 attitude is there *are* no boundaries.  But that's not going to lead to
 anything maintainable.  We're certainly not going to want to guarantee that
 these suddenly-exported functions will all now have stable APIs
 forevermore.
 
I'd like to have *several* existing static functions as (almost) stable
APIs, but not all of them. Indeed, my patch may have picked static
functions to redefine as external somewhat at random; however, that does
not mean custom-plan ultimately requires all of them to be external.
According to my investigation, there are two types of functions to be
exposed:
- A function that walks a plan/exec node tree recursively
  (e.g. create_plan_recurse)
- A function that adjusts internal state of the core backend
  (e.g. fix_expr_common)

At least, these functions are not the majority. I don't think this should
be a strong blocker for the new feature.
(I may have overlooked something, of course; please point it out.)

 Overall I concur with Simon's conclusion that this might be of interest
 for R&D purposes, but it's hard to see anyone wanting to support a production
 feature built on this.  It would be only marginally less painful than
 supporting a patch that just adds the equivalent code to the backend in
 the traditional way.
 
As with the FDW APIs, which we adjusted over the first several releases,
any kind of interface takes time to stabilize. Even if it *initially*
serves R&D purposes (I don't deny that), it can be brushed up to
production grade. I think a feature for R&D purposes is a good starting
point.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-19 Thread Kouhei Kaigai
Hello,

  * Definition of add_join_path_hook
 
  I didn't have an idea for improving the definition and location of this
  hook, so it is still at the tail of add_paths_to_joinrel().
  Its definition was adjusted a bit according to feedback on
  pgsql-hackers: I omitted the mergeclause_list and semifactors
  from the argument list. Indeed, these are specific to the built-in
  MergeJoin logic and easy to reproduce.
 

Since the submission, I have been investigating a better way to place
a hook for adding custom join paths.

Regarding the comment from Tom:
| * The API for add_join_path_hook seems overcomplex, as well as too full
| of implementation details that should remain private to joinpath.c.
| I particularly object to passing the mergeclause lists, which seem unlikely
| to be of interest for non-mergejoin plan types anyway.
| More generally, it seems likely that this hook is at the wrong level of
| abstraction; it forces the hook function to concern itself with a lot of
| stuff about join legality and parameterization (which I note your patch3
| code fails to do; but that's not an optional thing).
| 
The first half has already been done. My trouble is the latter portion.

The overall jobs around add_join_path_hook are as below:
1. Construction of parameterized-path information, saved in
   param_source_rel and extra_lateral_rels.
2. Trying to add mergejoin paths with an underlying Sort node.
3. Trying to add mergejoin/nestloop paths without an underlying Sort node.
4. Trying to add hashjoin paths.

It seems to me the checks for join legality and parameterization are
built into the individual routines for each join algorithm.
(What does the join legality check actually mean?)

It probably makes sense to provide a common utility function to be
called back from the extension, if we can find a part common to all
the join logic; however, I don't have a clear idea of how to cut out
the common portion. Which jobs can be done independently of the join
algorithm?

I'd like to hear ideas around this issue. Of course, I also think it
is still an option to leave this to the extension in the initial version.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
 Sent: Monday, March 17, 2014 9:29 AM
 To: Kaigai Kouhei(海外 浩平); Tom Lane
 Cc: Kohei KaiGai; Stephen Frost; Shigeru Hanada; Jim Mlodgenski; Robert
 Haas; PgHacker; Peter Eisentraut
 Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 Hello,
 
  I adjusted the custom-plan interface patch a little bit for the cache-only
  scan patch; that is a demonstration module for the vacuum-page hook on top
  of the custom-plan interface.

  fix_scan_expr() looks useful to me for custom-plan providers that want to
  implement their own relation-scan logic, even though they can implement it
  using fix_expr_common(), which is already exposed.

  Also, I removed the hardcoded portion from nodeCustom.c, although it may
  make sense to provide a few template functions to be called by custom-plan
  providers, performing the usual common jobs like construction of the
  expr-context, assignment of the result slot, opening relations, and so on.
  I thought of this idea during implementation of the BeginCustomPlan handler.
  (These template functions are not in the attached patch yet.) What is
  your opinion?
 
 The major portion of this patch is not changed from v10.
 
 Thanks,
 --
 NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei
 kai...@ak.jp.nec.com
 
 
  -Original Message-
  From: pgsql-hackers-ow...@postgresql.org
  [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
  Sent: Wednesday, March 12, 2014 1:55 PM
  To: Tom Lane
  Cc: Kohei KaiGai; Stephen Frost; Shigeru Hanada; Jim Mlodgenski;
  Robert Haas; PgHacker; Peter Eisentraut
  Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
  Hello,
 
  The attached two patches are the revised custom-plan interface and
  example usage that implements existing MergeJoin on top of this interface.
 
  According to the discussion last week, I revised the portion where the
  custom node is expected to perform a particular kind of task, like
  scanning a relation, by introducing polymorphism through a set of
  callbacks set by the custom-plan provider.
  So, the core backend can handle this custom-plan node as just an
  abstracted plan node, with no anticipation of what it does.
  Even though the subject of this message says custom-scan, I'd like
  to name the interface custom-plan instead, because it is now entirely
  up to the extension whether it scans a particular relation.
 
  The definitions of the Custom data types were simplified:
 
  typedef struct CustomPath
  {
      Path        path;
      const struct CustomPathMethods *methods;
  } CustomPath;

  typedef struct CustomPlan
  {
      Plan        plan;
      const struct CustomPlanMethods *methods;
  } CustomPlan;

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-06 Thread Tom Lane
Kouhei Kaigai kai...@ak.jp.nec.com writes:
 I expected to include simple function pointers for copying and text-output
 as follows:

   typedef struct {
       Plan                 plan;
       :
       NodeCopy_function    node_copy;
       NodeTextOut_function node_textout;
   } Custom;

I was thinking more like

typedef struct CustomPathFuncs {
const char *name;   /* used for debugging purposes only */
NodeCopy_function node_copy;
NodeTextOut_function node_textout;
... etc etc etc ...
} CustomPathFuncs;

typedef struct CustomPath {
Path path;
const CustomPathFuncs *funcs;
... maybe a few more fields here, but not too darn many ...
} CustomPath;

and similarly for CustomPlan.

The advantage of this way is it's very cheap for (what I expect will be)
the common case where an extension has a fixed set of support functions
for its custom paths and plans.  It just declares a static constant
CustomPathFuncs struct, and puts a pointer to that into its paths.

If an extension really needs to set the support functions on a per-object
basis, it can do this:

typedef struct MyCustomPath {
   CustomPath cpath;
   CustomPathFuncs funcs;
   ... more fields ...
} MyCustomPath;

and then initialization of a MyCustomPath would include

mypath->cpath.funcs = &mypath->funcs;
mypath->funcs.node_copy = MyCustomPathCopy;
... etc etc ...

In this case we're arguably wasting one pointer worth of space in the
path, but considering the number of function pointers such a path will be
carrying, I don't think that's much of an objection.
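
For instance, the common case would look something like this (the my_*
names are hypothetical, and this assumes the patch gives CustomPath a node
tag usable with makeNode()):

static const CustomPathFuncs my_path_funcs = {
    "my_provider",      /* name: for debugging only, never looked up */
    my_path_copy,
    my_path_textout,
    /* ... etc ... */
};

static Path *
my_create_path(PlannerInfo *root, RelOptInfo *rel)
{
    CustomPath *cpath = makeNode(CustomPath);

    cpath->funcs = &my_path_funcs;
    /* ... fill in cost fields and so on ... */
    return &cpath->path;
}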

 So?  If you did that, then you wouldn't have renumbered the Vars as
 INNER/OUTER.  I don't believe that CUSTOM_VAR is necessary at all; if it
 is needed, then there would also be a need for an additional tuple slot
 in executor contexts, which you haven't provided.

 For example, the enhanced postgres_fdw fetches the result set of a
 remote join query, so a tuple contains fields that come from both sides.
 In this case, what varno would be suitable to use?

Not sure what we'd do for the general case, but CUSTOM_VAR isn't the
solution.  Consider for example a join where both tables supply columns
named "id" --- if you put them both in one tupledesc then there's no
non-kluge way to identify them.

Possibly the route to a solution involves adding another plan-node
callback function that ruleutils.c would use for printing Vars in custom
join nodes.  Or maybe we could let the Vars keep their original RTE
numbers, though that would complicate life at execution time.

Anyway, if we're going to punt on add_join_path_hook for the time
being, this problem can probably be left to solve later.  It won't
arise for simple table-scan cases, nor for single-input plan nodes
such as sorts.

regards, tom lane




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-06 Thread Kouhei Kaigai
 I was thinking more like
 
 typedef struct CustomPathFuncs {
   const char *name;   /* used for debugging purposes only */
   NodeCopy_function node_copy;
   NodeTextOut_function node_textout;
   ... etc etc etc ...
 } CustomPathFuncs;
 
 typedef struct CustomPath {
   Path path;
   const CustomPathFuncs *funcs;
   ... maybe a few more fields here, but not too darn many ...
 } CustomPath;
 
 and similarly for CustomPlan.
 
 The advantage of this way is it's very cheap for (what I expect will be)
 the common case where an extension has a fixed set of support functions
 for its custom paths and plans.  It just declares a static constant
 CustomPathFuncs struct, and puts a pointer to that into its paths.
 
 If an extension really needs to set the support functions on a per-object
 basis, it can do this:
 
  typedef struct MyCustomPath {
CustomPath cpath;
CustomPathFuncs funcs;
... more fields ...
 } MyCustomPath;
 
 and then initialization of a MyCustomPath would include
 
   mypath->cpath.funcs = &mypath->funcs;
   mypath->funcs.node_copy = MyCustomPathCopy;
   ... etc etc ...
 
 In this case we're arguably wasting one pointer worth of space in the path,
 but considering the number of function pointers such a path will be carrying,
 I don't think that's much of an objection.
 
That is exactly the same as my expectation; no objection here.
Thanks for your clarification.


  So?  If you did that, then you wouldn't have renumbered the Vars as
  INNER/OUTER.  I don't believe that CUSTOM_VAR is necessary at all; if
  it is needed, then there would also be a need for an additional tuple
  slot in executor contexts, which you haven't provided.
 
  For example, the enhanced postgres_fdw fetches the result set of a
  remote join query, so a tuple contains fields that come from both sides.
  In this case, what varno would be suitable to use?
 
 Not sure what we'd do for the general case, but CUSTOM_VAR isn't the solution.
 Consider for example a join where both tables supply columns named "id"
 --- if you put them both in one tupledesc then there's no non-kluge way
 to identify them.
 
 Possibly the route to a solution involves adding another plan-node callback
 function that ruleutils.c would use for printing Vars in custom join nodes.
 Or maybe we could let the Vars keep their original RTE numbers, though that
 would complicate life at execution time.
 
My preference is the former, because complication at execution time may
cause performance degradation.
Once two tuples get joined in a custom node, only the extension can know
which relation a particular attribute of the unified tuple originated from.
So, it seems reasonable to me that the extension has to provide a hint to
resolve the Var naming:
probably another callback that provides a translation table for a Var
node that references the custom plan but originated in either subtree.
(It would look like translated_vars in prepunion.c.)

For example, assume there is a Var node with INDEX_VAR in the tlist of a
custom plan. At execution time it eventually references ecxt_scantuple,
and this tuple slot will hold a joined tuple originating from two
relations. If its varattno=9 came from the column varno=1/varattno=3, we
would like to print its original name. If we have a translation table
like translated_vars, it allows an attribute number on the custom plan to
be translated back to the original one.
Even though it might be an abuse of INDEX_VAR, it seems a good idea to me.
Also, I don't want to redefine the meaning of INNER_VAR/OUTER_VAR,
because a custom plan may have both left and right subtrees, so it makes
sense to support the case where both tuple slots are available.
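
A sketch of what such a translation callback could look like (everything
here is hypothetical, modelled on translated_vars):

/* origin of one output column of the custom plan */
typedef struct CustomVarOrigin
{
    Index       varno;      /* RTE index the column came from */
    AttrNumber  varattno;   /* attribute number in that relation */
} CustomVarOrigin;

/*
 * Given an attribute number of the custom plan's result tuple (e.g. an
 * INDEX_VAR reference), report which relation/column it originated from,
 * so that ruleutils.c can print the original column name in EXPLAIN.
 */
typedef CustomVarOrigin (*GetVarOrigin_function) (struct CustomPlan *cplan,
                                                  AttrNumber resno);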

 Anyway, if we're going to punt on add_join_path_hook for the time being,
 this problem can probably be left to solve later.  It won't arise for simple
 table-scan cases, nor for single-input plan nodes such as sorts.
 
Yes, it is a problem if the number of input plans is larger than 1.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-05 Thread Kouhei Kaigai
Tom, thanks for your detailed comments.

 I apologize for not having paid much attention to this thread so far.
 It kept getting stuck on my "to look at later" queue.  Anyway, I've taken
 a preliminary look at the v7 patch now.
 
 While the patch seems roughly along the lines of what we talked about last
 PGCon, I share Stephen's unease about a lot of the details.  It's not
 entirely clear that these hooks are really good for anything, and it's even
 less clear what APIs the hook functions should be expected to depend on.
 I really do not like the approach embodied in the later patches of "oh,
 we'll just expose whatever static planner functions seem convenient".
 That's not an approach that's available when doing actual external
 development of an extension, and even if it were that doesn't make it a
 good idea.  The larger the exposed surface of functions the harder it is
 to know what's safe to change.
 
Hmm. Exposing a function needs a clear rationale, not just
convenience.

 Anyway, on to specifics:
 
 * Please drop the whole register_custom_provider/get_custom_provider API.
 There is no reason other than debugging for a provider to have a name at
 all, and if we expect providers to have unique names then that creates a
 collision risk for independently-created extensions.  AFAICS, it's
 sufficient for a function hooked into one of the add-a-path hooks to include
 a pointer to a struct-of-function-pointers in the Path object it returns,
 and similarly the CustomScan Plan object can contain a pointer inserted
 when it's created.  I don't object to having a name field in the function
 pointer structs for debugging reasons, but I don't want any lookups being
 done on it.
 
One thing I was worried about is how copyObject() and nodeToString() would
support the set of function-pointer tables around a custom-scan node;
however, you suggested changing the assumption here, so I think these
functions become unnecessary.

 * The function-struct pointers should be marked const in the referencing
 nodes, to indicate that the core code won't be modifying or copying them.
 In practice they'd probably be statically allocated constants in the
 extensions anyway.
 
OK,

 * The patch does lots of violence to the separation between planner and
 executor, starting with the decision to include nodes/relation.h in
 executor.h.  That will not do at all.  I see that you did that because you
 wanted to make ExecSupportsMarkRestore take a Path, but we need some other
 answer.  One slightly grotty answer is to invent two different customscan
 Plan types, one that supports mark/restore and one that doesn't, so that
 ExecSupportsMarkRestore could still just look at the Plan type tag.
 (BTW, path->pathtype is supposed to contain the node tag of the Plan node
 that the path would produce.  Putting T_CustomPath in it is entirely
 tone-deaf.)  Another way would be to remove ExecSupportsMarkRestore in
 favor of some new function in the planner; but it's good to keep it in
 execAmi.c since that has other knowledge of which plan types support
 mark/restore.
 
OK, I'll add one derivative plan node type,
CustomScanMarkRestore for instance.
It would probably be populated in create_customscan_plan()
according to the flag being set on the CustomPath.

 * More generally, I'm not convinced that exactly one Path type and exactly
 one Plan type is going to get us very far.  It seems rather ugly to use
 the same Plan type for both scan and join nodes, and what will happen if
 somebody wants to build a custom Append node, or something else that has
 neither zero nor two subplans?
 
In the previous discussion, a CustomJoin seemed like nonsense because we
know of only a limited number of join algorithms (nest-loop, hash-join and
merge-join), unlike the wide variety of logic for scanning relations.
Also, IIUC, someone didn't want to add custom-something node types for
each built-in type.
So, we concluded at that time to use the CustomScan node to replace
built-in joins / scans.
Regarding the Append node, either CustomScan probably needs to be enhanced
to carry a list of subplans, or we add an individual CustomAppend node, or
an opaque CustomPlan may be sufficient if it handles EXPLAIN by itself.

 * nodeCustom.h is being completely abused.  That should only export the
 functions in nodeCustom.c, which are going to be pretty much one-liners
 anyway.  The right place to put the function pointer struct definitions
 is someplace else.  I'd be inclined to start by separating the function
 pointers into two structs, one for functions needed for a Path and one for
 functions needed for a Plan, so that you don't have this problem of having
 to import everything the planner knows into an executor header or vice versa.
 Most likely you could just put the Path function pointer struct declaration
 next to CustomPath in relation.h, and the one for Plans next to CustomPlan
 (or the variants thereof) in plannodes.h.
 
Yes. I didn't have a clear idea of where we should put the definition of 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-05 Thread Tom Lane
Kouhei Kaigai kai...@ak.jp.nec.com writes:
 * Please drop the whole register_custom_provider/get_custom_provider API.

 One thing I was worried about is how copyObject() and nodeToString() would
 support the set of function-pointer tables around a custom-scan node;
 however, you suggested changing the assumption here, so I think these
 functions become unnecessary.

If we allow the extension to control copyObject behavior, it can do what
it likes with the function-struct pointer.  I think the typical case would
be that it's a simple pointer to a never-copied static constant.  But you
could imagine that it's a pointer to a struct embedded further down in the
custom path or plan node, if the extension wants different functions for
different plans.

If we had to support stringToNode() for paths or plans, things would get
much more complicated, but we don't (and there are already lots of other
things that would be difficult for that).

 The reason why CustomScan is derived from Scan is that some backend code
 wants to know the rtindex of the relation referenced by this CustomScan.
 The scanrelid of Scan is used in three places: nodeCustom.c, setrefs.c and
 explain.c. The usage in nodeCustom.c is just for service routines, and the
 usage in setrefs.c can be moved to the extension according to your suggestion.
 We need to investigate the usage in explain.c; ExplainPreScanNode() walks
 the nodes to collect the relids referenced in this query. If we don't
 want to add a callback for this specific usage, it is a reasonable choice
 to show the backend the associated scanrelid of the CustomScan.

I think we have to add another callback for that :-(.  It's a pain since
it's such a trivial point; but the existing code cannot support a custom
node referencing more than one RTE, which seems possible for join or
append type cases.
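
Concretely, that callback could be as small as something like this (the
name and shape are hypothetical):

/* report every RTE index this custom node references, for EXPLAIN etc. */
typedef Bitmapset *(*GetCustomRelids_function) (struct CustomPlan *cplan);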

 Probably, the hunk around the show_customscan_info() call can be entirely
 moved to the extension side. If so, I want ExplainNode() to be an extern
 function, because that allows extensions to print underlying plan nodes.

I haven't looked at what explain.c would have to expose to make this
workable, but yeah, we will probably have to export a few things.

 Are you saying the hard-wired portion in setrefs.c can be moved to the
 extension side? If fix_scan_expr() became an extern function, I think
 that would be feasible.

My recollection is that fix_scan_expr() is a bit specialized.  Not sure
exactly what we'd have to export there --- but we'd have to do it anyway.
What you had in the patch was a hook that could be called, but no way
for it to do what it would likely need to do.

 * Get rid of the CUSTOM_VAR business, too (including custom_tupdesc).

 In the case of a Var node that references joined relations, it is replaced
 by either INNER_VAR or OUTER_VAR according to the location of the
 underlying relation. That eventually makes ExecEvalScalarVar() reference
 either ecxt_innertuple or ecxt_outertuple; however, this is problematic if
 we have already consolidated tuples coming from both sides into one.

So?  If you did that, then you wouldn't have renumbered the Vars as
INNER/OUTER.  I don't believe that CUSTOM_VAR is necessary at all;
if it is needed, then there would also be a need for an additional
tuple slot in executor contexts, which you haven't provided.

 For example, the enhanced postgres_fdw fetches the result set of a remote
 join query, so a tuple contains fields coming from both sides.
 In this case, what varno would be suitable to put there?

That would be a scan situation, and the vars could reference the scan
tuple slot.  Which in fact was the implementation you were using, so
how is CUSTOM_VAR adding anything?

 * Why is there more than one call site for add_scan_path_hook?  I don't
 see the need for the calling macro with the randomly inconsistent name,
 either.

 Where is the best place to do it? Even though I cannot imagine extensions
 running a sub-query or CTE, their paths are constructed during
 set_rel_size(), so I had to put the hook in each of the set_*_pathlist()
 functions.

Hm.  We could still call the hook in set_rel_pathlist, if we were to
get rid of the individual calls to set_cheapest and do that in one
spot at the bottom of set_rel_pathlist (after the hook call).  Calling
set_cheapest in one place seems more consistent anyway.
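
For concreteness, a sketch of that arrangement (the hook type and name here
are illustrative only):

    typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                                RelOptInfo *rel,
                                                Index rti,
                                                RangeTblEntry *rte);
    set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;

    static void
    set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                     Index rti, RangeTblEntry *rte)
    {
        /* ... existing per-relation-kind path generation ... */

        /* one call site where extensions may add custom paths */
        if (set_rel_pathlist_hook)
            (*set_rel_pathlist_hook) (root, rel, rti, rte);

        /* one place where the cheapest paths are selected */
        set_cheapest(rel);
    }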

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-05 Thread Kouhei Kaigai
 Kouhei Kaigai kai...@ak.jp.nec.com writes:
  * Please drop the whole register_custom_provider/get_custom_provider
 API.
 
  One thing I was worrying about is how copyObject() and nodeToString()
  would support the set of function-pointer tables around the custom-scan
  node; however, you suggested changing that assumption here, so I think
  these functions become unnecessary.
 
 If we allow the extension to control copyObject behavior, it can do what
 it likes with the function-struct pointer.  I think the typical case would
 be that it's a simple pointer to a never-copied static constant.  But you
 could imagine that it's a pointer to a struct embedded further down in the
 custom path or plan node, if the extension wants different functions for
 different plans.
 
 If we had to support stringToNode() for paths or plans, things would get
 much more complicated, but we don't (and there are already lots of other
 things that would be difficult for that).
 
I expected to include simple function pointers for copying and text-output
as follows:

  typedef struct {
      Plan                 plan;
      :
      NodeCopy_function    node_copy;
      NodeTextOut_function node_textout;
  } Custom;

Then, the sub-routine in copyfuncs.c would be:
  static Custom *
  _copyCustom(const Custom *from)
  {
      return from->node_copy(from);
  }

Do we have the same picture in mind here? This allows a Custom node to have
polymorphism over the node type it extends.
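
Presumably the outfuncs.c counterpart would be the same kind of one-line
dispatch (a sketch, assuming the node_textout member above):

      static void
      _outCustom(StringInfo str, const Custom *node)
      {
          node->node_textout(str, node);
      }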

Sorry, I got a little bit confused. Is the function-struct pointer you
mentioned something different from a usual function pointer?


  The reason why CustomScan is derived from Scan is that some backend
  code wants to know the rtindex of the relation referenced by this
  CustomScan. The scanrelid of Scan is used in three places: nodeCustom.c,
  setrefs.c and explain.c. The usage in nodeCustom.c is just for service
  routines, and the usage in setrefs.c can be moved to the extension
  according to your suggestion. We need to investigate the usage in
  explain.c; ExplainPreScanNode() walks the nodes to collect the relids
  referenced in this query. If we don't want to add a callback for this
  specific usage, it is a reasonable choice to show the backend the
  associated scanrelid of the CustomScan.
 
 I think we have to add another callback for that :-(.  It's a pain since
 it's such a trivial point; but the existing code cannot support a custom
 node referencing more than one RTE, which seems possible for join or append
 type cases.
 
That's a more generic approach; I like it.
It probably removes the Scan characteristic of CustomScan from the core
backend's point of view. It shall perform as an opaque Plan node that may
scan, join, sort or something else, so a more appropriate name may be
CustomPlan or simply Custom.

  Probably, the hunk around show_customscan_info() call can be entirely
  moved to the extension side. If so, I want ExplainNode() being an
  extern function, because it allows extensions to print underlying
 plan-nodes.
 
 I haven't looked at what explain.c would have to expose to make this workable,
 but yeah, we will probably have to export a few things.
 
OK,

  Are you saying the hard-wired portion in setrefs.c can be moved to the
  extension side? If fix_scan_expr() become extern function, I think it
  is feasible.
 
 My recollection is that fix_scan_expr() is a bit specialized.  Not sure
 exactly what we'd have to export there --- but we'd have to do it anyway.
 What you had in the patch was a hook that could be called, but no way for
 it to do what it would likely need to do.
 
It probably needs to be exported. It walks the supplied node tree and
eventually calls record_plan_function_dependency() for each function it
finds; that logic should not have to be reinvented in the extension.
Anyway, my reworking of the patch will make clear which static functions
need to be exposed. Please wait for a while.
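
As a rough sketch (my assumption) of the minimum an extension-visible
walker would need, boiled down from fix_expr_common():

    static bool
    custom_fix_expr_walker(Node *node, PlannerInfo *root)
    {
        if (node == NULL)
            return false;
        /* record that the plan depends on any function it calls;
         * the real fix_expr_common() handles more node types */
        if (IsA(node, FuncExpr))
            record_plan_function_dependency(root,
                                            ((FuncExpr *) node)->funcid);
        return expression_tree_walker(node, custom_fix_expr_walker,
                                      (void *) root);
    }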

  * Get rid of the CUSTOM_VAR business, too (including custom_tupdesc).
 
  In the case of a Var node that references joined relations, it shall be
  replaced with either INNER_VAR or OUTER_VAR according to the location of
  the underlying relations. That eventually makes ExecEvalScalarVar()
  reference either ecxt_innertuple or ecxt_outertuple; however, it is
  problematic if we have already consolidated the tuples coming from both
  sides into one.
 
 So?  If you did that, then you wouldn't have renumbered the Vars as
 INNER/OUTER.  I don't believe that CUSTOM_VAR is necessary at all; if it
 is needed, then there would also be a need for an additional tuple slot
 in executor contexts, which you haven't provided.
 
  For example, the enhanced postgres_fdw fetches the result set of a
  remote join query, so a tuple contains fields coming from both sides.
  In this case, what varno would be suitable to put there?
 
 That would be a scan situation, and the vars could reference the scan tuple
 slot.  Which in fact was the implementation you were using, so how is
 CUSTOM_VAR adding anything?
 
Let me sort out the points.
If 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Robert Haas
On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote:
 As I mentioned
 up-thread, I'd really like to see FDW join push-down, FDW aggregate
 push-down, parallel query execution, and parallel remote-FDW execution
 and I don't see this CustomScan approach as the right answer to any of
 those.

 In accordance with the above, what I'd like to see with this patch is
 removal of the postgres_fdw changes and any changes which were for that
 support.  In addition, I'd like to understand why 'ctidscan' makes any
 sense to have as an example of what to use this for- if that's valuable,
 why wouldn't we simply implement that in core?  I do want an example in
 contrib of how to properly use this capability, but I don't think that's
 it.

I suggested that example to KaiGai at last year's PGCon.  It may
indeed be something we want to have in core, but right now we don't.

More generally, I think this discussion is focusing on the wrong set
of issues.  The threshold issue for this patch is whether there is a
set of hook points that enable a workable custom-scan functionality,
and whether KaiGai has correctly identified them.  In other words, I
think we should be worrying about whether KaiGai's found all of the
places that need to be modified to support a custom scan, and whether
the modifications he's made to each of those places are correct and
adequate.  Whether he's picked the best possible example does not
strike me as a matter of principal concern, and it's far too late to
tell him he's got to go pick a different one at this point anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 Do you think it would make sense if my submission were only the interface
 portion, without a working example? 

No, we're pretty strongly against putting in interfaces which don't have
working examples in contrib- for one thing, we want to know when we
break it.

 The purpose of the ctidscan module is, like
 postgres_fdw, to demonstrate the usage of the custom-scan interface at a
 small enough code scale. If tons of example code were attached, nobody
 would want to review the patch.

I gathered that's why it was included.  Is the plan to eventually submit
something larger to go into -contrib which will use this interface?  Or
will it always be external?

 The cache_scan module that Haribabu and I are discussing in another
 thread might also be a good demonstration of the custom-scan interface;
 however, its code is a bit larger than ctidscan.

That does sound interesting though I'm curious about the specifics...

  For one thing, an example where you could have this CustomScan node calling
  other nodes underneath would be interesting.  I realize the CTID scan can't
  do that directly but I would think your GPU-based system could; after all,
  if you're running a join or an aggregate with the GPU, the rows could come
  from nearly anything.  Have you considered that, or is the expectation that
  users will just go off and access the heap and/or whatever indexes directly,
  like ctidscan does?  How would such a requirement be handled?
  
 In case a custom-scan node has underlying nodes, it shall be invoked using
 ExecProcNode as the built-in nodes are; it will then be able to fetch the
 tuples coming from the underlying nodes. Of course, a custom-scan provider
 can present tuples coming from somewhere else as if they came from an
 underlying relation; that is the responsibility of the extension module. In
 some cases, it shall be required to return junk system attributes, like
 ctid, for row-level locks or table updating.
 That is also the responsibility of the extension module (or it should not
 add a custom path if the custom-scan provider cannot perform as required).

Right, tons of work to do to make it all fit together and play nice-
what I was trying to get at is: has this actually been done?  Is the GPU
extension that you're talking about as the use-case for this been
written?  How does it handle all of the above?  Or are we going through
all these gyrations in vain hope that it'll actually all work when
someone tries to use it for something real?

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Ashutosh Bapat
On Tue, Mar 4, 2014 at 7:39 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote:
  As I mentioned
  up-thread, I'd really like to see FDW join push-down, FDW aggregate
  push-down, parallel query execution, and parallel remote-FDW execution
  and I don't see this CustomScan approach as the right answer to any of
  those.
 
  In accordance with the above, what I'd like to see with this patch is
  removal of the postgres_fdw changes and any changes which were for that
  support.  In addition, I'd like to understand why 'ctidscan' makes any
  sense to have as an example of what to use this for- if that's valuable,
  why wouldn't we simply implement that in core?  I do want an example in
  contrib of how to properly use this capability, but I don't think that's
  it.

 I suggested that example to KaiGai at last year's PGCon.  It may
 indeed be something we want to have in core, but right now we don't.

 More generally, I think this discussion is focusing on the wrong set
 of issues.  The threshold issue for this patch is whether there is a
 set of hook points that enable a workable custom-scan functionality,
 and whether KaiGai has correctly identified them.  In other words, I
 think we should be worrying about whether KaiGai's found all of the
 places that need to be modified to support a custom scan, and whether
 the modifications he's made to each of those places are correct and
 adequate.  Whether he's picked the best possible example does not
 strike me as a matter of principal concern, and it's far too late to
 tell him he's got to go pick a different one at this point anyway.


There are many places in the planner and optimizer code where we create
various types of paths, and the number of such paths is significant,
if not large. If we want the custom-scan contrib module to work in all
those cases (which seems to be the intention here), then we have to expose
many hooks, and I don't think all of those hooks have been identified.
The second problem is that the functions which create those paths have
signatures that are difficult to expose as hooks. Take the example of the
join hook that was exposed: these function signatures do get changed from
time to time, and thus the corresponding hooks need to be changed too.
This is going to be a maintenance burden.

So, unless we have some way of exposing these hooks such that the
definitions of the hooks are independent of the internal function
signatures, supporting custom scans looks difficult.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
 During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries
 to foreign servers, those would be fired while EXPLAINing a query as well.
 We want to avoid that. Instead, we can run EXPLAIN on that query at the
 foreign server. But again, not all foreign servers would be able to EXPLAIN
 the query, e.g. file_fdw. Or we could totally avoid firing queries during
 ExecInitNode() if it's for EXPLAIN (except for ANALYZE, maybe).

Agreed that we wouldn't want to actually run a query when it's just
being explain'd.  If the FDW can't tell the difference then we'd need to
address that, of course.  A similar issue would, presumably, be around
prepare/execute, though I haven't looked yet.  These kinds of issues are
why it was option '#2' instead of '#1'. :)  I'm not sure they're able to
be addressed. :/

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Kohei KaiGai
2014-03-04 23:09 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote:
 As I mentioned
 up-thread, I'd really like to see FDW join push-down, FDW aggregate
 push-down, parallel query execution, and parallel remote-FDW execution
 and I don't see this CustomScan approach as the right answer to any of
 those.

 In accordance with the above, what I'd like to see with this patch is
 removal of the postgres_fdw changes and any changes which were for that
 support.  In addition, I'd like to understand why 'ctidscan' makes any
 sense to have as an example of what to use this for- if that's valuable,
 why wouldn't we simply implement that in core?  I do want an example in
 contrib of how to properly use this capability, but I don't think that's
 it.

 I suggested that example to KaiGai at last year's PGCon.  It may
 indeed be something we want to have in core, but right now we don't.

 More generally, I think this discussion is focusing on the wrong set
 of issues.  The threshold issue for this patch is whether there is a
 set of hook points that enable a workable custom-scan functionality,
 and whether KaiGai has correctly identified them.  In other words, I
 think we should be worrying about whether KaiGai's found all of the
 places that need to be modified to support a custom scan, and whether
 the modifications he's made to each of those places are correct and
 adequate.  Whether he's picked the best possible example does not
 strike me as a matter of principal concern, and it's far too late to
 tell him he's got to go pick a different one at this point anyway.

That is definitely the point to be discussed here. Even though I *believe*
I put in the callbacks needed to implement an alternative join / scan,
someone else's viewpoint may lead to a different conclusion.

At least, I could implement a custom scan as an alternative to a join
using postgres_fdw; however, it's uncertain whether I have covered
all the possible cases we should care about.
So, I'd like to see comments from others.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 More generally, I think this discussion is focusing on the wrong set
 of issues.  The threshold issue for this patch is whether there is a
 set of hook points that enable a workable custom-scan functionality,
 and whether KaiGai has correctly identified them.

Right- I was trying to hit on that in my email this morning.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
 During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries
 to foreign servers, those would be fired while EXPLAINing a query as well.
 We want to avoid that. Instead, we can run EXPLAIN on that query at the
 foreign server. But again, not all foreign servers would be able to EXPLAIN
 the query, e.g. file_fdw. Or we could totally avoid firing queries during
 ExecInitNode() if it's for EXPLAIN (except for ANALYZE, maybe).

 Agreed that we wouldn't want to actually run a query when it's just
 being explain'd.  If the FDW can't tell the difference then we'd need to
 address that, of course.

EXEC_FLAG_EXPLAIN_ONLY ...
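
I.e., a sketch of the test in an FDW's begin-scan callback (the helper
name is invented):

    static void
    myfdw_BeginForeignScan(ForeignScanState *node, int eflags)
    {
        /* do no real work when the plan is only being EXPLAINed */
        if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
            return;

        myfdw_start_remote_query(node);    /* hypothetical helper */
    }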

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Kohei KaiGai
2014-03-04 23:10 GMT+09:00 Stephen Frost sfr...@snowman.net:
 The cache_scan module that I and Haribabu are discussing in another
 thread also might be a good demonstration for custom-scan interface,
 however, its code scale is a bit larger than ctidscan.

 That does sound interesting though I'm curious about the specifics...

This module caches a subset of the columns, not all of them, and thus can
hold a much larger number of records in a given amount of RAM than the
standard buffer cache.
It is built on top of the custom-scan node, and also adds a new hook
for a callback on page vacuuming to invalidate its cache entries.
(By the way, I originally designed this module to demonstrate the on-vacuum
hook, because I had already made ctidscan and the postgres_fdw enhancement
for the custom-scan node.)

  For one thing, an example where you could have this CustomScan node calling
  other nodes underneath would be interesting.  I realize the CTID scan can't
  do that directly but I would think your GPU-based system could; after all,
  if you're running a join or an aggregate with the GPU, the rows could come
  from nearly anything.  Have you considered that, or is the expectation that
  users will just go off and access the heap and/or whatever indexes 
  directly,
  like ctidscan does?  How would such a requirement be handled?
 
 In case a custom-scan node has underlying nodes, it shall be invoked using
 ExecProcNode as the built-in nodes are; it will then be able to fetch the
 tuples coming from the underlying nodes. Of course, a custom-scan provider
 can present tuples coming from somewhere else as if they came from an
 underlying relation; that is the responsibility of the extension module. In
 some cases, it shall be required to return junk system attributes, like
 ctid, for row-level locks or table updating.
 That is also the responsibility of the extension module (or it should not
 add a custom path if the custom-scan provider cannot perform as required).

 Right, tons of work to do to make it all fit together and play nice-
 what I was trying to get at is: has this actually been done?  Is the GPU
 extension that you're talking about as the use-case for this been
 written?

It's a chicken-and-egg problem, because the implementation of the extension
module fully depends on the interface from the backend. Unlike the
commit-fest, there is no deadline for my extension module, so I put a higher
priority on the submission of the custom-scan node than on the extension.
However, the GPU extension is not purely theoretical. I had implemented
a prototype using the FDW APIs, and it was able to accelerate sequential
scans if the query had sufficiently complicated qualifiers.

See the movie (from 2:45). The table t1 is a regular table, and t2 is a
foreign table. Both of them have the same contents; however, the response
time of the query is much faster if GPU acceleration is working.
http://www.youtube.com/watch?v=xrUBffs9aJ0
So, I'm confident that GPU acceleration will show a performance gain once it
can run on regular tables, not only foreign tables.

 How does it handle all of the above?  Or are we going through
 all these gyrations in vain hope that it'll actually all work when
 someone tries to use it for something real?

I'm not talking about anything difficult. If a junk attribute requires
returning the ctid of the tuple, the custom-scan provider reads a tuple of
the underlying relation and includes the correct item pointer. If the custom
scan is designed to run on the cache, all it needs to do is reconstruct a
tuple with the correct item pointer (thus the cache needs to hold the ctid
as well). That's all I did in the cache_scan module.
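
For instance, a cache-based provider returning a tuple with a valid ctid
might do something like this (a sketch; the cached-row layout is
hypothetical):

    static TupleTableSlot *
    cache_scan_store_row(TupleDesc tupdesc, CachedRow *entry,
                         TupleTableSlot *slot)
    {
        HeapTuple   tuple = heap_form_tuple(tupdesc,
                                            entry->values, entry->isnull);

        /* restore the item pointer the row had in the underlying relation */
        tuple->t_self = entry->ctid;

        return ExecStoreTuple(tuple, slot, InvalidBuffer, true);
    }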

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
  During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries
  to foreign servers, those would be fired while EXPLAINing a query as well.
  We want to avoid that. Instead, we can run EXPLAIN on that query at the
  foreign server. But again, not all foreign servers would be able to
  EXPLAIN the query, e.g. file_fdw. Or we could totally avoid firing queries
  during ExecInitNode() if it's for EXPLAIN (except for ANALYZE, maybe).
 
  Agreed that we wouldn't want to actually run a query when it's just
  being explain'd.  If the FDW can't tell the difference then we'd need to
  address that, of course.
 
 EXEC_FLAG_EXPLAIN_ONLY ...

Yeah, figured there should be a way.  Still not sure that kicking the
query off from ExecInitNode() is a good idea though.  Perhaps it could
be optional somehow.  I really like the idea of being able to make
Append work in an async mode where it's pulling data from multiple
sources at the same time, but it's a fair bit of work.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Mar 3, 2014 at 5:15 PM, Stephen Frost sfr...@snowman.net wrote:
  In accordance with the above, what I'd like to see with this patch is
  removal of the postgres_fdw changes and any changes which were for that
  support.  In addition, I'd like to understand why 'ctidscan' makes any
  sense to have as an example of what to use this for- if that's valuable,
  why wouldn't we simply implement that in core?  I do want an example in
  contrib of how to properly use this capability, but I don't think that's
  it.
 
 I suggested that example to KaiGai at last year's PGCon.  It may
 indeed be something we want to have in core, but right now we don't.

Alright- so do you feel that the simple ctidscan use-case is a
sufficient justification and example of how this can be generally
useful that we should be adding these hooks to core..?  I'm willing to
work through the patch and clean it up this weekend if we agree that
it's useful and unlikely to immediately be broken by expected changes..

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Tom Lane
I apologize for not having paid much attention to this thread so far.
It kept getting stuck on my "to look at later" queue.  Anyway, I've
taken a preliminary look at the v7 patch now.

While the patch seems roughly along the lines of what we talked about
last PGCon, I share Stephen's unease about a lot of the details.  It's
not entirely clear that these hooks are really good for anything, and
it's even less clear what APIs the hook functions should be expected
to depend on.  I really do not like the approach embodied in the later
patches of "oh, we'll just expose whatever static planner functions seem
convenient".  That's not an approach that's available when doing actual
external development of an extension, and even if it were that doesn't
make it a good idea.  The larger the exposed surface of functions the
harder it is to know what's safe to change.

Anyway, on to specifics:

* Please drop the whole register_custom_provider/get_custom_provider API.
There is no reason other than debugging for a provider to have a name at
all, and if we expect providers to have unique names then that creates a
collision risk for independently-created extensions.  AFAICS, it's
sufficient for a function hooked into one of the add-a-path hooks to
include a pointer to a struct-of-function-pointers in the Path object it
returns, and similarly the CustomScan Plan object can contain a pointer
inserted when it's created.  I don't object to having a name field in the
function pointer structs for debugging reasons, but I don't want any
lookups being done on it.

* The function-struct pointers should be marked const in the referencing
nodes, to indicate that the core code won't be modifying or copying them.
In practice they'd probably be statically allocated constants in the
extensions anyway.

* The patch does lots of violence to the separation between planner and
executor, starting with the decision to include nodes/relation.h in
executor.h.  That will not do at all.  I see that you did that because you
wanted to make ExecSupportsMarkRestore take a Path, but we need some other
answer.  One slightly grotty answer is to invent two different customscan
Plan types, one that supports mark/restore and one that doesn't, so that
ExecSupportsMarkRestore could still just look at the Plan type tag.
(BTW, path->pathtype is supposed to contain the node tag of the Plan node
that the path would produce.  Putting T_CustomPath in it is entirely
tone-deaf.)  Another way would be to remove ExecSupportsMarkRestore in
favor of some new function in the planner; but it's good to keep it in
execAmi.c since that has other knowledge of which plan types support
mark/restore.
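
The one-node-tag-per-behavior idea would keep the execAmi.c logic trivial,
e.g. (the variant tag names are hypothetical):

    bool
    ExecSupportsMarkRestore(NodeTag plantype)
    {
        switch (plantype)
        {
            case T_CustomPlanMarkRestore:   /* variant with support */
                return true;
            case T_CustomPlan:              /* plain variant */
                return false;
            /* ... existing cases elided ... */
            default:
                return false;
        }
    }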

* More generally, I'm not convinced that exactly one Path type and exactly
one Plan type is going to get us very far.  It seems rather ugly to use
the same Plan type for both scan and join nodes, and what will happen if
somebody wants to build a custom Append node, or something else that has
neither zero nor two subplans?

* nodeCustom.h is being completely abused.  That should only export the
functions in nodeCustom.c, which are going to be pretty much one-liners
anyway.  The right place to put the function pointer struct definitions
is someplace else.  I'd be inclined to start by separating the function
pointers into two structs, one for functions needed for a Path and one for
functions needed for a Plan, so that you don't have this problem of having
to import everything the planner knows into an executor header or vice
versa.  Most likely you could just put the Path function pointer struct
declaration next to CustomPath in relation.h, and the one for Plans next
to CustomPlan (or the variants thereof) in plannodes.h.

* The set of fields provided in CustomScan seems nonsensical.  I'm not
even sure that it should be derived from Scan; that's okay for plan types
that actually are scans of a base relation, but it's confusing overhead
for anything that's say a join, or a custom sort node, or anything like
that.  Maybe one argument for multiple plan node types is that one would
be derived from Scan and one directly from Plan.

* More generally, what this definition for CustomScan exposes is that we
have no idea whatever what fields a custom plan node might need.  I'm
inclined to think that what we should be assuming is that any custom path
or plan node is really an object of a struct type known only to its
providing extension, whose first field is the CustomPath or CustomPlan
struct known to the core backend.  (Think C++ subclassing.)  This would
imply that copyfuncs/equalfuncs/outfuncs support would have to be provided
by the extension, which is in principle possible if we add function
pointers for those operations to the struct linked to from the path/plan
object.  (Notationally this might be a bit of a pain, since the macros
that we use in the functions in copyfuncs.c etc aren't public.  Not sure
if it's worth exposing those somewhere, or if people should just
copy/paste them.)  This 
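
To visualize the subclassing layout described above (the struct name and
fields are invented for illustration):

    /* The core backend sees only the leading CustomPlan; the extension
     * casts to reach its private fields. */
    typedef struct GpuScanPlan
    {
        CustomPlan  cplan;          /* must be the first field */
        List       *dev_quals;      /* extension-private state */
        int         gpu_device;
    } GpuScanPlan;

    static GpuScanPlan *
    _copyGpuScanPlan(const GpuScanPlan *from)
    {
        GpuScanPlan *newnode = palloc0(sizeof(GpuScanPlan));

        /* copy the common CustomPlan part, then the private fields */
        memcpy(&newnode->cplan, &from->cplan, sizeof(CustomPlan));
        newnode->dev_quals = copyObject(from->dev_quals);
        newnode->gpu_device = from->gpu_device;
        return newnode;
    }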

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Robert Haas
On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote:
 Alright- so do you feel that the simple ctidscan use-case is a
 sufficient justification and example of how this can be generally
 useful that we should be adding these hooks to core..?  I'm willing to
 work through the patch and clean it up this weekend if we agree that
 it's useful and unlikely to immediately be broken by expected changes..

Yeah, I think it's useful.  But based on Tom's concurrently-posted
review, I think there's probably a good deal of work left here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote:
  Alright- so do you feel that the simple ctidscan use-case is a
  sufficient justification and example of how this can be generally
  useful that we should be adding these hooks to core..?  I'm willing to
  work through the patch and clean it up this weekend if we agree that
  it's useful and unlikely to immediately be broken by expected changes..
 
 Yeah, I think it's useful.  But based on Tom's concurrently-posted
 review, I think there's probably a good deal of work left here.

Yeah, it certainly looks like it.

KaiGai- will you have time to go over and address Tom's concerns..?

Thanks!

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-04 Thread Kohei KaiGai
2014-03-05 5:52 GMT+09:00 Stephen Frost sfr...@snowman.net:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Tue, Mar 4, 2014 at 2:34 PM, Stephen Frost sfr...@snowman.net wrote:
  Alright- so do you feel that the simple ctidscan use-case is a
  sufficient justification and example of how this can be generally
  useful that we should be adding these hooks to core..?  I'm willing to
  work through the patch and clean it up this weekend if we agree that
  it's useful and unlikely to immediately be broken by expected changes..

 Yeah, I think it's useful.  But based on Tom's concurrently-posted
 review, I think there's probably a good deal of work left here.

 Yeah, it certainly looks like it.

 KaiGai- will you have time to go over and address Tom's concerns..?

Yes, I need to do. Let me take it through the later half of this week and
the weekend. So, I'd like to submit revised one by next Monday.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Robert Haas
On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 I don't see that parallelizing Append is any easier than any other
 problem in this space.  There's no parallel I/O facility, so you need
 a background worker per append branch to wait on I/O.  And you have
 all the problems of making sure that the workers have the same
 snapshot, making sure they don't self-deadlock, etc. that you have for
 any other case.

 Erm, my thought was to use a select() loop which sends out I/O requests
 and then loops around waiting to see who finishes it.  It doesn't
 parallelize the CPU cost of getting the rows back to the caller, but
 it'd at least parallelize the I/O, and if what's underneath is actually
 a remote FDW running a complex query (because the other side is actually
 a view), it would be a massive win to have all the remote FDWs executing
 concurrently instead of serially as we have today.

I can't really make sense of this.  In general, what's under each
branch of an append node is an arbitrary plan tree, and the only
operation you can count on being able to do for each is "get me the
next tuple" (i.e. ExecProcNode).  Append has no idea whether that
involves disk I/O or for what blocks.  But even if it did, there's no
standard API for issuing an asynchronous read(), which is how we get
blocks into shared buffers.  We do have an API for requesting the
prefetch of a block on platforms with posix_fadvise(), but can't find
out whether it's completed using select(), and even if you could you
still have to do the actual read() afterwards.

For FDWs, one idea might be to kick off the remote query at
ExecInitNode() time rather than ExecProcNode() time, at least if the
remote query doesn't depend on parameters that aren't available until
run time.  That actually would allow multiple remote queries to run
simultaneously or in parallel with local work.  It would also run them
in cases where the relevant plan node is never executed, which would
be bad but perhaps rare enough not to worry about.  Or we could add a
new API like ExecPrefetchNode() that tells nodes to prepare to have
tuples pulled, and they can do things like kick off asynchronous
queries.  But I still don't see any clean way for the Append node to
find out which one is ready to return results first.
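
Purely as a speculative sketch of that ExecPrefetchNode() idea -- no such
entry point or ps_prefetch field exists today:

    void
    ExecPrefetchNode(PlanState *node)
    {
        if (node == NULL)
            return;

        /* let the node kick off async work, if it knows how */
        if (node->ps_prefetch != NULL)
            node->ps_prefetch(node);

        /* recurse into both children */
        ExecPrefetchNode(outerPlanState(node));
        ExecPrefetchNode(innerPlanState(node));
    }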

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote:
  Erm, my thought was to use a select() loop which sends out I/O requests
  and then loops around waiting to see who finishes it.  It doesn't
  parallelize the CPU cost of getting the rows back to the caller, but
  it'd at least parallelize the I/O, and if what's underneath is actually
  a remote FDW running a complex query (because the other side is actually
  a view), it would be a massive win to have all the remote FDWs executing
  concurrently instead of serially as we have today.
 
 I can't really make sense of this.

Sorry, that was a bit hand-wavey since I had posted about it previously
here:
http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net

It'd clearly be more involved than just "build a select() loop" and
would require adding an async mechanism.  I had been thinking about this
primarily with the idea of FDWs, and you're right that it'd require more
thought to deal with getting data into/through shared_buffers.  Still,
since we seqscan into a ring buffer, I'd think we could make it work, but
it would require additional work.

 For FDWs, one idea might be to kick off the remote query at
 ExecInitNode() time rather than ExecProcNode() time, at least if the
 remote query doesn't depend on parameters that aren't available until
 run time.

Right, I had speculated about that also (option #2 in my earlier email).

 That actually would allow multiple remote queries to run
 simultaneously or in parallel with local work.  It would also run them
 in cases where the relevant plan node is never executed, which would
 be bad but perhaps rare enough not to worry about.  

This was my primary concern, along with the fact that we explicitly says
don't do that in the docs for the FDW API.

 Or we could add a
 new API like ExecPrefetchNode() that tells nodes to prepare to have
 tuples pulled, and they can do things like kick off asynchronous
 queries.  But I still don't see any clean way for the Append node to
 find out which one is ready to return results first.

Yeah, that's tricky.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Robert Haas
On Mon, Mar 3, 2014 at 10:43 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote:
  Erm, my thought was to use a select() loop which sends out I/O requests
  and then loops around waiting to see who finishes it.  It doesn't
  parallelize the CPU cost of getting the rows back to the caller, but
  it'd at least parallelize the I/O, and if what's underneath is actually
  a remote FDW running a complex query (because the other side is actually
  a view), it would be a massive win to have all the remote FDWs executing
  concurrently instead of serially as we have today.

 I can't really make sense of this.

 Sorry, that was a bit hand-wavey since I had posted about it previously
 here:
 http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net

Huh, somehow I can't remember reading that... but I didn't think I had
missed any posts, either.  Evidently I did.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
  http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net
 
 Huh, somehow I can't remember reading that... but I didn't think I had
 missed any posts, either.  Evidently I did.

You and everyone else- you'll note it got exactly zero responses..

Ah well. :)

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Stephen Frost
KaiGai,

* Stephen Frost (sfr...@snowman.net) wrote:
 And I'm still unconvinced of this approach and worry that it's going to
 break more often than it works.  That's my 2c on it, but I won't get in
 the way if someone else wants to step up and support it.

Alright, having heard from Robert (thanks!) regarding his thoughts
(which are pretty similar to my own, though he doesn't anticipate issues
with API changes), I'm going to step back a bit form the above position.

 As I mentioned
 up-thread, I'd really like to see FDW join push-down, FDW aggregate
 push-down, parallel query execution, and parallel remote-FDW execution
 and I don't see this CustomScan approach as the right answer to any of
 those.

In accordance with the above, what I'd like to see with this patch is
removal of the postgres_fdw changes and any changes which were for that
support.  In addition, I'd like to understand why 'ctidscan' makes any
sense to have as an example of what to use this for- if that's valuable,
why wouldn't we simply implement that in core?  I do want an example in
contrib of how to properly use this capability, but I don't think that's
it.

For one thing, an example where you could have this CustomScan node
calling other nodes underneath would be interesting.  I realize the CTID
scan can't do that directly but I would think your GPU-based system
could; after all, if you're running a join or an aggregate with the GPU,
the rows could come from nearly anything.  Have you considered that, or
is the expectation that users will just go off and access the heap
and/or whatever indexes directly, like ctidscan does?  How would such a
requirement be handled?

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Kouhei Kaigai
  As I mentioned
  up-thread, I'd really like to see FDW join push-down, FDW aggregate
  push-down, parallel query execution, and parallel remote-FDW execution
  and I don't see this CustomScan approach as the right answer to any of
  those.
 
 In accordance with the above, what I'd like to see with this patch is removal
 of the postgres_fdw changes and any changes which were for that support.
 
I don't argue against this approach. It might be useful to *demonstrate* how
a custom-scan node works as a replacement for a join, however, 

 In addition, I'd like to understand why 'ctidscan' makes any sense to have
 as an example of what to use this for- if that's valuable, why wouldn't
 we simply implement that in core?  I do want an example in contrib of how
 to properly use this capability, but I don't think that's it.
 
Do you think it would make sense if my submission were only the interface
portion, without a working example? The purpose of the ctidscan module is,
like postgres_fdw, to demonstrate the usage of the custom-scan interface at
a small enough code scale. If tons of example code were attached, nobody
would want to review the patch.
The cache_scan module that Haribabu and I are discussing in another
thread might also be a good demonstration of the custom-scan interface;
however, its code is a bit larger than ctidscan.

 For one thing, an example where you could have this CustomScan node calling
 other nodes underneath would be interesting.  I realize the CTID scan can't
 do that directly but I would think your GPU-based system could; after all,
 if you're running a join or an aggregate with the GPU, the rows could come
 from nearly anything.  Have you considered that, or is the expectation that
 users will just go off and access the heap and/or whatever indexes directly,
 like ctidscan does?  How would such a requirement be handled?
 
In case a custom-scan node has underlying nodes, it shall be invoked using
ExecProcNode as the built-in nodes are; it will then be able to fetch the
tuples coming from the underlying nodes. Of course, a custom-scan provider
can present tuples coming from somewhere else as if they came from an
underlying relation; that is the responsibility of the extension module. In
some cases, it shall be required to return junk system attributes, like
ctid, for row-level locks or table updating.
That is also the responsibility of the extension module (or it should not
add a custom path if the custom-scan provider cannot perform as required).
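
A minimal sketch of that pattern, assuming a provider-defined exec routine
with a single child plan (the state type is hypothetical):

    static TupleTableSlot *
    custom_exec_with_child(CustomPlanState *cps)
    {
        /* pull the next tuple from the underlying node, as the
         * built-in nodes do */
        TupleTableSlot *slot = ExecProcNode(outerPlanState(cps));

        if (TupIsNull(slot))
            return NULL;

        /* provider-specific processing of the tuple goes here */
        return slot;
    }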

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-03 Thread Ashutosh Bapat
On Mon, Mar 3, 2014 at 9:13 PM, Stephen Frost sfr...@snowman.net wrote:

 * Robert Haas (robertmh...@gmail.com) wrote:
  On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost sfr...@snowman.net
 wrote:
   Erm, my thought was to use a select() loop which sends out I/O requests
   and then loops around waiting to see who finishes it.  It doesn't
   parallelize the CPU cost of getting the rows back to the caller, but
   it'd at least parallelize the I/O, and if what's underneath is actually
   a remote FDW running a complex query (because the other side is
 actually
   a view), it would be a massive win to have all the remote FDWs
 executing
   concurrently instead of serially as we have today.
 
  I can't really make sense of this.

 Sorry, that was a bit hand-wavey since I had posted about it previously
 here:

 http://www.postgresql.org/message-id/20131104032604.gb2...@tamriel.snowman.net

 It'd clearly be more involved than just "build a select() loop" and
 would require adding an async mechanism.  I had been thinking about this
 primarily with the idea of FDWs, and you're right that it'd require more
 thought to deal with getting data into/through shared_buffers.  Still,
 since we seqscan into a ring buffer, I'd think we could make it work, but
 it would require additional work.

  For FDWs, one idea might be to kick off the remote query at
  ExecInitNode() time rather than ExecProcNode() time, at least if the
  remote query doesn't depend on parameters that aren't available until
  run time.

 Right, I had speculated about that also (option #2 in my earlier email).


During EXPLAIN, ExecInitNode() is called. If ExecInitNode() fires queries
to foreign servers, those would be fired while EXPLAINing a query as well.
We want to avoid that. Instead, we can run EXPLAIN on that query at the
foreign server. But again, not all foreign servers would be able to EXPLAIN
the query, e.g. file_fdw. Or we could totally avoid firing queries during
ExecInitNode() if it's for EXPLAIN (except for ANALYZE, maybe).


  That actually would allow multiple remote queries to run
  simultaneously or in parallel with local work.  It would also run them
  in cases where the relevant plan node is never executed, which would
  be bad but perhaps rare enough not to worry about.

 This was my primary concern, along with the fact that we explicitly says
 don't do that in the docs for the FDW API.

  Or we could add a
  new API like ExecPrefetchNode() that tells nodes to prepare to have
  tuples pulled, and they can do things like kick off asynchronous
  queries.  But I still don't see any clean way for the Append node to
  find out which one is ready to return results first.

 Yeah, that's tricky.

 Thanks,

 Stephen




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
KaiGai,

* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 BTW, this kind of discussion looks like talking with a ghost, because
 we cannot see the new interface for parallel execution
 right now, so we cannot tangibly investigate whether it becomes
 a really serious backward incompatibility or not.

Yeah, it would certainly be nice if we had all of the answers up-front.
What I keep hoping for is that someone who has been working on this area
(eg: Robert) would speak up...

 My bet is that it's a minor one. I cannot imagine a plan-node interface
 that does not support the existing non-parallel SeqScan or NestLoop and
 so on.

Sure you can- because once we change the interface, we're probably going
to go through and make everything use the new one rather than have to
special-case things.  That's more-or-less exactly my point here because
having an external hook like CustomScan would make that kind of
wholesale change more difficult.

That does *not* mean I'm against using GPUs and GPU optimizations.  What
it means is that I'd rather see that done in core, which would allow us
to simply change that interface along with the rest when doing wholesale
changes and not have to worry about backwards compatibility and breaking
other people's code.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-01 22:38 GMT+09:00 Stephen Frost sfr...@snowman.net:
 KaiGai,

 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 BTW, this kind of discussion looks like talking with a ghost, because
 we cannot see the new interface for parallel execution
 right now, so we cannot tangibly investigate whether it becomes
 a really serious backward incompatibility or not.

 Yeah, it would certainly be nice if we had all of the answers up-front.
 What I keep hoping for is that someone who has been working on this area
 (eg: Robert) would speak up...

I'd also like to see his opinion.

 My bet is that it's a minor one. I cannot imagine a plan-node interface
 that does not support the existing non-parallel SeqScan or NestLoop and
 so on.

 Sure you can- because once we change the interface, we're probably going
 to go through and make everything use the new one rather than have to
 special-case things.  That's more-or-less exactly my point here because
 having an external hook like CustomScan would make that kind of
 wholesale change more difficult.

I think we should follow the general rule in the case of custom scans also.
In other words, it's the responsibility of the extension's author to keep up
with the latest specification of the interfaces.
For example, I have an extension module that is unable to work on the
latest PG code because of interface changes to ProcessUtility_hook.
Is that a matter of backward incompatibility? Definitely not; keeping up
is my job.

 That does *not* mean I'm against using GPUs and GPU optimizations.  What
 it means is that I'd rather see that done in core, which would allow us
 to simply change that interface along with the rest when doing wholesale
 changes and not have to worry about backwards compatibility and breaking
 other people's code.

I also have to introduce some background from a previous discussion.
Right now we have two options for GPU programming: CUDA or OpenCL.
Both the libraries and the drivers are provided under proprietary licenses,
so they do not fit the core implementation of PostgreSQL, only
extensions installed on the user's own responsibility.
Because of that, I brought up a discussion about pluggable
planner/executor nodes (as a basis for GPU acceleration) at the last
developer meeting, and have since worked on the custom-scan node interface.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
KaiGai,

* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 Now we have two options for GPU programming: CUDA or OpenCL.
 Both the libraries and the drivers are provided under proprietary licenses,
 so they do not fit the core implementation of PostgreSQL, only
 extensions installed on the user's own responsibility.

Being able to work with libraries which are not BSD licensed doesn't
change the license under which PostgreSQL code is released.  Nor does it
require PostgreSQL to be licensed in any different way from how it is
today.  Where it would get a bit ugly, I agree, is for the packagers who
have to decide if they want to build against those libraries or not.  We
might be able to make things a bit easier by having a startup-time
determination of if these nodes are able to be used or not.  This isn't
unlike OpenSSL which certainly isn't BSD nor is it even GPL-compatible,
a situation which causes a great deal of difficulty already due to the
whole readline nonsense- but it's difficulty for the packagers, not for
the PostgreSQL project, per se.

 Because of that, I brought up a discussion about pluggable
 planner/executor nodes (as a basis for GPU acceleration) at the last
 developer meeting, and have since worked on the custom-scan node interface.

And I'm still unconvinced of this approach and worry that it's going to
break more often than it works.  That's my 2c on it, but I won't get in
the way if someone else wants to step up and support it.  As I mentioned
up-thread, I'd really like to see FDW join push-down, FDW aggregate
push-down, parallel query execution, and parallel remote-FDW execution
and I don't see this CustomScan approach as the right answer to any of
those.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 9:51 GMT+09:00 Stephen Frost sfr...@snowman.net:
 KaiGai,

 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 Now we have two options for GPU programming: CUDA or OpenCL.
 Both the libraries and the drivers are provided under proprietary licenses,
 so they do not fit the core implementation of PostgreSQL, only
 extensions installed on the user's own responsibility.

 Being able to work with libraries which are not BSD licensed doesn't
 change the license under which PostgreSQL code is released.  Nor does it
 require PostgreSQL to be licensed in any different way from how it is
 today.  Where it would get a bit ugly, I agree, is for the packagers who
 have to decide if they want to build against those libraries or not.  We
 might be able to make things a bit easier by having a startup-time
 determination of if these nodes are able to be used or not.  This isn't
 unlike OpenSSL which certainly isn't BSD nor is it even GPL-compatible,
 a situation which causes a great deal of difficulty already due to the
 whole readline nonsense- but it's difficulty for the packagers, not for
 the PostgreSQL project, per se.

As you mentioned, it is a headache for packagers, and it does not make
sense for us if the packager disables a feature that requires proprietary
drivers. In fact, Fedora / RHEL does not permit distributing software
under a non-open-source license. Obviously, NVIDIA's CUDA is a library
distributed under a proprietary license, and thus out of scope for the
Linux distributors. That also leads them to turn off features that must
be linked with proprietary drivers.
All we can do is implement these features as extensions, then offer
an option for users to use them or not.

 Because of that, I brought up a discussion about pluggable
 planner/executor nodes (as a basis for GPU acceleration) at the last
 developer meeting, and have since worked on the custom-scan node interface.

 And I'm still unconvinced of this approach and worry that it's going to
 break more often than it works.  That's my 2c on it, but I won't get in
 the way if someone else wants to step up and support it.  As I mentioned
 up-thread, I'd really like to see FDW join push-down, FDW aggregate
 push-down, parallel query execution, and parallel remote-FDW execution
 and I don't see this CustomScan approach as the right answer to any of
 those.

That's the right approach for FDW functionality enhancement; I never opposed
it.

What I'd like to implement is GPU acceleration that can run on
regular tables, not only foreign tables. Also, according to the backlog
from the developer meeting, a pluggable executor node is also a feature
the PG-XC folks require to align their project with upstream.
I think the custom-scan feature is capable of hosting these requirements,
and it does not prevent enhancing the FDW features.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 As you mentioned, it is a headache for packagers, and it does not make
 sense for us if the packager disables a feature that requires proprietary
 drivers.

No, I disagree with that.  I don't expect this use-case to be very
common to begin with and telling individuals that they have to compile
it themselves is certainly not out of the question.

 In fact, Fedora / RHEL does not permit distributing software
 under a non-open-source license.

I'm pretty confident you can get RPMs for those distributions.

 Obviously, NVIDIA's CUDA
 is a library distributed under a proprietary license, thus out of
 scope for the Linux distributors.

This also doesn't make any sense to me- certainly the CUDA libraries are
available under Debian derivatives, along with open-source wrappers for
them like pycuda.

 That leads them to turn off any
 feature that must be linked with proprietary drivers.
 All we can do is implement these features as an extension, then offer
 users the option to use them or not.

No, we can tell individuals who want it that they're going to need to
build with support for it.  We don't offer OpenSSL as an extension (I
certainly wish we did- and had a way to replace it w/ GNUTLS or one of
the better licensed options).

 What I'd like to implement is GPU acceleration that can run on
 regular tables, not only foreign tables. Also, according to the backlog
 of the developer meeting, a pluggable executor node is also a feature
 required by the PG-XC folks to bring their project in line with upstream.
 I think the custom-scan feature is capable of hosting these requirements,
 and it does not prevent enhancement of the FDW features.

I think you're conflating things again- while it might be possible to
use CustomScan to implement FDW join-pushdown or FDW aggregate-pushdown,
*I* don't think it's the right approach.  Regarding the PG-XC
requirement, I expect they're looking for FDW join/aggregate-pushdown
and also see that it *could* be done w/ CustomScan.

We could punt on the whole thing and drop in hooks which could be used
to replace everything done from the planner through to the executor and
then anyone *could* implement any of the above, and parallel query too.
That doesn't make it the right approach.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Wed, Feb 26, 2014 at 3:02 AM, Stephen Frost sfr...@snowman.net wrote:
 The custom-scan node is intended to work on regular relations, not
 only foreign tables. It means a special feature (like GPU acceleration)
 can work transparently for most existing applications. Usually, they
 define regular tables for their work at installation, not foreign
 tables. That is the biggest concern for me.

 The line between a foreign table and a local one is becoming blurred
 already, but still, if this is the goal then I really think the
 background worker is where you should be focused, not on this Custom
 Scan API.  Consider that, once we've got proper background workers,
 we're going to need new nodes which operate in parallel (or some other
 rejiggering of the nodes- I don't pretend to know exactly what Robert is
 thinking here, and I've apparently forgotten it if he's posted it
 somewhere) and those interfaces may drive changes which would impact the
 Custom Scan API- or worse, make us deprecate or regret having added it
 because now we'll need to break backwards compatibility to add in the
 parallel node capability to satisfy the more general non-GPU case.

This critique seems pretty odd to me.  I haven't had the time to look
at this patch set, but I don't see why anyone would want to use the
background worker facility for GPU acceleration, which is what
KaiGai's trying to accomplish here.  Surely you want, if possible, to
copy the data directly from the user backend into the GPU's working
memory.  What would the use of a background worker be?  We definitely
need background workers to make use of additional *CPUs*, but I don't
see what good they are in leveraging *GPUs*.

I seriously doubt there's any real conflict with parallelism here.
Parallelism may indeed add more ways to scan a relation
(ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
shouldn't have a Custom Scan node too.  Indeed, my principal concern
about this patch set isn't that it's too specialized, as you seem to
be worrying about, but that it's aiming to satisfy *too many* use
cases.  I think FDW join pushdown is a fairly different problem from
adding a custom scan type, and I doubt one patch should try to solve
both problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Wed, Feb 26, 2014 at 10:23 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 IIUC, his approach was integration of join-pushdown within the FDW APIs;
 however, it does not mean the idea of remote join was rejected.

 For my part, trying to consider doing remote joins *without* going
 through FDWs is just nonsensical.

That is, of course, true by definition, but I think it's putting the
focus in the wrong place.  It's possible that there are other cases
when a scan might be a plausible path for a joinrel even if there are no
foreign tables in play.  For example, you could cache the joinrel
output and then inject a cache scan as a path for the joinrel.
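
To sketch the idea (purely illustrative: this assumes a join-pathlist
hook of roughly this shape existed, and join_result_is_cached() and
create_cache_scan_path() are invented names):

#include "postgres.h"
#include "optimizer/pathnode.h"         /* add_path() */

/* Hypothetical helpers, not in PostgreSQL: */
extern bool join_result_is_cached(PlannerInfo *root, RelOptInfo *joinrel);
extern Path *create_cache_scan_path(PlannerInfo *root, RelOptInfo *joinrel);

static void
cache_scan_join_hook(PlannerInfo *root, RelOptInfo *joinrel,
                     RelOptInfo *outerrel, RelOptInfo *innerrel)
{
    Path   *cpath;

    /* Only offer the path when this join's output is already cached. */
    if (!join_result_is_cached(root, joinrel))
        return;

    /* Build a scan-like path whose cost reflects reading the cache. */
    cpath = create_cache_scan_path(root, joinrel);

    /* Let it compete on cost with the ordinary join paths. */
    add_path(joinrel, cpath);
}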

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Fri, Feb 28, 2014 at 10:36 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 * Stephen Frost (sfr...@snowman.net) wrote:
  I don't see how you can be when there hasn't been any discussion that I've
  seen about how parallel query execution is going to change things for us.
 
 If parallel query execution changes the whole structure of plan nodes,
 it will also affect the custom-scan interface, because it is a thin
 abstraction of a plan node. However, if the parallel execution feature is
 implemented as new plan nodes in addition to the existing ones, I cannot
 imagine a scenario that affects the structure of the other plan nodes.

 Let's just say that I have doubts that we'll be able to implement
 parallel execution *without* changing the plan node interface in some
 way which will require, hopefully minor, changes to all of the nodes.
 The issue is that even a minor change would break the custom-scan API
 and we'd immediately be in the boat of dealing with complaints regarding
 backwards compatibility.  Perhaps we can hand-wave that, and we've had
 some success changing hook APIs between major releases, but such changes
 may also be in ways which wouldn't break in obvious ways or even
 possibly be changes which have to be introduced into back-branches.
 Parallel query is going to be brand-new real soon and it's reasonable to
 think we'll need to make bug-fix changes to it after it's out which
 might even involve changes to the API which is developed for it.

For what it's worth, and I can't claim to have all the answers here,
this doesn't match my expectation.  I think we'll do two kinds of
parallelism.  One will be parallelism within nodes, like parallel sort
or parallel seqscan.  Any node we parallelize this way is likely to be
heavily rewritten, or else to get a sister that looks quite different
from the original.  The other kind of parallelism will involve pushing
a whole subtree of the plan into a different node.  In this case we'll
need to pass data between nodes in some different way (this was one of
the major reasons I designed the shm_mq stuff) but the nodes
themselves should change little if at all.
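
To illustrate the funneling, here is a minimal sketch against the shm_mq
API as it stands in 9.4 (worker startup, error handling, and detach
logic all omitted):

#include "postgres.h"
#include "storage/dsm.h"
#include "storage/proc.h"               /* MyProc */
#include "storage/shm_mq.h"

#define TUPLE_QUEUE_SIZE    (64 * 1024)

/* Leader: place a message queue in dynamic shared memory, attach to it. */
static shm_mq_handle *
setup_tuple_queue(dsm_segment **segp)
{
    dsm_segment *seg = dsm_create(TUPLE_QUEUE_SIZE);
    shm_mq     *mq = shm_mq_create(dsm_segment_address(seg),
                                   TUPLE_QUEUE_SIZE);

    shm_mq_set_receiver(mq, MyProc);    /* the leader will read */
    *segp = seg;                        /* the worker maps this segment
                                         * and calls shm_mq_set_sender() */
    return shm_mq_attach(mq, seg, NULL);
}

/* Leader: pull the next message (e.g. one serialized tuple). */
static bool
next_tuple_message(shm_mq_handle *mqh, Size *nbytes, void **data)
{
    return shm_mq_receive(mqh, nbytes, data, false) == SHM_MQ_SUCCESS;
}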

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Feb 26, 2014 at 3:02 AM, Stephen Frost sfr...@snowman.net wrote:
  The line between a foreign table and a local one is becoming blurred
  already, but still, if this is the goal then I really think the
  background worker is where you should be focused, not on this Custom
  Scan API.  Consider that, once we've got proper background workers,
  we're going to need new nodes which operate in parallel (or some other
  rejiggering of the nodes- I don't pretend to know exactly what Robert is
  thinking here, and I've apparently forgotten it if he's posted it
  somewhere) and those interfaces may drive changes which would impact the
  Custom Scan API- or worse, make us deprecate or regret having added it
  because now we'll need to break backwards compatibility to add in the
  parallel node capability to satisfy the more general non-GPU case.
 
 This critique seems pretty odd to me.  I haven't had the time to look
 at this patch set, but I don't see why anyone would want to use the
 background worker facility for GPU acceleration, which is what
 KaiGai's trying to accomplish here.

Eh, that didn't come out quite right.  I had intended it to be more
along the lines of "look at what Robert's doing".

I was trying to point out that parallel query execution is coming soon
thanks to the work on background workers, and that parallel query
execution might drive changes to the way nodes interact in the executor,
driving a need to change the API.  In other words, CustomScan could
easily end up being broken by that change and I'd rather we not have to
worry about such breakage.

 I seriously doubt there's any real conflict with parallelism here.
 Parallelism may indeed add more ways to scan a relation
 (ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
 shouldn't have a Custom Scan node too.

What about parallel execution through the tree itself, rather than just
at specific end nodes like SeqScan and IndexScan?  Or parallel
aggregates?  I agree that simple parallel SeqScan/IndexScan isn't going
to change any of the interfaces, but surely we're going for more than
that.  Indeed, I'm wishing that I had found more time to spend on just
a simple select-based Append node which could parallelize I/O across
tablespaces and FDWs underneath the Append.

 Indeed, my principal concern
 about this patch set isn't that it's too specialized, as you seem to
 be worrying about, but that it's aiming to satisfy *too many* use
 cases.  I think FDW join pushdown is a fairly different problem from
 adding a custom scan type, and I doubt one patch should try to solve
 both problems.

Yeah, I've voiced those same concerns later in the thread also,
specifically that this punts on nearly everything and expects the
implementor to figure it all out.  We should be able to do better wrt
FDW join-pushdown, etc.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Sat, Mar 1, 2014 at 8:49 PM, Stephen Frost sfr...@snowman.net wrote:
 This critique seems pretty odd to me.  I haven't had the time to look
 at this patch set, but I don't see why anyone would want to use the
 background worker facility for GPU acceleration, which is what
 KaiGai's trying to accomplish here.

 Eh, that didn't come out quite right.  I had intended it to be more
 along the lines of "look at what Robert's doing".

 I was trying to point out that parallel query execution is coming soon
 thanks to the work on background workers, and that parallel query
 execution might drive changes to the way nodes interact in the executor,
 driving a need to change the API.  In other words, CustomScan could
 easily end up being broken by that change and I'd rather we not have to
 worry about such breakage.

I think the relation is pretty tangential.  I'm worried about the
possibility that the Custom Scan API is broken *ab initio*, but I'm
not worried about a conflict with parallel query.

 I seriously doubt there's any real conflict with parallelism here.
 Parallelism may indeed add more ways to scan a relation
 (ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
 shouldn't have a Custom Scan node too.

 What about parallel execution through the tree itself, rather than just
 at specific end nodes like SeqScan and IndexScan?  Or parallel
 aggregates?  I agree that simple parallel SeqScan/IndexScan isn't going
 to change any of the interfaces, but surely we're going for more than
 that.  Indeed, I'm wishing that I had found more time to spend on just
 a simple select-based Append node which could parallelize I/O across
 tablespaces and FDWs underneath the Append.

Well, as I said in another recent reply that you probably got after
sending this, when you split between nodes, that mostly just has to do
with how you funnel the tuples from one node to another.  The nodes
themselves probably don't even need to know.  Or at least that's what
I'd hope.

I don't see that parallelizing Append is any easier than any other
problem in this space.  There's no parallel I/O facility, so you need
a background worker per append branch to wait on I/O.  And you have
all the problems of making sure that the workers have the same
snapshot, making sure they don't self-deadlock, etc. that you have for
any other case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 I don't see that parallelizing Append is any easier than any other
 problem in this space.  There's no parallel I/O facility, so you need
 a background worker per append branch to wait on I/O.  And you have
 all the problems of making sure that the workers have the same
 snapshot, making sure they don't self-deadlock, etc. that you have for
 any other case.

Erm, my thought was to use a select() loop which sends out I/O requests
and then loops around waiting to see who finishes it.  It doesn't
parallelize the CPU cost of getting the rows back to the caller, but
it'd at least parallelize the I/O, and if what's underneath is actually
a remote FDW running a complex query (because the other side is actually
a view), it would be a massive win to have all the remote FDWs executing
concurrently instead of serially as we have today.
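
Roughly this shape, as a sketch (fds[] stands in for the sockets of the
per-child connections, e.g. from PQsocket(); everything else is
hand-waved):

#include <sys/select.h>

/*
 * Fire off the queries on all children up front, then block until any
 * of them has data ready; returns the index of a ready child, or -1.
 */
static int
wait_for_any_child(int *fds, int nchildren)
{
    fd_set      rfds;
    int         i;
    int         maxfd = -1;

    FD_ZERO(&rfds);
    for (i = 0; i < nchildren; i++)
    {
        FD_SET(fds[i], &rfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
        return -1;              /* error (a real version handles EINTR) */

    for (i = 0; i < nchildren; i++)
    {
        if (FD_ISSET(fds[i], &rfds))
            return i;           /* caller pulls rows from this child */
    }
    return -1;
}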

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 10:29 GMT+09:00 Stephen Frost sfr...@snowman.net:
 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 As you mentioned, it is a headache for packagers, and the feature makes
 no sense for us if packagers disable anything that requires proprietary
 drivers.

 No, I disagree with that.  I don't expect this use-case to be very
 common to begin with and telling individuals that they have to compile
 it themselves is certainly not out of the question.

 In fact, Fedora / RHEL does not permit distributing software
 under a non-open-source license.

 I'm pretty confident you can get RPMs for those distributions.

 Obviously, NVIDIA's CUDA
 is a library distributed under a proprietary license, thus out of
 scope for the Linux distributors.

 This also doesn't make any sense to me- certainly the CUDA libraries are
 available under Debian derivatives, along with open-source wrappers for
 them like pycuda.

 That leads them to turn off any
 feature that must be linked with proprietary drivers.
 All we can do is implement these features as an extension, then offer
 users the option to use them or not.

 No, we can tell individuals who want it that they're going to need to
 build with support for it.  We don't offer OpenSSL as an extension (I
 certainly wish we did- and had a way to replace it w/ GNUTLS or one of
 the better licensed options).

I know there are some alternative ways. However, they require users to
acquire additional knowledge and make extra setup efforts, and they lose
the benefit of using the distributor's Linux if alternative RPMs are required.
I don't want to recommend such a complicated setup procedure to users.

 What I'd like to implement is GPU acceleration that can run on
 regular tables, not only foreign tables. Also, according to the backlog
 of the developer meeting, a pluggable executor node is also a feature
 required by the PG-XC folks to bring their project in line with upstream.
 I think the custom-scan feature is capable of hosting these requirements,
 and it does not prevent enhancement of the FDW features.

 I think you're conflating things again- while it might be possible to
 use CustomScan to implement FDW join-pushdown or FDW aggregate-pushdown,
 *I* don't think it's the right approach.  Regarding the PG-XC
 requirement, I expect they're looking for FDW join/aggregate-pushdown
 and also see that it *could* be done w/ CustomScan.

The reason why I submitted the part-3 patch (which enhances postgres_fdw
for remote join using custom-scan) is that it is an easy way to demonstrate
the replacement of a join by a special scan, with a minimal patch to be
reviewed. If we have another idea to demonstrate it, I don't insist on the
remote-join feature on foreign tables.
Regarding PG-XC, I don't know their exact needs because I didn't attend
the cluster meeting, but someone mentioned pluggable plan/exec nodes in
this context.

 We could punt on the whole thing and drop in hooks which could be used
 to replace everything done from the planner through to the executor and
 then anyone *could* implement any of the above, and parallel query too.
 That doesn't make it the right approach.

That is the problem I pointed out at the last developer meeting. Because
there is no way for an extension to enhance just a part of the planner /
executor logic, an extension has to replace the whole planner / executor
using hooks. That is painful for the authors of extensions.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 For what it's worth, and I can't claim to have all the answers here,
 this doesn't match my expectation.  I think we'll do two kinds of
 parallelism.  One will be parallelism within nodes, like parallel sort
 or parallel seqscan.  Any node we parallelize this way is likely to be
 heavily rewritten, or else to get a sister that looks quite different
 from the original.  

Sure.

 The other kind of parallelism will involve pushing
 a whole subtree of the plan into a different node.  In this case we'll
 need to pass data between nodes in some different way (this was one of
 the major reasons I designed the shm_mq stuff) but the nodes
 themselves should change little if at all.

It's that "some different way" of passing data between the nodes that
makes me worry, but I hope you're right and we won't actually need to
change the interfaces or the nodes very much.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 10:38 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Wed, Feb 26, 2014 at 10:23 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 IIUC, his approach was integration of join-pushdown within the FDW APIs;
 however, it does not mean the idea of remote join was rejected.

 For my part, trying to consider doing remote joins *without* going
 through FDWs is just nonsensical.

 That is, of course, true by definition, but I think it's putting the
 focus in the wrong place.  It's possible that there are other cases
 when a scan might be a plausible path for a joinrel even if there are no
 foreign tables in play.  For example, you could cache the joinrel
 output and then inject a cache scan as a path for the joinrel.

That might be a better idea to demonstrate the usage of the custom-scan
node than the (ad-hoc) enhancement of postgres_fdw.
As I have discussed in another thread, it is possible to switch a heap
reference to a cache reference on the fly, so that would be a possible
use case for the custom-scan node.

So, I'm inclined to drop the postgres_fdw portion of my submission
to focus on the custom-scan capability.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-28 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 * Stephen Frost (sfr...@snowman.net) wrote:
  I don't see how you can be when there hasn't been any discussion that I've
  seen about how parallel query execution is going to change things for us.
  
 If parallel query execution changes the whole structure of plan nodes,
 it will also affect the custom-scan interface, because it is a thin
 abstraction of a plan node. However, if the parallel execution feature is
 implemented as new plan nodes in addition to the existing ones, I cannot
 imagine a scenario that affects the structure of the other plan nodes.

Let's just say that I have doubts that we'll be able to implement
parallel execution *without* changing the plan node interface in some
way which will require, hopefully minor, changes to all of the nodes.
The issue is that even a minor change would break the custom-scan API
and we'd immediately be in the boat of dealing with complaints regarding
backwards compatibility.  Perhaps we can hand-wave that, and we've had
some success changing hook APIs between major releases, but such changes
may also be in ways which wouldn't break in obvious ways or even
possibly be changes which have to be introduced into back-branches.
Parallel query is going to be brand-new real soon and it's reasonable to
think we'll need to make bug-fix changes to it after it's out which
might even involve changes to the API which is developed for it.

  The issue here is that we're going to be expected to maintain an interface
  once we provide it and so that isn't something we should be doing lightly.
  Particularly when it's as involved as this kind of change is with what's
  going on in the backend where we are nearly 100% sure to be changing things
  in the next release or two.
  
 FDW APIs have also been revised several times in recent releases. If we
 could design the perfect interface from the beginning, that would be best,
 but it is usually hard to do.

Sure, but FDWs also have a *much* broader set of use-cases, in my view,
which is also why I was pushing for join-push-down to happen
there instead of having this kind of a hook interface, which I don't
think we'd want to directly expose as part of the 'official FDW API' as
it ends up putting all the work on the FDW with little aid, making it
terribly likely to end up with a bunch of duplicated code in the FDWs,
copied from the backend to deal with it, particularly for individuals
writing FDWs who aren't familiar with what PG already has.

 Also, the custom-scan interface is almost symmetric with the existing plan
 node structures, so its stability is relatively high, I think.

Perhaps it will come to pass that parallel query execution doesn't
require any changes to the plan node structure, but that's not the horse
that I'd bet on at this point.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-28 Thread Kohei KaiGai
2014-03-01 0:36 GMT+09:00 Stephen Frost sfr...@snowman.net:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 * Stephen Frost (sfr...@snowman.net) wrote:
  I don't see how you can be when there hasn't been any discussion that I've
  seen about how parallel query execution is going to change things for us.
 
 If parallel query execution changes the whole structure of plan nodes,
 it will also affect the custom-scan interface, because it is a thin
 abstraction of a plan node. However, if the parallel execution feature is
 implemented as new plan nodes in addition to the existing ones, I cannot
 imagine a scenario that affects the structure of the other plan nodes.

 Let's just say that I have doubts that we'll be able to implement
 parallel execution *without* changing the plan node interface in some
 way which will require, hopefully minor, changes to all of the nodes.
 The issue is that even a minor change would break the custom-scan API
 and we'd immediately be in the boat of dealing with complaints regarding
 backwards compatibility.  Perhaps we can hand-wave that, and we've had
 some success changing hook APIs between major releases, but such changes
 may also be in ways which wouldn't break in obvious ways or even
 possibly be changes which have to be introduced into back-branches.
 Parallel query is going to be brand-new real soon and it's reasonable to
 think we'll need to make bug-fix changes to it after it's out which
 might even involve changes to the API which is developed for it.

Even if we change the plan-node interface in a future release,
it should not be a change that makes the existing stuff impossible.
The custom-scan API is designed to provide an alternative way to scan
or join relations, in addition to the existing logic like SeqScan or
NestLoop. If such a change broke the plan-node interfaces so that the
existing stuff could not be implemented, it would be problematic for
everything, not only the custom-scan node. I don't think a change that
makes it impossible to implement the existing logic would be merged.
Most likely, the new parallel execution feature can work together with
the existing sequential logic and the custom-scan interface.

BTW, this kind of discussion looks like a talk with a ghost, because
we cannot see the new interface for parallel execution right now,
so we cannot tangibly investigate whether it will really become a
serious backward incompatibility or not.
My bet is on a minor one. I cannot imagine a plan-node interface that
does not support the existing non-parallel SeqScan, NestLoop, and so on.

  The issue here is that we're going to be expected to maintain an interface
  once we provide it and so that isn't something we should be doing lightly.
  Particularly when it's as involved as this kind of change is with what's
  going on in the backend where we are nearly 100% sure to be changing things
  in the next release or two.
 
 FDW APIs have also been revised several times in recent releases. If we
 could design the perfect interface from the beginning, that would be best,
 but it is usually hard to do.

 Sure, but FDWs also have a *much* broader set of use-cases, in my view,
 which is also why I was pushing for join-push-down to happen
 there instead of having this kind of a hook interface, which I don't
 think we'd want to directly expose as part of the 'official FDW API' as
 it ends up putting all the work on the FDW with little aid, making it
 terribly likely to end up with a bunch of duplicated code in the FDWs,
 copied from the backend to deal with it, particularly for individuals
 writing FDWs who aren't familiar with what PG already has.

It might not have been a good idea to use postgres_fdw as the basis of a
proof of concept to demonstrate that custom-scan can effectively host
an alternative way to join, fetching the result set of a remote join
as if it were a relation scan, even though it demonstrated that this
is possible. So, I don't mind the part-3 portion (remote join of
postgres_fdw on top of custom-scan) being dropped from the submission.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Just to come back to this- the other two contrib module patches, at least
  as I read over their initial submission, were *also* patching portions of
  backend code which it was apparently discovered that they needed.  That's
  a good bit of my complaint regarding this approach.
  
 ?? Sorry, are you still negative on the portions of the backend patched
 by the part-2 and part-3 patches??

Pretty sure that I sent that prior to your last email, or at least
before I had gotten to the end of it.

  If you're looking to just use GPU acceleration for improving individual
  queries, I would think that Robert's work around backend workers would be
  a more appropriate way to go, with the ability to move a working set of
  data from shared buffers and on-disk representation of a relation over to
  the GPU's memory, perform the operation, and then copy the results back.
 
 The approach is similar to Robert's work, except that it adopts GPUs
 instead of multicore CPUs. So, I tried to review his work to apply
 its facilities to my extension as well.

Good, I'd be very curious to hear how that might solve the issue for
you, instead of using the CustomScan approach.

  regular PG tables, just to point out one issue, can be locked on a
  row-by-row basis, and we know exactly where in shared buffers to go hunt
  down the rows.  How is that going to work here, if this is both a regular
  table and stored off in a GPU's memory across subsequent queries or even
  transactions?
  
 It shall be handled on a case-by-case basis, I think. If a row-level lock
 is required over the table scan, the custom-scan node shall return a tuple
 located in the shared buffer, instead of the cached tuples. Of course,
 it is an option for the custom-scan node to evaluate qualifiers on the GPU
 with cached data and return tuples identified by the ctid of the cached
 tuples. Anyway, it is not a significant problem.

I think you're being a bit too hand-wavey here, but if we're talking
about pre-scanning the data using PG before sending it to the GPU and
then only performing a single statement on the GPU, we should be able to
deal with it.  I'm worried about your ideas to try and cache things on
the GPU though, if you're not prepared to deal with locks happening in
shared memory on the rows you've got cached out on the GPU, or hint
bits, or the visibility map being updated, etc...

 OK, I'll move the portion that will be needed commonly for other FDWs into
 the backend code.

Alright- but realize that there may be objections there on the basis
that the code/structures which you're exposing aren't, and will not be,
stable.  I'll have to go back and look at them myself, certainly, and
their history.

 Yes. According to the previous discussion around postgres_fdw getting
 merged, all we can trust on the remote side are built-in data types,
 functions, operators, and other built-in objects only.

Well, we're going to need to expand that a bit for aggregates, I'm
afraid, but we should be able to define the API for those aggregates
very tightly based on what PG does today and require that any FDW
purporting to provides those aggregates do it the way PG does.  Note
that this doesn't solve all the problems- we've got other issues with
regard to pushing aggregates down into FDWs that need to be solved.

 The custom-scan node is intended to work on regular relations, not
 only foreign tables. It means a special feature (like GPU acceleration)
 can work transparently for most existing applications. Usually, they
 define regular tables for their work at installation, not foreign
 tables. That is the biggest concern for me.

The line between a foreign table and a local one is becoming blurred
already, but still, if this is the goal then I really think the
background worker is where you should be focused, not on this Custom
Scan API.  Consider that, once we've got proper background workers,
we're going to need new nodes which operate in parallel (or some other
rejiggering of the nodes- I don't pretend to know exactly what Robert is
thinking here, and I've apparently forgotten it if he's posted it
somewhere) and those interfaces may drive changes which would impact the
Custom Scan API- or worse, make us deprecate or regret having added it
because now we'll need to break backwards compatibility to add in the
parallel node capability to satisfy the more general non-GPU case.

 My wording may have been off. Anyway, I want plan nodes that enable
 extensions to define their behavior; it's similar to ForeignScan, but
 allowed to run on regular relations. Also, since every plan node, not only
 custom-scan and foreign-scan, works according to an interface so as to
 cooperate with other nodes, it is not strange that the two interfaces are
 similar.

It sounds a lot like you're trying to define, external to PG, what
Robert is already trying to get going *internal* to PG, and I really
don't want to end up in a situation where we've got a solution for the
uncommon 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Shigeru Hanada
2014-02-26 16:46 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
 Just to come back to this- the other two contrib module patches, at least
 as I read over their initial submission, were *also* patching portions of
 backend code which it was apparently discovered that they needed.  That's
 a good bit of my complaint regarding this approach.

 ?? Sorry, are you still negative on the portions of the backend patched
 by the part-2 and part-3 patches??

Perhaps he meant separating the patches on a feature basis.  IMO
if exposing utilities is essential for the Custom Scan API in a practical
sense, IOW to implement and maintain an extension which implements
the Custom Scan API, they should go into the first patch.  IIUC the two
contrib modules are also PoCs for the API, so the part-2/3 patches should
contain only changes against contrib and its documentation.

Besides that, some typo fixes are mixed into the part-2 patch.  They should
go into the part-1 patch, where the typos were introduced.

-- 
Shigeru HANADA


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Stephen Frost
* Shigeru Hanada (shigeru.han...@gmail.com) wrote:
 Perhaps he meant separating the patches on a feature basis.  IMO
 if exposing utilities is essential for the Custom Scan API in a practical
 sense, IOW to implement and maintain an extension which implements
 the Custom Scan API, they should go into the first patch.  IIUC the two
 contrib modules are also PoCs for the API, so the part-2/3 patches should
 contain only changes against contrib and its documentation.

That's what I was getting at, yes.

 Besides that, some typo fixes are mixed into the part-2 patch.  They should
 go into the part-1 patch, where the typos were introduced.

Agreed.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  This regular one means usual tables. Even though a custom implementation
  may reference a self-managed in-memory cache instead of the raw heap, the
  table pointed to in the user's query shall be a usual table.
  In the past, Hanada-san had proposed an enhancement of FDW to support
  remote join, but it was eventually rejected.
 
 I'm not aware of the specifics around that proposal but I don't believe
 we, as a community, have decided to reject the idea in general.
 
IIUC, his approach was integration of join-pushdown within the FDW APIs;
however, it does not mean the idea of remote join was rejected.
I believe it is still one of our killer features if we can revise the
implementation.

Hanada-san, could you explain why your proposal was rejected
before?

  I thought these functions were useful to have in the backend commonly,
  but this is not a fundamental functionality gap in the custom-scan
  interface.
 
 Then perhaps they should be exposed more directly?  I can understand
 generally useful functionality being exposed in a way that anyone can use
 it, but we need to avoid interfaces which can't be stable due to normal
 / ongoing changes to the backend code.
 
The functions my patches want to expose are:
 - get_restriction_qual_cost()
 - fix_expr_common()

And the new functions my patches want to add are:
 - bms_to_string()
 - bms_from_string()

The first two functions are defined as static because cost estimation
is done in costsize.c and set-reference processing in setrefs.c; however,
custom-scan breaks this assumption, so I made them public.
They are useful everywhere, but each currently lives in one particular file.
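
For example, bms_to_string() could be as simple as the sketch below;
this is only an illustration of the intent, not the actual patch:

#include "postgres.h"
#include "lib/stringinfo.h"
#include "nodes/bitmapset.h"

/* Sketch: serialize a Bitmapset as a comma-separated member list. */
char *
bms_to_string(const Bitmapset *bms)
{
    StringInfoData buf;
    Bitmapset  *tmp = bms_copy(bms);    /* bms_first_member is destructive */
    int         x;

    initStringInfo(&buf);
    while ((x = bms_first_member(tmp)) >= 0)
        appendStringInfo(&buf, "%s%d", buf.len > 0 ? "," : "", x);
    bms_free(tmp);

    return buf.data;    /* bms_from_string() would parse this back */
}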

  I can also understand the usefulness of pushing joins or aggregation to
  the remote side in the case of foreign table references. In a similar way,
  it is also useful if we can push these CPU-intensive operations into
  co-processors for regular table references.
 
 That's fine, if we can get data to and from those co-processors efficiently
 enough that it's worth doing so.  If moving the data to the GPU's memory
 will take longer than running the actual aggregation, then it doesn't make
 any sense for regular tables because then we'd have to cache the data in
 the GPU's memory in some way across multiple queries, which isn't something
 we're set up to do.
 
When I made a prototype implementation on top of FDW using CUDA, it ran
a sequential scan 10 times faster than SeqScan on regular tables, if the
qualifiers were complex enough.
The libraries for communicating with GPUs (OpenCL/CUDA) have an
asynchronous data transfer mode using hardware DMA. It allows the cost of
data transfer to be hidden by pipelining, if there are enough records to
be transferred.
Also, the recent trend in semiconductor devices is GPU integration with
the CPU, sharing a common memory space. See Intel's Haswell, AMD's Kaveri,
or NVIDIA's Tegra K1. All of them share the same memory, so there is no
need to transfer the data to be calculated. This trend is driven by
physical law, because of semiconductor energy consumption. So, I'm
optimistic about my idea.
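
The pipelining pattern is roughly as follows, using the CUDA runtime
API; run_qual_kernel() is only a placeholder for the real
qualifier-evaluation kernel launch:

#include <string.h>
#include <cuda_runtime.h>

#define NSTREAMS    4
#define CHUNK_SIZE  (4 * 1024 * 1024)

/* Placeholder for the asynchronous kernel launch on the given stream. */
extern void run_qual_kernel(char *dbuf, size_t len, cudaStream_t stream);

/* Overlap host->device DMA with kernel execution on multiple streams. */
void
scan_with_pipelining(const char *src, size_t total)
{
    cudaStream_t stream[NSTREAMS];
    char   *hbuf[NSTREAMS];     /* pinned host buffers (DMA-capable) */
    char   *dbuf[NSTREAMS];     /* device buffers */
    size_t  off;
    int     i;

    for (i = 0; i < NSTREAMS; i++)
    {
        cudaStreamCreate(&stream[i]);
        cudaHostAlloc((void **) &hbuf[i], CHUNK_SIZE, cudaHostAllocDefault);
        cudaMalloc((void **) &dbuf[i], CHUNK_SIZE);
    }

    for (off = 0, i = 0; off < total;
         off += CHUNK_SIZE, i = (i + 1) % NSTREAMS)
    {
        size_t  len = (total - off < CHUNK_SIZE) ? total - off : CHUNK_SIZE;

        cudaStreamSynchronize(stream[i]);   /* buffer free for reuse? */
        memcpy(hbuf[i], src + off, len);    /* stage chunk in pinned RAM */
        cudaMemcpyAsync(dbuf[i], hbuf[i], len,
                        cudaMemcpyHostToDevice, stream[i]);
        run_qual_kernel(dbuf[i], len, stream[i]);   /* same stream: async */
    }

    for (i = 0; i < NSTREAMS; i++)
        cudaStreamSynchronize(stream[i]);
}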

  As I mentioned above, the backend changes in the part-2/-3 patches are
  just minor stuff, and I thought they should not be implemented locally
  in a contrib module.
 
 Fine- then propose them as generally useful additions, not as patches which
 are supposed to just be for contrib modules using an already defined
 interface.  If you can make a case for that then perhaps this is more
 practical.
 
The need was found by the contrib module, which wants to call static
functions, and by the capability to translate an existing data structure
to/from a cstring. But, anyway, does a separate patch make sense?

  No. What I want to implement is: read the regular table, transfer
  the contents into the GPU's local memory for calculation, then receive
  the calculation result. The in-memory cache (which I'm also working on)
  is supplemental, because disk access is much slower and a row-oriented
  data structure is not suitable for SIMD-style instructions.
 
 Is that actually performant?  Is it actually faster than processing the
 data directly?  The discussions that I've had with folks have cast a great
 deal of doubt in my mind about just how well that kind of quick turn-around
 to the GPU's memory actually works.
 
See above.

   This really strikes me as the wrong approach for an FDW
   join-pushdown API, which should be geared around giving the remote
   side an opportunity on a case-by-case basis to cost out joins using
   whatever methods it has available to implement them.  I've outlined
   above the reasons I don't agree with just making the entire
 planner/optimizer pluggable.
  
  I'm also inclined to have arguments that will provide enough
  information for extensions to determine the best path for them.
 
 For join push-down, I proposed above that we have an interface to the FDW
 which allows us to ask it how much each join of the tables which are on
 a given FDW's server would cost if 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Kouhei Kaigai
   If you're looking to just use GPU acceleration for improving
   individual queries, I would think that Robert's work around backend
   workers would be a more appropriate way to go, with the ability to
   move a working set of data from shared buffers and on-disk
   representation of a relation over to the GPU's memory, perform the
 operation, and then copy the results back.
  
  The approach is similar to Robert's work, except that it adopts GPUs
  instead of multicore CPUs. So, I tried to review his work to apply
  its facilities to my extension as well.
 
 Good, I'd be very curious to hear how that might solve the issue for you,
 instead of using the CustomScan approach.
 
I (plan to) use custom-scan, of course. Once a relation is referenced
and the optimizer has decided GPU acceleration is cheaper, the associated
custom-scan node reads the data from the underlying relation (or the
in-memory cache, if it exists), then moves it to the shared memory buffer
to deliver it to the GPU management background worker, which launches
asynchronous DMA transfers one by one.
After that, the custom-scan node receives the filtered records via the
shared-memory buffer, so it can construct the tuples to be returned to
the upper node.
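
In pseudo-C, the executor-side loop I have in mind looks like the sketch
below; GpuScanState and all of the gpu_*/load_* helpers are invented
names, since the actual interface is still being designed:

typedef struct GpuScanState
{
    TupleTableSlot *scanslot;   /* slot returned to the upper node */
    /* ... queues shared with the GPU management worker ... */
} GpuScanState;

static TupleTableSlot *
gpuscan_next(GpuScanState *gss)
{
    for (;;)
    {
        /* If the GPU worker has sent back filtered rows, consume one. */
        if (gpu_pop_result(gss, gss->scanslot))
            return gss->scanslot;

        /* Otherwise read another chunk from the heap (or cache)... */
        if (!load_next_chunk(gss))
            return NULL;        /* heap exhausted and queues drained */

        /* ...and hand it to the background worker for asynchronous DMA. */
        gpu_push_chunk(gss);
    }
}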

   regular PG tables, just to point out one issue, can be locked on a
   row-by-row basis, and we know exactly where in shared buffers to go
   hunt down the rows.  How is that going to work here, if this is both
 a regular
   table and stored off in a GPU's memory across subsequent queries or
   even transactions?
  
  It shall be handled on a case-by-case basis, I think. If a row-level lock
  is required over the table scan, the custom-scan node shall return a tuple
  located in the shared buffer, instead of the cached tuples. Of
  course, it is an option for the custom-scan node to evaluate qualifiers
  on the GPU with cached data and return tuples identified by the ctid of
  the cached tuples.
  Anyway, it is not a significant problem.
 
 I think you're being a bit too hand-wavey here, but if we're talking about
 pre-scanning the data using PG before sending it to the GPU and then only
 performing a single statement on the GPU, we should be able to deal with
 it.
It's what I want to implement.

 I'm worried about your ideas to try and cache things on the GPU though,
 if you're not prepared to deal with locks happening in shared memory on
 the rows you've got cached out on the GPU, or hint bits, or the visibility
 map being updated, etc...
 
It does not retain any state/information on the GPU side. Things related
to PG internals are the CPU's job.

  OK, I'll move the portion that will be needed commonly for other FDWs
  into the backend code.
 
 Alright- but realize that there may be objections there on the basis that
 the code/structures which you're exposing aren't, and will not be, stable.
 I'll have to go back and look at them myself, certainly, and their history.
 
I see, but that is part of the process of getting the code merged.

  Yes. According to the previous discussion around postgres_fdw getting
  merged, all we can trust on the remote side are built-in data types,
  functions, operators, and other built-in objects only.
 
 Well, we're going to need to expand that a bit for aggregates, I'm afraid,
 but we should be able to define the API for those aggregates very tightly
 based on what PG does today and require that any FDW purporting to provides
 those aggregates do it the way PG does.  Note that this doesn't solve all
 the problems- we've got other issues with regard to pushing aggregates down
 into FDWs that need to be solved.
 
I see. It probably needs more detailed investigation.

  The custom-scan node is intended to work on regular relations, not
  only foreign tables. It means a special feature (like GPU
  acceleration) can work transparently for most existing
  applications. Usually, they define regular tables for their work at
  installation, not foreign tables. That is the biggest concern for me.
 
 The line between a foreign table and a local one is becoming blurred already,
 but still, if this is the goal then I really think the background worker
 is where you should be focused, not on this Custom Scan API.  Consider that,
 once we've got proper background workers, we're going to need new nodes
 which operate in parallel (or some other rejiggering of the nodes- I don't
 pretend to know exactly what Robert is thinking here, and I've apparently
 forgotten it if he's posted it
 somewhere) and those interfaces may drive changes which would impact the
 Custom Scan API- or worse, make us deprecate or regret having added it
 because now we'll need to break backwards compatibility to add in the
 parallel node capability to satisfy the more general non-GPU case.
 
The custom-scan API is a thin abstraction of the plan node interface,
not tightly coupled to a particular use case like GPU, remote join,
and so on. So, I'm quite optimistic about its future maintainability.
Also, please recall the discussion at the last developer meeting.
The purpose of custom-scan 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Shigeru Hanada
2014-02-26 17:31 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
 IIUC, his approach was integration of join-pushdown within the FDW APIs;
 however, it does not mean the idea of remote join was rejected.
 I believe it is still one of our killer features if we can revise the
 implementation.

 Hanada-san, could you explain why your proposal was rejected
 before?

IIUC it was not rejected, just returned-with-feedback.  We could not
reach consensus about how join-push-down should work.  A discussion point
was multiple paths for a joinrel, but it was not such a serious point.
Here is the tail of the thread.

http://www.postgresql.org/message-id/4f058241.2000...@enterprisedb.com

 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 Hmm, so you're saying that the FDW function needs to be able to return
 multiple paths for a single joinrel. Fair enough, and that's not
 specific to remote joins. Even a single-table foreign scan could be
 implemented differently depending on whether you prefer fast-start or
 cheapest total.


 ... or ordered vs unordered, etc.  Yeah, good point, we already got this
 wrong with the PlanForeignScan API.  Good thing we didn't promise that
 would be stable.


 This discussion withered down here...

 I think the advice to Shigeru-san is to work on the API. We didn't reach a
 consensus on what exactly it should look like, but at least you need to be
 able to return multiple paths for a single joinrel, and should look at
 fixing the PlanForeignScan API to allow that too.

And I gave up for lack of time, IOW in order to finish more fundamental
portions of the FDW API.

http://www.postgresql.org/message-id/4f39fc1a.7090...@gmail.com

-- 
Shigeru HANADA


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 IIUC, his approach was integration of join-pushdown within the FDW APIs;
 however, it does not mean the idea of remote join was rejected.

For my part, trying to consider doing remote joins *without* going
through FDWs is just nonsensical.  What are you joining remotely if not
two foreign tables?  With regard to the GPU approach, if that model
works whereby the normal PG tuples are read off disk, fed over to the
GPU, processed, then returned back to the user through PG, then I
wouldn't consider it really a 'remote' join but rather simply a new
execution node inside of PG which is planned and costed just like the
others.  We've been over the discussion already about trying to make
that a pluggable system but the, very reasonable, push-back on that has
been if it's really possible and really makes sense to be pluggable.  It
certainly doesn't *have* to be- PostgreSQL is written in C, as we all
know, and plenty of C code talks to GPUs and shuffles memory around- and
that's almost exactly what Robert is working on supporting with regular
CPUs and PG backends already.

In many ways, trying to conflate this idea of using-GPUs-to-do-work with
the idea of remote-FDW-joins has really disillusioned me with regard to
the CustomScan approach.

  Then perhaps they should be exposed more directly?  I can understand
  generally useful functionality being exposed in a way that anyone can use
  it, but we need to avoid interfaces which can't be stable due to normal
  / ongoing changes to the backend code.
  
 The functions my patches want to expose are:
  - get_restriction_qual_cost()
  - fix_expr_common()

I'll try and find time to go look at these in more detail later this
week.  I have reservations about exposing the current estimates on costs
as we may want to adjust them in the future- but such adjustments may
need to be made in balance with other changes throughout the system and
an external module which depends on one result from the qual costing
might end up having problems with the costing changes because the
extension author wasn't aware of the other changes happening in other
areas of the costing.

I'm talking about this from a beyond-just-the-GUCs point of view, I
realize that the extension author could go look at the GUC settings, but
it's entirely reasonable to believe we'll make changes to the default
GUC settings along with how they're used in the future.

 And the new functions my patches want to add are:
  - bms_to_string()
  - bms_from_string()

Offhand, these look fine, if there's really an external use for them.
Will try to look at them in more detail later.

  That's fine, if we can get data to and from those co-processors efficiently
  enough that it's worth doing so.  If moving the data to the GPU's memory
  will take longer than running the actual aggregation, then it doesn't make
  any sense for regular tables because then we'd have to cache the data in
  the GPU's memory in some way across multiple queries, which isn't something
  we're set up to do.
  
 When I made a prototype implementation on top of FDW using CUDA, it ran
 a sequential scan 10 times faster than SeqScan on regular tables, if the
 qualifiers were complex enough.
 The libraries for communicating with GPUs (OpenCL/CUDA) have an
 asynchronous data transfer mode using hardware DMA. It allows the cost of
 data transfer to be hidden by pipelining, if there are enough records to
 be transferred.

That sounds very interesting and certainly figuring out the costing to
support that model will be tricky.  Also, shuffling the data around in
that way will also be interesting.  It strikes me that it'll be made
more difficult if we're trying to do it through the limitations of a
pre-defined API between the core code and an extension.

 Also, the recent trend in semiconductor devices is GPU integration with
 the CPU, sharing a common memory space. See Intel's Haswell, AMD's Kaveri,
 or NVIDIA's Tegra K1. All of them share the same memory, so there is no
 need to transfer the data to be calculated. This trend is driven by
 physical law, because of semiconductor energy consumption.

And this just makes me wonder why the focus isn't on the background
worker approach instead of trying to do this all in an extension.

 The need was found by the contrib module, which wants to call static
 functions, and by the capability to translate an existing data structure
 to/from a cstring. But, anyway, does a separate patch make sense?

I haven't had a chance to go back and look into the functions in detail,
but offhand I'd say the bms ones are probably fine while the others
would need more research as to if they make sense to expose to an
extension.

 Hmm... It seems to me we should follow the existing manner of constructing
 join paths, rather than special handling. Even if a query contains three or
 more foreign tables managed by the same server, they shall be consolidated
 into one remote join as long as its cost is less than 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 I (plan to) use custom-scan, of course. Once a relation is referenced
 and the optimizer has decided GPU acceleration is cheaper, the associated
 custom-scan node reads the data from the underlying relation (or the
 in-memory cache, if it exists), then moves it to the shared memory buffer
 to deliver it to the GPU management background worker, which launches
 asynchronous DMA transfers one by one.
 After that, the custom-scan node receives the filtered records via the
 shared-memory buffer, so it can construct the tuples to be returned to
 the upper node.

Alright- but have you discussed this with Robert?  We're going to be
whacking things around for parallel support with new nodes and more
built-in helper functionality for doing this work and I'm not anxious to
have CustomScan end up being a legacy interface that we're required to
pull forward because we accepted it before things had settled.

  I'm worried about your ideas to try and cache things on the GPU though,
  if you're not prepared to deal with locks happening in shared memory on
  the rows you've got cached out on the GPU, or hint bits, or the visibility
  map being updated, etc...
  
 It does not retain any state/information on the GPU side. Things related
 to PG internals are the CPU's job.

Right, good, I'm glad to hear that this approach is for doing things at
only a individual statement level and it's good to know that it can be
performant at that level now.

  Well, we're going to need to expand that a bit for aggregates, I'm afraid,
  but we should be able to define the API for those aggregates very tightly
  based on what PG does today and require that any FDW purporting to provides
  those aggregates do it the way PG does.  Note that this doesn't solve all
  the problems- we've got other issues with regard to pushing aggregates down
  into FDWs that need to be solved.
  
 I see. It probably needs more detailed investigation.

These issues will hopefully not be a problem (or at least, one that can
be worked around) for non-FDW implementations which are part of core and
implemented in a similar way to the existing aggregates, where the
scan node could continue to be a simple SeqScan as it is today.

 The custom-scan API is a thin abstraction of the plan node interface,
 not tightly coupled to a particular use case like GPU, remote join,
 and so on. So, I'm quite optimistic about its future maintainability.

I don't see how you can be when there hasn't been any discussion that
I've seen about how parallel query execution is going to change things
for us.

 Also, please recall the discussion at the last developer meeting.
 The purpose of custom-scan (we didn't name it at that time) is to avoid
 unnecessary project branches for people who want to implement their own
 special features when no facilities to enhance the optimizer/executor
 are provided.
 Even though we will have an in-core parallel execution feature using
 CPUs, it also makes sense to allow some unique implementation that may
 be suitable for a specific domain.

The issue here is that we're going to be expected to maintain an
interface once we provide it and so that isn't something we should be
doing lightly.  Particularly when it's as involved as this kind of
change is with what's going on in the backend where we are nearly 100%
sure to be changing things in the next release or two.

Thanks,

Stephen


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  IIUC, his approach was integration of join-pushdown within the FDW APIs;
  however, it does not mean the idea of remote join was rejected.
 
 For my part, trying to consider doing remote joins *without* going through
 FDWs is just nonsensical.  What are you joining remotely if not two foreign
 tables?

It is a case to be joined locally. If a query has two foreign tables managed
by the same server, this pair shall be found while the optimizer tries
various possible combinations.

 With regard to the GPU approach, if that model works whereby the
 normal PG tuples are read off disk, fed over to the GPU, processed, then
 returned back to the user through PG, then I wouldn't consider it really
 a 'remote' join but rather simply a new execution node inside of PG which
 is planned and costed just like the others.  We've been over the discussion
 already about trying to make that a pluggable system but the, very reasonable,
 push-back on that has been if it's really possible and really makes sense
 to be pluggable.  It certainly doesn't *have* to be- PostgreSQL is written
 in C, as we all know, and plenty of C code talks to GPUs and shuffles memory
 around- and that's almost exactly what Robert is working on supporting with
 regular CPUs and PG backends already.
 
 In many ways, trying to conflate this idea of using-GPUs-to-do-work with
 the idea of remote-FDW-joins has really disillusioned me with regard to
 the CustomScan approach.
 
Are you suggesting I focus on the GPU stuff, rather than killing two birds
with one stone? That may be one approach; however, the two use cases have a
common part, because the plan node for a remote join pops tuples up towards
its upper node. From the viewpoint of the upper node, it looks like a black
box that returns tuples produced by joining two underlying relations. A node
that scans or joins the underlying relations with GPU assistance is just
another such black box. Neither implementation detail is visible to the
upper node, but the external interface is common. The custom-scan node can
provide a pluggable way for both use cases.
Anyway, I'm not motivated by the remote-join feature any more than by the
GPU-acceleration stuff. If it is better to drop FDW's remote-join support
from the custom-scan scope, I won't insist on it.

   Then perhaps they should be exposed more directly?  I can understand
   generally useful functionality being exposed in a way that anyone
   can use it, but we need to avoid interfaces which can't be stable
   due to normal / ongoing changes to the backend code.
  
  The functions my patches want to expose are:
   - get_restriction_qual_cost()
   - fix_expr_common()
 
 I'll try and find time to go look at these in more detail later this week.
 I have reservations about exposing the current estimates on costs as we
 may want to adjust them in the future- but such adjustments may need to
 be made in balance with other changes throughout the system and an external
 module which depends on one result from the qual costing might end up having
 problems with the costing changes because the extension author wasn't aware
 of the other changes happening in other areas of the costing.
 
That is my point as well. If the cost estimation logic is revised in
the future, it becomes a problem if an extension has cut and pasted the code.

 I'm talking about this from a beyond-just-the-GUCs point of view, I
 realize that the extension author could go look at the GUC settings, but
 it's entirely reasonable to believe we'll make changes to the default GUC
 settings along with how they're used in the future.
 
Is the GUC something like a Boolean that shows whether the new costing model
is applied or not? If so, the extension needs to keep two cost-estimation
code paths, doesn't it?
If the GUC is something like a weight, I also think it makes sense.

  And, the functions my patches newly want are:
   - bms_to_string()
   - bms_from_string()
 
 Offhand, these look fine, if there's really an external use for them.
 Will try to look at them in more detail later.
 
At least, it makes sense to carry a bitmap data structure in the private
field of a custom-scan node, because every plan node has to be safe to
handle in the copyObject() manner.
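
For illustration, a minimal sketch of what such helpers might look like;
the real patch may differ, and the textual format here (space-separated
member indexes) is purely my assumption:

#include "postgres.h"
#include "lib/stringinfo.h"
#include "nodes/bitmapset.h"

char *
bms_to_string(const Bitmapset *bms)
{
    StringInfoData buf;
    Bitmapset  *tmp = bms_copy(bms);    /* bms_first_member() is destructive */
    int         x;

    initStringInfo(&buf);
    while ((x = bms_first_member(tmp)) >= 0)
        appendStringInfo(&buf, " %d", x);
    bms_free(tmp);
    return buf.data;                    /* e.g. " 3 5 12" */
}

Bitmapset *
bms_from_string(const char *str)
{
    Bitmapset  *result = NULL;
    const char *p = str;

    while (*p)
    {
        char   *end;
        long    v = strtol(p, &end, 10);

        if (end == p)
            break;                      /* no more integers to parse */
        result = bms_add_member(result, (int) v);
        p = end;
    }
    return result;
}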

   That's fine, if we can get data to and from those co-processors
   efficiently enough that it's worth doing so.  If moving the data to
   the GPU's memory will take longer than running the actual
   aggregation, then it doesn't make any sense for regular tables
   because then we'd have to cache the data in the GPU's memory in some
   way across multiple queries, which isn't something we're set up to do.
  
  When I made a prototype implementation on top of FDW, using CUDA, it
  could run a sequential scan 10 times faster than SeqScan on regular
  tables, if the qualifiers are complex enough.
  The libraries used to communicate with GPUs (OpenCL/CUDA) have an
  asynchronous data-transfer mode using hardware DMA. It allows hiding
  the cost of data transfer

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-26 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  I (plan to) use custom-scan, of course. Once a relation is referenced
  and the optimizer decides GPU acceleration is cheaper, the associated
  custom-scan node reads the data from the underlying relation (or from an
  in-memory cache if one exists), then moves it into a shared memory
  buffer for delivery to the GPU-management background worker, which
  launches asynchronous DMA transfers one by one.
  After that, the custom-scan node receives the filtered records via the
  shared-memory buffer, so it can construct the tuples to be returned to
  the upper node.
 
 Alright- but have you discussed this with Robert?  We're going to be
 whacking things around for parallel support with new nodes and more built-in
 helper functionality for doing this work and I'm not anxious to have
 CustomScan end up being a legacy interface that we're required to pull
 forward because we accepted it before things had settled.
 
I briefly introduced my GPU idea to him at Ottawa last year, though
I'm not certain he remembers it.
At least, the idea of the custom-scan node came from the discussion at
that time.

  The custom-scan API is a thin abstraction over the plan node
  interface, not tightly coupled to a particular use case like GPU,
  remote join, and so on. So, I'm quite optimistic about its future
  maintainability.
 
 I don't see how you can be when there hasn't been any discussion that I've
 seen about how parallel query execution is going to change things for us.
 
If parallel query execution changes the whole structure of plan nodes,
it will also affect the custom-scan interface, because it is a thin
abstraction of a plan node. However, if the parallel execution feature is
implemented as a new plan node in addition to the existing ones, I cannot
imagine a scenario that affects the structure of other plan nodes.

  Also, please recall the discussion at the last developer meeting.
  The purpose of custom-scan (we didn't name it at that time) is to
  avoid unnecessary project forks by people who want to implement
  their own special features when no facilities to extend the
  optimizer/executor are available.
  Even once we have an in-core CPU parallel execution feature, it still
  makes sense to allow unique implementations that may be better suited
  for a specific domain.
 
 The issue here is that we're going to be expected to maintain an interface
 once we provide it and so that isn't something we should be doing lightly.
 Particularly when it's as involved as this kind of change is with what's
 going on in the backend where we are nearly 100% sure to be changing things
 in the next release or two.
 
The FDW APIs have also been revised several times in recent releases. If
we could design the perfect interface from the beginning, that would be
best, but it is usually hard to do.
Also, the custom-scan interface is almost symmetric with the existing
plan node structures, so its stability is relatively high, I think.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Ashutosh Bapat
On Sun, Feb 23, 2014 at 6:54 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 Folks,

 Let me remind you of the custom-scan patches; they are a base feature for
 postgres_fdw's remote join, cache-only scan, the (upcoming) GPU
 acceleration feature, and various other alternative ways to scan/join
 relations. Unfortunately, we could have only a small amount of discussion
 in this commit fest, even though Hanada-san volunteered to move the
 patches into the 'ready for committer' state at the November CF.


Sorry for jumping into this late.
Instead of a custom node, it might be a better idea to improve the FDW
infrastructure to push down joins. For starters, is it possible for the
custom scan node hooks to create a ForeignScan node? In general, I think
it might be better for the custom scan hooks to create existing nodes if
those serve the purpose.



 Before time is up, I'd like to ask hackers' opinions about the potentially
 arguable points (from my standpoint), in case they need to be fixed up.
 One is the hook definition to add alternative join paths, and the other
 is a special varno used when a custom scan replaces a join node.
 I'd like to hear your opinions about them while we can still change
 the design if needed.

 (1) Interface to add alternative paths in addition to built-in join paths

 This patch adds add_join_path_hook in add_paths_to_joinrel to allow
 extensions to provide alternative scan paths in addition to the built-in
 join paths. The custom-scan path being added is assumed to scan a
 (virtual) relation that is the result set of the joined relations.
 My concern is which arguments should be passed down. This hook is
 declared as follows:

 /* Hook for plugins to add custom join path, in addition to default ones */
 typedef void (*add_join_path_hook_type)(PlannerInfo *root,
 RelOptInfo *joinrel,
 RelOptInfo *outerrel,
 RelOptInfo *innerrel,
 JoinType jointype,
 SpecialJoinInfo *sjinfo,
 List *restrictlist,
 List *mergeclause_list,
 SemiAntiJoinFactors *semifactors,
 Relids param_source_rels,
 Relids extra_lateral_rels);
 extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

 Likely, the arguments above restrictlist should be passed to extensions,
 because these are also arguments of add_paths_to_joinrel().
 However, I'm not 100% certain how the other arguments should be handled.
 It probably makes sense to pass param_source_rels and extra_lateral_rels
 so extensions can check whether the path is sensible for parameterized
 paths.
 On the other hand, I doubt whether mergeclause_list is useful to deliver.
 (It may make sense if someone tries to provide their own merge-join
 implementation??)

 I'd like to see ideas for improving the current interface specification.


Since a custom node is an open implementation, it will be important to pass
as much information down to the hooks as possible, lest the hooks be
constrained. Since the function signatures within the planner and optimizer
will change from time to time, the custom node hook signatures will need
to change from time to time as well. That might turn out to be a
maintenance overhead.

BTW, is it a good idea for custom nodes to also affect other paths like
append, group etc.? Will it need separate hooks for each of those?



 (2) CUSTOM_VAR for special Var reference

 @@ -134,6 +134,7 @@ typedef struct Expr
  #define INNER_VAR   65000   /* reference to inner subplan */
  #define OUTER_VAR   65001   /* reference to outer subplan */
  #define INDEX_VAR   65002   /* reference to index column */
 +#define CUSTOM_VAR  65003   /* reference to custom column */

 I newly added CUSTOM_VAR to handle the case when a custom scan overrides
 join relations.
 Var nodes within a join plan are adjusted to refer to either
 ecxt_innertuple or ecxt_outertuple of the ExprContext. That causes
 trouble if a custom scan runs instead of a built-in join, because the
 tuples it fetches are usually stored in ecxt_scantuple, so the Var nodes
 also need the right varno, which is neither inner nor outer.

 The SetPlanRefCustomScan callback, invoked from set_plan_refs, allows
 extensions to rewrite Var nodes within the custom-scan node to point at
 ecxt_scantuple using CUSTOM_VAR, instead of inner or outer.
 For example, a Var node with varno=CUSTOM_VAR and varattno=3 means the
 node references the third attribute of the tuple in ecxt_scantuple.
 I think it is a reasonable solution; however, I'm not 100% certain
 whether people have a more graceful idea for implementing it.
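
 As an illustration, the Var rewriting inside such a callback might look
 roughly like this. This is hypothetical glue code, not the patch itself;
 in particular, lookup_scan_tlist_attno() stands in for whatever mapping
 the provider builds from its scan targetlist:

 static Node *
 replace_vars_with_custom_var(Node *node, void *context)
 {
     if (node == NULL)
         return NULL;
     if (IsA(node, Var))
     {
         Var    *var = (Var *) copyObject(node);

         var->varno = CUSTOM_VAR;
         /* hypothetical helper: map the original Var onto the scan tlist */
         var->varattno = lookup_scan_tlist_attno(context, (Var *) node);
         return (Node *) var;
     }
     return expression_tree_mutator(node, replace_vars_with_custom_var,
                                    context);
 }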

 If you have comments around above two topic, or others, please give
 your ideas.

 Thanks,

 2014-01-28 9:14 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
  

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Kouhei Kaigai
 Sorry for jumping into this late.
 
 Instead of a custom node, it might be a better idea to improve the FDW
 infrastructure to push down joins. For starters, is it possible for the
 custom scan node hooks to create a ForeignScan node? In general, I think
 it might be better for the custom scan hooks to create existing nodes if
 those serve the purpose.

It does not work well, because the existing FDW infrastructure is designed
to operate on foreign tables, not regular tables. We would probably need to
revise many of the assumptions in the surrounding code if we redefined the
purpose of the FDW infrastructure. For example, a ForeignScan is expected
to return tuples according to a TupleDesc that is exactly the same as the
table definition. That does not fit the requirement if we replace a join
node with a ForeignScan, because the TupleDesc of the joined relations is
not predefined.

I'd like to say that these features are designed for individual purposes.
FDW is designed to mediate between an external data source and the internal
heap representation according to a foreign table definition. In other words,
its role is to generate the contents of a predefined database object on
the fly.
On the other hand, custom-scan is designed to implement alternative ways
to scan / join relations in addition to the methods supported by the
built-in features.

I'm motivated to implement a GPU acceleration feature that works
transparently for applications. Thus, it has to be usable on regular
tables, because most applications store their data in regular tables,
not foreign ones.

 Since a custom node is an open implementation, it will be important to
 pass as much information down to the hooks as possible, lest the hooks be
 constrained. Since the function signatures within the planner and
 optimizer will change from time to time, the custom node hook signatures
 will need to change from time to time as well. That might turn out to be
 a maintenance overhead.
 
Yes, you are right as well. But it also creates maintenance overhead if
the hook has many arguments nobody uses.
It probably makes sense to classify the arguments into those that cannot
be reproduced from other information, those that can be reproduced but
only through complicated steps, and those that can be reproduced easily.

Below is the information we cannot reproduce:
 - PlannerInfo *root
 - RelOptInfo *joinrel
 - RelOptInfo *outerrel
 - RelOptInfo *innerrel
 - JoinType jointype
 - SpecialJoinInfo *sjinfo
 - List *restrictlist

Below is the information we can reproduce, but only through complicated steps:
 - List *mergeclause_list
 - bool mergejoin_allow
 - Relids param_source_rels
 - Relids extra_lateral_rels

Below is the information we can reproduce easily:
 - SemiAntiJoinFactors *semifactors

I think the first two categories, or just the first one (if functions to
reproduce the second group are exposed), should be passed to extensions;
the priority of the third group, however, is not high.


 BTW, is it a good idea for custom nodes to also affect other paths like
 append, group etc.? Will it need separate hooks for each of those?

Yes. I plan to support the plan nodes above, in addition to scan / join
only. The custom-scan node is a thin abstraction of general executor
behavior, so I believe it is not hard to extend this node to cover them
without a new plan node for each one.
Of course, it will need a separate hook to add alternative paths at the
planner stage, but no individual plan nodes. (Sorry, it was unclear to me
what 'the hook' meant.)

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com]
 Sent: Tuesday, February 25, 2014 5:59 PM
 To: Kohei KaiGai
 Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Shigeru Hanada; Jim
 Mlodgenski; Robert Haas; Tom Lane; PgHacker; Peter Eisentraut
 Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 
 
 
 On Sun, Feb 23, 2014 at 6:54 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 
 
   Folks,
 
   Let me remind the custom-scan patches; that is a basis feature of
   remote join of postgres_fdw, cache-only scan, (upcoming) GPU
   acceleration feature or various alternative ways to scan/join
 relations.
   Unfortunately, small amount of discussion we could have in this
 commit
   fest, even though Hanada-san volunteered to move the patches into
   ready for committer state at the CF-Nov.
 
 
 
 Sorry for jumping into this late.
 
 Instead of custom node, it might be better idea to improve FDW infrastructure
 to push join. For the starters, is it possible for the custom scan node
 hooks to create a ForeignScan node? In general, I think, it might be better
 for the custom scan hooks to create existing nodes if they serve the purpose.
 
 
 
 
   Prior to time-up, I'd like to ask hacker's opinion about its
 potential
   arguable points (from my standpoint) if it needs to be fixed up.
   One is hook definition to add alternative join path, and the other
 one
   is a special varno when a custom scan replace

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Ashutosh Bapat
 weight. I was
expecting more of a list of hooks to be defined by the user, with this
infrastructure just calling them at appropriate places.


 Thanks,
 --
 NEC OSS Promotion Center / PG-Strom Project
 KaiGai Kohei kai...@ak.jp.nec.com


  -Original Message-
  From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com]
  Sent: Tuesday, February 25, 2014 5:59 PM
  To: Kohei KaiGai
  Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Shigeru Hanada; Jim
  Mlodgenski; Robert Haas; Tom Lane; PgHacker; Peter Eisentraut
  Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 
 
 
  On Sun, Feb 23, 2014 at 6:54 PM, Kohei KaiGai kai...@kaigai.gr.jp
 wrote:
 
 
Folks,
 
Let me remind the custom-scan patches; that is a basis feature of
remote join of postgres_fdw, cache-only scan, (upcoming) GPU
acceleration feature or various alternative ways to scan/join
  relations.
Unfortunately, small amount of discussion we could have in this
  commit
fest, even though Hanada-san volunteered to move the patches into
ready for committer state at the CF-Nov.
 
 
 
  Sorry for jumping into this late.
 
  Instead of custom node, it might be better idea to improve FDW
 infrastructure
  to push join. For the starters, is it possible for the custom scan node
  hooks to create a ForeignScan node? In general, I think, it might be
 better
  for the custom scan hooks to create existing nodes if they serve the
 purpose.
 
 
 
 
Prior to time-up, I'd like to ask hacker's opinion about its
  potential
arguable points (from my standpoint) if it needs to be fixed up.
One is hook definition to add alternative join path, and the other
  one
is a special varno when a custom scan replace a join node.
I'd like to see your opinion about them while we still have to
 change
the design if needed.
 
(1) Interface to add alternative paths in addition to built-in join
  paths
 
 
This patch adds add_join_path_hook on add_paths_to_joinrel to
  allow
extensions to provide alternative scan path in addition to the
  built-in
join paths. Custom-scan path being added is assumed to perform to
  scan
on a (virtual) relation that is a result set of joining relations.
My concern is its arguments to be pushed. This hook is declared
  as follows:
 
/* Hook for plugins to add custom join path, in addition to default
  ones */
typedef void (*add_join_path_hook_type)(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo
  *sjinfo,
List *restrictlist,
List *mergeclause_list,
SemiAntiJoinFactors
  *semifactors,
Relids
  param_source_rels,
Relids
  extra_lateral_rels);
extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;
 
Likely, its arguments upper than restrictlist should be informed
  to extensions,
because these are also arguments of add_paths_to_joinrel().
However, I'm not 100% certain how about other arguments should be
  informed.
Probably, it makes sense to inform param_source_rels and
  extra_lateral_rels
to check whether the path is sensible for parameterized paths.
On the other hand, I doubt whether mergeclause_list is usuful to
  deliver.
(It may make sense if someone tries to implement their own
  merge-join
implementation??)
 
I'd like to seem idea to improve the current interface
  specification.
 
 
 
 
  Since a custom node is open implementation, it will be important to pass
  as much information down to the hooks as possible; lest the hooks will be
  constrained.  Since the functions signatures within the planner,
 optimizer
  will change from time to time, so the custom node hook signatures will
 need
  to change from time to time. That might turn out to be maintenance
 overhead.
 
 
  BTW, is it a good idea for custom nodes to also affect other paths like
  append, group etc.? Will it need separate hooks for each of those?
 
 
 
 
(2) CUSTOM_VAR for special Var reference
 
@@ -134,6 +134,7 @@ typedef struct Expr
 #defineINNER_VAR   65000   /* reference to inner
  subplan */
 #defineOUTER_VAR   65001   /* reference to outer
  subplan */
 #defineINDEX_VAR   65002   /* reference to index
  column */
+#defineCUSTOM_VAR  65003   /* reference to custom
  column

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Kohei KaiGai
 to believe these
arguments are among the most stable parts.

 Below is the information we can reproduce, but only through complicated steps:
  - List *mergeclause_list
  - bool mergejoin_allow
  - Relids param_source_rels
  - Relids extra_lateral_rels

 Below is the information we can reproduce easily:
  - SemiAntiJoinFactors *semifactors

 I think the first two categories, or just the first one (if functions to
 reproduce the second group are exposed), should be passed to extensions;
 the priority of the third group, however, is not high.


  BTW, is it a good idea for custom nodes to also affect other paths like
  append, group etc.? Will it need separate hooks for each of those?
 
 Yes. I plan to support the plan nodes above, in addition to scan / join
 only. The custom-scan node is a thin abstraction of general executor
 behavior, so I believe it is not hard to extend this node to cover them
 without a new plan node for each one.
 Of course, it will need a separate hook to add alternative paths at the
 planner stage, but no individual plan nodes. (Sorry, it was unclear to me
 what 'the hook' meant.)


 If we represent all the operations like grouping, sorting, and
 aggregation as some sort of relation, we can create paths for each such
 relation like we do today (I am heavily borrowing from Tom's idea of
 pathifying those operations). We would need far fewer hooks in the
 custom scan node.

 BTW, from the patch, I do not see this change as lightweight. I was
 expecting more of a list of hooks to be defined by the user, with this
 infrastructure just calling them at appropriate places.

Let's focus on the scan and join support that we are currently working on.
Even if we need a separate node type for grouping or sorting, it will not
be necessary to construct the whole framework from scratch.
For example, the CustomProvider table definition can be reused for other
classes of operations, because most of them are thin abstractions of the
existing executor interface.
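
For illustration, a minimal sketch of what such a provider callback table
might look like; the field names here are my own assumptions, not the
actual structure in the patch:

typedef struct CustomProvider
{
    const char *name;   /* provider identifier given at path creation */

    /* executor-facing callbacks, mirroring the built-in scan interface */
    void            (*BeginCustomScan) (CustomScanState *node, int eflags);
    TupleTableSlot *(*ExecCustomScan) (CustomScanState *node);
    void            (*ReScanCustomScan) (CustomScanState *node);
    void            (*EndCustomScan) (CustomScanState *node);
} CustomProvider;

Grouping or sorting support would then mostly mean reusing a table like
this with additional path-construction hooks, rather than inventing new
plan nodes.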

Thanks,

 Thanks,
 --
 NEC OSS Promotion Center / PG-Strom Project
 KaiGai Kohei kai...@ak.jp.nec.com


  -Original Message-
  From: Ashutosh Bapat [mailto:ashutosh.ba...@enterprisedb.com]
  Sent: Tuesday, February 25, 2014 5:59 PM
  To: Kohei KaiGai
  Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Shigeru Hanada; Jim
  Mlodgenski; Robert Haas; Tom Lane; PgHacker; Peter Eisentraut
  Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 
 
 
  On Sun, Feb 23, 2014 at 6:54 PM, Kohei KaiGai kai...@kaigai.gr.jp
  wrote:
 
 
Folks,
 
Let me remind the custom-scan patches; that is a basis feature of
remote join of postgres_fdw, cache-only scan, (upcoming) GPU
acceleration feature or various alternative ways to scan/join
  relations.
Unfortunately, small amount of discussion we could have in this
  commit
fest, even though Hanada-san volunteered to move the patches into
ready for committer state at the CF-Nov.
 
 
 
  Sorry for jumping into this late.
 
  Instead of custom node, it might be better idea to improve FDW
  infrastructure
  to push join. For the starters, is it possible for the custom scan node
  hooks to create a ForeignScan node? In general, I think, it might be
  better
  for the custom scan hooks to create existing nodes if they serve the
  purpose.
 
 
 
 
Prior to time-up, I'd like to ask hacker's opinion about its
  potential
arguable points (from my standpoint) if it needs to be fixed up.
One is hook definition to add alternative join path, and the other
  one
is a special varno when a custom scan replace a join node.
I'd like to see your opinion about them while we still have to
  change
the design if needed.
 
(1) Interface to add alternative paths in addition to built-in
  join
  paths
 
 
This patch adds add_join_path_hook on add_paths_to_joinrel to
  allow
extensions to provide alternative scan path in addition to the
  built-in
join paths. Custom-scan path being added is assumed to perform to
  scan
on a (virtual) relation that is a result set of joining relations.
My concern is its arguments to be pushed. This hook is declared
  as follows:
 
/* Hook for plugins to add custom join path, in addition to
  default
  ones */
typedef void (*add_join_path_hook_type)(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo
  *sjinfo,
List *restrictlist,
List *mergeclause_list,
SemiAntiJoinFactors
  *semifactors

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Shigeru Hanada
Hi Kaigai-san,

2014-02-25 13:28 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
 The reason why I asked the question above is that I haven't been 100%
 certain about its usage. Indeed, semifactors has limited usage, but it is
 quite easy for an extension to reproduce later (using
 clauselist_selectivity) if it wants this factor. So, I agree with
 removing the semifactors here.

Agreed.  It would be nice to mention how to obtain the semifactors for
people who want to implement advanced join overriding.

 mergeclause_list and param_source_rels seem a little easier to use, but
 in any case it should be documented how to use those parameters.

 The mergeclause_list might not be sufficient for an extension to
 determine whether its own mergejoin is applicable here. See the comment
 below, from the head of select_mergejoin_clauses:

 |  * *mergejoin_allowed is normally set to TRUE, but it is set to FALSE if
 |  * this is a right/full join and there are nonmergejoinable join clauses.
 |  * The executor's mergejoin machinery cannot handle such cases, so we have
 |  * to avoid generating a mergejoin plan.  (Note that this flag does NOT
 |  * consider whether there are actually any mergejoinable clauses.  This is
 |  * correct because in some cases we need to build a clauseless mergejoin.
 |  * Simply returning NIL is therefore not enough to distinguish safe from
 |  * unsafe cases.)
 |
 It says mergeclause_list == NIL is not a sufficient check to determine
 whether the mergejoin logic is applicable to the target join.
 So, either of these is probably an option for an extension that tries to implement

Perhaps you mean both of them?

 its own mergejoin logic: (1) putting both mergejoin_allowed and
 mergeclause_list as arguments of the hook, or (2) redefining
 select_mergejoin_clauses() as an extern function so the variables can be
 reproduced on demand. Which one is preferable?

I prefer (1), because exposing the inside of the planner might block
changes to those internal functions.  If (at the moment) that information
is enough for overriding merge join from a CSP, let's provide it as
parameters.
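
As a sketch, option (1) might extend the hook roughly like this (the exact
argument order is just an assumption; semifactors is dropped per the
discussion above):

typedef void (*add_join_path_hook_type) (PlannerInfo *root,
                                         RelOptInfo *joinrel,
                                         RelOptInfo *outerrel,
                                         RelOptInfo *innerrel,
                                         JoinType jointype,
                                         SpecialJoinInfo *sjinfo,
                                         List *restrictlist,
                                         bool mergejoin_allowed,
                                         List *mergeclause_list,
                                         Relids param_source_rels,
                                         Relids extra_lateral_rels);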


-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Shigeru Hanada
Hi Kaigai-san,

2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
 (1) Interface to add alternative paths in addition to built-in join paths

I found that create_customscan_path is not used at all in your patch.
I revised postgres_fdw.c to use it like this.

...
/* Create join information which is stored as private information. */
memset(&jinfo, 0, sizeof(PgRemoteJoinInfo));
jinfo.fdw_server_oid = o_server_oid;
jinfo.fdw_user_oid = o_user_oid;
jinfo.relids = joinrel->relids;
jinfo.jointype = jointype;
jinfo.outer_rel = o_relinfo;
jinfo.inner_rel = i_relinfo;
jinfo.remote_conds = j_remote_conds;
jinfo.local_conds = j_local_conds;

/* OK, make a CustomScan node to run remote join */
cpath = create_customscan_path(root,
                               joinrel,
                               0, 0, 0, /* estimate later */
                               NIL,
                               required_outer,
                               "postgres-fdw",
                               0,
                               packPgRemoteJoinInfo(&jinfo));

estimate_remote_join_cost(root, cpath, &jinfo, sjinfo);

add_path(joinrel, &cpath->path);
...

This seems to work fine.  Is this the right approach?  If so, this portion
would be a good example of replacing a local join with a custom scan for
authors of custom scan providers.  One thing I worry about is the
possibility that you've intentionally avoided calling create_customscan_path.

-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Instead of a custom node, it might be a better idea to improve the FDW
  infrastructure to push down joins. For starters, is it possible for the
  custom scan node hooks to create a ForeignScan node? In general, I think
  it might be better for the custom scan hooks to create existing nodes if
  those serve the purpose.
 
 It does not work well, because the existing FDW infrastructure is designed
 to operate on foreign tables, not regular tables. We would probably need
 to revise many of the assumptions in the surrounding code if we redefined
 the purpose of the FDW infrastructure. For example, a ForeignScan is
 expected to return tuples according to a TupleDesc that is exactly the
 same as the table definition. That does not fit the requirement if we
 replace a join node with a ForeignScan, because the TupleDesc of the
 joined relations is not predefined.

I'm not following this logic at all- how are you defining foreign from
regular?  Certainly, in-memory-only tables which are sitting out in
some non-persistent GPU memory aren't regular by any PG definition.
Perhaps you can't make ForeignScan suddenly work as a join-node
replacement, but I've not seen where anyone has proposed that (directly-
I've implied it on occasion where a remote view can be used, but that's
not the same thing as having proper push-down support for joins).

 I'd like to say that these features are designed for individual purposes.

My previous complaint about this patch set has been precisely that each
piece seems to be custom-built and every patch needs more and more
backend changes.  If every time someone wants to do something with this
CustomScan API, they need changes made to the backend code, then it's
not a generally useful external API.  We really don't want to define
such an external API as then we have to deal with backwards
compatibility, particularly when it's all specialized to specific use
cases which are all different.

 FDW is designed to intermediate an external data source and internal heap
 representation according to foreign table definition. In other words, its
 role is to generate contents of predefined database object on the fly.

There's certainly nothing in the FDW API which requires that the remote
side have an internal heap representation, as evidenced by the various
FDWs which already exist and certainly are not any kind of 'normal'
heap.  Every query against the foreign relation goes through the FDW API
and can end up returning whatever the FDW author decides is appropriate
to return at that time, as long as it matches the tuple description-
which is absolutely necessary for any kind of sanity, imv.

 On the other hands, custom-scan is designed to implement alternative ways
 to scan / join relations in addition to the methods supported by built-in
 feature.

I can see the usefulness in being able to push down aggregates or other
function-type calls to the remote side of an FDW and would love to see
work done along those lines, along with the ability to push down joins
to remote systems- but I'm not convinced that the claimed flexibility
with the CustomScan API is there, given the need to continue modifying
the backend code for each use-case, nor that there are particularly new
and inventive ways of saying find me all the cases where set X overlaps
with set Y.  I'm certainly open to the idea that we could have an FDW
API which allows us to ask exactly that question and let the remote side
cost it out and give us an answer for a pair of relations but that isn't
what this is.  Note also that in any kind of aggregation push-down we
must be sure that the function is well-defined and that the FDW is on
the hook to ensure that the returned data is the same as if we ran the
same aggregate function locally, otherwise the results of a query might
differ based on if the aggregate was fired locally or remotely (which
could be influenced by costing- eg: the size of the relation or its
statistics).

 I'm motivated to implement a GPU acceleration feature that works
 transparently for applications. Thus, it has to be usable on regular
 tables, because most applications store their data in regular tables,
 not foreign ones.

You want to persist that data in the GPU across multiple calls though,
which makes it unlike any kind of regular PG table and much more like
some foreign table.  Perhaps the data is initially loaded from a local
table and then updated on the GPU card in some way when the 'real' table
is updated, but neither of those makes it a regular PG table.

  Since a custom node is an open implementation, it will be important to
  pass as much information down to the hooks as possible, lest the hooks
  be constrained.  Since the function signatures within the planner and
  optimizer will change from time to time, the custom node hook signatures
  will need to change from time to time as well.  That might turn out to
  be a maintenance overhead.

It's more than from time-to-time, it was for each use case in the
given 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 Yes, the part-1 patch provides the interface portion for interaction
 between the backend code and extension code. The rest, the part-2 and
 part-3 portions, are contrib modules that implement their features on
 top of the custom-scan API.

Just to come back to this- the other two contrib module patches, at
least as I read over their initial submission, were *also* patching
portions of backend code which it was apparently discovered that they
needed.  That's a good bit of my complaint regarding this approach.

 FDW join push-down is one of the valuable use-cases of this interface,
 but not the only one. As you might know, my motivation is to implement a
 GPU acceleration feature on top of this interface, which offers
 alternative ways to scan or join relations, and potentially to sort or
 aggregate.

If you're looking to just use GPU acceleration for improving individual
queries, I would think that Robert's work around backend workers would
be a more appropriate way to go, with the ability to move a working set
of data from shared buffers and on-disk representation of a relation
over to the GPU's memory, perform the operation, and then copy the
results back.  If that's not possible or effective wrt performance, then
I think we need to look at managing the external GPU memory as a foreign
system through an FDW which happens to be updated through triggers or
similar.  The same could potentially be done for memcached systems, etc.

regular PG tables, just to point out one issue, can be locked on a
row-by-row basis, and we know exactly where in shared buffers to go hunt
down the rows.  How is that going to work here, if this is both a
regular table and stored off in a GPU's memory across subsequent
queries or even transactions?

 Right now, I put all the logic for interaction between the CSI and the
 FDW driver on the postgres_fdw side; it might be an idea to have common
 code (like the logic to check whether both relations to be joined belong
 to the same foreign server) on the backend side, as something like a
 gateway for them.

Yes, that's what I was suggesting above- we should be asking the FDWs on
a case-by-case basis how to cost out the join between foreign tables
which they are responsible for.  Asking two different FDWs servers to
cost out a join between their tables doesn't make any sense to me.

 As an aside, what should be the scope of the FDW interface?
 In my understanding, it allows an extension to implement something on
 behalf of a particular data structure declared with CREATE FOREIGN TABLE.

That's where it is today, but certainly not our end goal.

 In other words, the extension's responsibility is to generate a view of
 something according to PostgreSQL's internal data structures, instead of
 the object itself.

The result of the FDW call needs to be something which PG understands
and can work with, otherwise we wouldn't be able to, say, run PL/pgsql
code on the result, or pass it into some other aggregate which we
decided was cheaper to run locally.  Being able to push down aggregates
to the remote side of an FDW certainly fits in quite well with that.

 On the other hand, the custom-scan interface allows extensions to
 implement alternative methods to scan or join particular relations, but
 its role is not to serve as a target referenced in queries. In other
 words, it provides methods to access objects.

The custom-scan interface still needs to produce something according
to PG's internal data structures, so it's not clear to me where you're
going with this.

 It is natural that both features are similar, because both of them let
 extensions hook the planner and executor; however, their purposes are
 different.

I disagree as I don't really view FDWs as hooks.  A hook is more
like a trigger- sure, you can modify the data in transit, or throw an
error if you see an issue, but you don't get to redefine the world and
throw out what the planner or optimizer knows about the rest of what is
going on in the query.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
   Instead of a custom node, it might be a better idea to improve the FDW
   infrastructure to push down joins. For starters, is it possible for
   the custom scan node hooks to create a ForeignScan node? In general,
   I think it might be better for the custom scan hooks to create
   existing nodes if those serve the purpose.
  
  It does not work well, because the existing FDW infrastructure is
  designed to operate on foreign tables, not regular tables. We would
  probably need to revise many of the assumptions in the surrounding code
  if we redefined the purpose of the FDW infrastructure. For example, a
  ForeignScan is expected to return tuples according to a TupleDesc that
  is exactly the same as the table definition. That does not fit the
  requirement if we replace a join node with a ForeignScan, because the
  TupleDesc of the joined relations is not predefined.
 
 I'm not following this logic at all- how are you defining foreign from
 regular?  Certainly, in-memory-only tables which are sitting out in some
 non-persistent GPU memory aren't regular by any PG definition.
 Perhaps you can't make ForeignScan suddenly work as a join-node replacement,
 but I've not seen where anyone has proposed that (directly- I've implied
 it on occasion where a remote view can be used, but that's not the same
 thing as having proper push-down support for joins).
 
By 'regular' I mean usual tables. Even though a custom implementation
may reference a self-managed in-memory cache instead of the raw heap, the
table referenced in the user's query shall be a usual table.
In the past, Hanada-san had proposed an enhancement of FDW to support
remote join, but it was eventually rejected.

  I'd like to say that these features are designed for individual purposes.
 
 My previous complaint about this patch set has been precisely that each
 piece seems to be custom-built and every patch needs more and more backend
 changes.  If every time someone wants to do something with this CustomScan
 API, they need changes made to the backend code, then it's not a generally
 useful external API.  We really don't want to define such an external API
 as then we have to deal with backwards compatibility, particularly when
 it's all specialized to specific use cases which are all different.
 
The changes to the backend are just for convenience. We could implement
the functions to translate a Bitmapset from/to cstring form inside
postgres_fdw, but does it make sense to maintain them individually?
I thought these functions were useful to have commonly in the backend;
they are not a fundamental functionality missing from the custom-scan
interface.

  FDW is designed to mediate between an external data source and the
  internal heap representation according to a foreign table definition.
  In other words, its role is to generate the contents of a predefined
  database object on the fly.
 
 There's certainly nothing in the FDW API which requires that the remote
 side have an internal heap representation, as evidenced by the various FDWs
 which already exist and certainly are not any kind of 'normal'
heap.  Every query against the foreign relation goes through the FDW API
 and can end up returning whatever the FDW author decides is appropriate
 to return at that time, as long as it matches the tuple description- which
 is absolutely necessary for any kind of sanity, imv.
 
Yes. That matches my understanding of the role of an FDW driver.

  On the other hand, custom-scan is designed to implement alternative
  ways to scan / join relations in addition to the methods supported by
  the built-in features.
 
 I can see the usefulness in being able to push down aggregates or other
 function-type calls to the remote side of an FDW and would love to see work
 done along those lines, along with the ability to push down joins to remote
 systems- but I'm not convinced that the claimed flexibility with the
 CustomScan API is there, given the need to continue modifying the backend
 code for each use-case, nor that there are particularly new and inventive
 ways of saying find me all the cases where set X overlaps with set Y.
 I'm certainly open to the idea that we could have an FDW API which allows
 us to ask exactly that question and let the remote side cost it out and
 give us an answer for a pair of relations but that isn't what this is.  Note
 also that in any kind of aggregation push-down we must be sure that the
 function is well-defined and that the FDW is on the hook to ensure that
 the returned data is the same as if we ran the same aggregate function 
 locally,
 otherwise the results of a query might differ based on if the aggregate
 was fired locally or remotely (which could be influenced by costing- eg:
 the size of the relation or its statistics).
 
I can also understand the usefulness of pushing joins or aggregation to
the remote side in the case of foreign table references. In a similar way,
it is also useful if we can push these CPU-intensive operations to
co-processors for regular table references.
As I mentioned above, 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Stephen Frost
* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 By 'regular' I mean usual tables. Even though a custom implementation
 may reference a self-managed in-memory cache instead of the raw heap, the
 table referenced in the user's query shall be a usual table.
 In the past, Hanada-san had proposed an enhancement of FDW to support
 remote join, but it was eventually rejected.

I'm not aware of the specifics around that proposal but I don't believe
we, as a community, have decided to reject the idea in general.

 The changes to the backend are just for convenience. We could implement
 the functions to translate a Bitmapset from/to cstring form inside
 postgres_fdw, but does it make sense to maintain them individually?

Perhaps not.

 I thought these functions were useful to have commonly in the backend;
 they are not a fundamental functionality missing from the custom-scan
 interface.

Then perhaps they should be exposed more directly?  I can understand
generally useful functionality being exposed in a way that anyone can
use it, but we need to avoid interfaces which can't be stable due to
normal / ongoing changes to the backend code.

 I can also understand the usefulness of pushing joins or aggregation to
 the remote side in the case of foreign table references. In a similar
 way, it is also useful if we can push these CPU-intensive operations to
 co-processors for regular table references.

That's fine, if we can get data to and from those co-processors
efficiently enough that it's worth doing so.  If moving the data to the
GPU's memory will take longer than running the actual aggregation, then
it doesn't make any sense for regular tables because then we'd have to
cache the data in the GPU's memory in some way across multiple queries,
which isn't something we're set up to do.

 As I mentioned above, the backend changes in the part-2/-3 patches are
 just minor stuff, and I thought they should not be implemented locally
 in a contrib module.

Fine- then propose them as generally useful additions, not as patches
which are supposed to just be for contrib modules using an already
defined interface.  If you can make a case for that then perhaps this is
more practical.

 Regarding the conditions under which we can run remote aggregation, you
 are right. As the current postgres_fdw pushes down qualifiers to the
 remote side, we need to ensure the remote aggregate definition is
 identical to the local one.

Of course.

 No. What I want to implement is: read the regular table, transfer the
 contents into the GPU's local memory for calculation, then receive the
 calculation results. The in-memory cache (which I'm also working on) is
 supplemental, because disk access is much slower and a row-oriented data
 structure is not suitable for SIMD-style instructions.

Is that actually performant?  Is it actually faster than processing the
data directly?  The discussions that I've had with folks have cast a
great deal of doubt in my mind about just how well that kind of quick
turn-around to the GPU's memory actually works.

  This really strikes me as the wrong approach for an FDW join-pushdown API,
  which should be geared around giving the remote side an opportunity on a
  case-by-case basis to cost out joins using whatever methods it has available
  to implement them.  I've outlined above the reasons I don't agree with just
  making the entire planner/optimizer pluggable.
  
 I'm also inclined to have arguments that will provide enough information
 for extensions to determine the best path for them.

For join push-down, I proposed above that we have an interface to the
FDW which allows us to ask it how much each join of the tables which are
on a given FDW's server would cost if the FDW did it vs. pulling it back
and doing it locally.  We could also pass all of the relations to the
FDW with the various join-quals and try to get an answer to everything,
but I'm afraid that'd simply end up duplicating the logic of the
optimizer into every FDW, which would be counter-productive.
Admittedly, getting the costing right isn't easy either, but it's not
clear to me how it'd make sense for the local server to be doing costing
for remote servers.
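
As a sketch, such an interface might look something like the following
hypothetical callback (not an API that exists today; the signature is
purely illustrative):

/* Ask the FDW to cost a join between two relations it is responsible for */
typedef void (*EstimateForeignJoinCost_function) (PlannerInfo *root,
                                                  RelOptInfo *joinrel,
                                                  RelOptInfo *outerrel,
                                                  RelOptInfo *innerrel,
                                                  JoinType jointype,
                                                  List *joinquals,
                                                  Cost *startup_cost,
                                                  Cost *total_cost);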

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-25 Thread Kouhei Kaigai
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Yes, the part-1 patch provides the interface portion for interaction
  between the backend code and extension code. The rest, the part-2 and
  part-3 portions, are contrib modules that implement their features on
  top of the custom-scan API.
 
 Just to come back to this- the other two contrib module patches, at least
 as I read over their initial submission, were *also* patching portions of
 backend code which it was apparently discovered that they needed.  That's
 a good bit of my complaint regarding this approach.
 
?? Sorry, are you still negative about the portions of the backend patched
by the part-2 and part-3 patches??

  FDW join push-down is one of the valuable use-cases of this interface,
  but not the only one. As you might know, my motivation is to implement
  a GPU acceleration feature on top of this interface, which offers
  alternative ways to scan or join relations, and potentially to sort or
  aggregate.
 
 If you're looking to just use GPU acceleration for improving individual
 queries, I would think that Robert's work around backend workers would be
 a more appropriate way to go, with the ability to move a working set of
 data from shared buffers and on-disk representation of a relation over to
 the GPU's memory, perform the operation, and then copy the results back.

The approach is similar to Robert's work, except that it adopts GPUs
instead of multicore CPUs. So, I have tried to review his work in order
to apply those facilities to my extension as well.

 If that's not possible or effective wrt performance, then I think we need
 to look at managing the external GPU memory as a foreign system through
 an FDW which happens to be updated through triggers or similar.  The same
 could potentially be done for memcached systems, etc.
 
I hadn't imagined the idea of exposing the GPU's local memory.
The supplemental piece I'm planning for data-load performance is just a
cache mechanism alongside regular tables.

 regular PG tables, just to point out one issue, can be locked on a
 row-by-row basis, and we know exactly where in shared buffers to go hunt
 down the rows.  How is that going to work here, if this is both a regular
 table and stored off in a GPU's memory across subsequent queries or even
 transactions?
 
It shall be handled on a case-by-case basis, I think. If a row-level lock
is required over the table scan, the custom-scan node shall return a tuple
located in the shared buffer, instead of the cached tuples. Of course, it
is an option for the custom-scan node to evaluate qualifiers on the GPU
with cached data and then return tuples identified by the ctid of the
cached tuples. Anyway, it is not a significant problem.
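
For illustration, the ctid-based variant could follow the usual TID-scan
pattern, roughly like this (hypothetical glue; scanrel, snapshot,
cached_ctid, and slot are assumed to live in the provider's scan state):

HeapTupleData tuple;
Buffer        buffer;

tuple.t_self = cached_ctid;   /* ctid remembered alongside the cached copy */
if (heap_fetch(scanrel, snapshot, &tuple, &buffer, false, NULL))
{
    /* ExecStoreTuple takes its own pin on the buffer, so drop ours */
    ExecStoreTuple(&tuple, slot, buffer, false);
    ReleaseBuffer(buffer);
}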

  Right now, I put all the logic for interaction between the CSI and the
  FDW driver on the postgres_fdw side; it might be an idea to have common
  code (like the logic to check whether both relations to be joined
  belong to the same foreign server) on the backend side, as something
  like a gateway for them.
 
 Yes, that's what I was suggesting above- we should be asking the FDWs on
 a case-by-case basis how to cost out the join between foreign tables which
 they are responsible for.  Asking two different FDWs servers to cost out
 a join between their tables doesn't make any sense to me.
 
OK, I'll move the portion that will be needed commonly for other FDWs into
the backend code.

  As an aside, what should be the scope of the FDW interface?
  In my understanding, it allows an extension to implement something on
  behalf of a particular data structure declared with CREATE FOREIGN
  TABLE.
 
 That's where it is today, but certainly not our end goal.
 
  In other words, the extension's responsibility is to generate a view of
  something according to PostgreSQL's internal data structures, instead
  of the object itself.
 
 The result of the FDW call needs to be something which PG understands and
 can work with, otherwise we wouldn't be able to, say, run PL/pgsql code
 on the result, or pass it into some other aggregate which we decided was
 cheaper to run locally.  Being able to push down aggregates to the remote
 side of an FDW certainly fits in quite well with that.
 
Yes. According to the previous discussion around postgres_fdw getting
merged, all we can trust on the remote side are built-in data types,
functions, operators, and other built-in constructs.

  On the other hand, the custom-scan interface allows extensions to
  implement alternative methods to scan or join particular relations,
  but its role is not to serve as a target referenced in queries. In
  other words, it provides methods to access objects.
 
 The custom-scan interface still needs to produce something according to
 PG's internal data structures, so it's not clear to me where you're going
 with this.
 
The custom-scan node is intended to operate on regular relations, not
only foreign tables. That means a special feature (like GPU acceleration)
can work transparently for most existing applications. Usually, an
application defines regular tables for its work at installation time, 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
Hi Kaigai-san,

Sorry for leaving the thread for a while.

2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
 (1) Interface to add alternative paths in addition to built-in join paths

 This patch adds add_join_path_hook in add_paths_to_joinrel to allow
 extensions to provide alternative scan paths in addition to the built-in
 join paths. The custom-scan path being added is assumed to scan a
 (virtual) relation that is the result set of the joined relations.
 My concern is which arguments should be passed down. This hook is
 declared as follows:

 /* Hook for plugins to add custom join path, in addition to default ones */
 typedef void (*add_join_path_hook_type)(PlannerInfo *root,
 RelOptInfo *joinrel,
 RelOptInfo *outerrel,
 RelOptInfo *innerrel,
 JoinType jointype,
 SpecialJoinInfo *sjinfo,
 List *restrictlist,
 List *mergeclause_list,
 SemiAntiJoinFactors *semifactors,
 Relids param_source_rels,
 Relids extra_lateral_rels);
 extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

 Likely, the arguments above restrictlist should be passed to extensions,
 because these are also arguments of add_paths_to_joinrel().
 However, I'm not 100% certain how the other arguments should be handled.
 It probably makes sense to pass param_source_rels and extra_lateral_rels
 so extensions can check whether the path is sensible for parameterized
 paths.
 On the other hand, I doubt whether mergeclause_list is useful to deliver.
 (It may make sense if someone tries to provide their own merge-join
 implementation??)

 I'd like to see ideas for improving the current interface specification.

I've read the code path for adding a custom join again, and felt that
providing semifactors seems unnecessary for the first cut, because it is
used only in initial_cost_nestloop (final_cost_nestloop receives
semifactors but does not use them), and an external module would not
become that smart before the 9.5 development cycle.  It seems complex
enough to postpone determining whether it's essential for
add_join_path_hook.  Do you have any concrete use case for the parameter?

mergeclause_list and param_source_rels seem a little easier to use, but
in any case it should be documented how to use those parameters.

IMHO, a minimal interface seems better than a fully-fledged but half-baked
one, especially in the first cut.

-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
 Folks,

 Let me remind you of the custom-scan patches; they are a base feature for
 postgres_fdw's remote join, cache-only scan, the (upcoming) GPU
 acceleration feature, and various other alternative ways to scan/join
 relations. Unfortunately, we could have only a small amount of discussion
 in this commit fest, even though Hanada-san volunteered to move the
 patches into the 'ready for committer' state at the November CF.

I found some cosmetic flaws and a .gitignore oversight in the patches.
Please see the attached patch for details.

-- 
Shigeru HANADA





Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Kouhei Kaigai
  /* Hook for plugins to add custom join path, in addition to default
  ones */ typedef void (*add_join_path_hook_type)(PlannerInfo *root,
  RelOptInfo *joinrel,
  RelOptInfo *outerrel,
  RelOptInfo *innerrel,
  JoinType jointype,
  SpecialJoinInfo *sjinfo,
  List *restrictlist,
  List *mergeclause_list,
  SemiAntiJoinFactors *semifactors,
  Relids param_source_rels,
  Relids extra_lateral_rels);
  extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;
 
  The arguments above restrictlist should likely be passed to
  extensions, because these are also arguments of add_paths_to_joinrel().
  However, I'm not 100% certain whether the other arguments should be passed too.
  Probably, it makes sense to pass param_source_rels and
  extra_lateral_rels so extensions can check whether a path is sensible as
 a parameterized path.
  On the other hand, I doubt whether mergeclause_list is useful to deliver.
  (It may make sense if someone tries to implement their own merge-join
  implementation??)
 
  I'd like to see ideas to improve the current interface specification.
 
 I've read the code path to add custom joins again, and felt that providing
 semifactors seems unnecessary for the first cut, because it is used only
 in initial_cost_nestloop (final_cost_nestloop receives semifactors but
 does not use them), and an external module would not become that
 sophisticated before the 9.5 development cycle.  It seems complex enough
 to postpone determining whether it's essential for add_join_path_hook.
  Do you have any concrete use case for the parameter?
 
The reason why I asked the question above is that I haven't been 100% certain
about its usage. Indeed, semifactors has limited usage, but it is quite easy
for an extension to reproduce later (using clauselist_selectivity) if it
wants this factor, as sketched below. So, I agree with removing semifactors
here.
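
For instance, a rough sketch of such a reproduction might look like the
following (my_semi_join_factors and the joinquals argument are our own
names; the in-core compute_semi_anti_join_factors() in costsize.c is more
careful about deriving match_count):

#include "postgres.h"
#include "nodes/relation.h"
#include "optimizer/cost.h"

static void
my_semi_join_factors(PlannerInfo *root, List *joinquals,
                     SpecialJoinInfo *sjinfo,
                     SemiAntiJoinFactors *semifactors)
{
    /* fraction of outer rows expected to have at least one match */
    semifactors->outer_match_frac =
        clauselist_selectivity(root, joinquals, 0, JOIN_SEMI, sjinfo);

    /*
     * Placeholder; the in-core code derives match_count from the
     * JOIN_INNER selectivity and the inner relation's row count.
     */
    semifactors->match_count = 1.0;
}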

 mergeclause_list and param_source_rels seem a little easier to use, but
 either way it should be documented how to use those parameters.

The mergeclause_list might not be sufficient for extensions to determine
whether their own mergejoin is applicable here. See the comment below, from
the head of select_mergejoin_clauses:

|  * *mergejoin_allowed is normally set to TRUE, but it is set to FALSE if
|  * this is a right/full join and there are nonmergejoinable join clauses.
|  * The executor's mergejoin machinery cannot handle such cases, so we have
|  * to avoid generating a mergejoin plan.  (Note that this flag does NOT
|  * consider whether there are actually any mergejoinable clauses.  This is
|  * correct because in some cases we need to build a clauseless mergejoin.
|  * Simply returning NIL is therefore not enough to distinguish safe from
|  * unsafe cases.)
| 
It says that mergejoin_clauses == NIL is not a sufficient check to determine
whether the mergejoin logic is applicable to the target join.
So, either of the following is probably an option for an extension that tries
to implement its own mergejoin logic: (1) putting both mergejoin_allowed and
mergeclause_list as arguments of the hook, or (2) redefining
select_mergejoin_clauses() as an extern function to reproduce the variables
on demand; see the sketch below. Which one is preferable?
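
Concretely, the two options might look like this sketch (option (1) simply
appends the flag to the hook arguments; option (2) exposes the function with
the signature it already has in joinpath.c):

/* Option (1): also pass the flag computed by the planner */
typedef void (*add_join_path_hook_type) (PlannerInfo *root,
                                         RelOptInfo *joinrel,
                                         RelOptInfo *outerrel,
                                         RelOptInfo *innerrel,
                                         JoinType jointype,
                                         SpecialJoinInfo *sjinfo,
                                         List *restrictlist,
                                         List *mergeclause_list,
                                         bool mergejoin_allowed,
                                         Relids param_source_rels,
                                         Relids extra_lateral_rels);

/* Option (2): export the currently-static function, so extensions can
 * reproduce mergeclause_list and mergejoin_allowed on demand */
extern List *select_mergejoin_clauses(PlannerInfo *root,
                                      RelOptInfo *joinrel,
                                      RelOptInfo *outerrel,
                                      RelOptInfo *innerrel,
                                      List *restrictlist,
                                      JoinType jointype,
                                      bool *mergejoin_allowed);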

BTW, I found duplicate definitions of clause_sides_match_join(), which is
invoked by select_mergejoin_clauses(), in joinpath.c and analyzejoins.c.
One of them should be eliminated, I think.

For param_source_rels and extra_lateral_rels, I'll put source code comments
around add_join_path_hook.
The earlier half of try_(nestloop|hashjoin|mergejoin)_path is probably useful
as an example for extension implementations, e.g.:
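
The parameterization check an extension would copy from try_nestloop_path()
is roughly this (the surrounding function is our own sketch;
calc_nestloop_required_outer() and the bms_* helpers are existing backend
routines):

#include "postgres.h"
#include "nodes/bitmapset.h"
#include "nodes/relation.h"
#include "optimizer/pathnode.h"

static void
my_try_custom_join_path(PlannerInfo *root, RelOptInfo *joinrel,
                        Path *outer_path, Path *inner_path,
                        Relids param_source_rels)
{
    Relids      required_outer;

    /*
     * Same early test as try_nestloop_path(): reject parameterized
     * paths that could not be used anywhere in this query level.
     */
    required_outer = calc_nestloop_required_outer(outer_path, inner_path);
    if (required_outer &&
        !bms_overlap(required_outer, param_source_rels))
    {
        bms_free(required_outer);
        return;
    }

    /* ... cost the custom join path and add_path() it here ... */
}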

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]
 Sent: Tuesday, February 25, 2014 12:41 AM
 To: Kohei KaiGai
 Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Jim Mlodgenski; Robert Haas;
 Tom Lane; PgHacker; Peter Eisentraut
 Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 Hi Kaigai-san,
 
 Sorry to leave the thread for a while.
 
 2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
  (1) Interface to add alternative paths in addition to built-in join
  paths
 
  This patch adds add_join_path_hook in add_paths_to_joinrel to allow
  extensions to provide alternative scan paths in addition to the
  built-in join paths. A custom-scan path being added is assumed to
  perform a scan on a (virtual) relation that is the result set of joining
 relations.
  My concern is which arguments should be passed. This hook is declared as follows:
 
  /* Hook

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-29 Thread Christian Convey
On Mon, Jan 27, 2014 at 7:14 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 FDW's join pushing down is one of the valuable use-cases of this interface,
 but not all. As you might know, my motivation is to implement GPU
 acceleration
 feature on top of this interface, that offers alternative way to scan or
 join
 relations or potentially sort or aggregate.


I'm curious how this relates to the pluggable storage idea discussed here
https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting and here
http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html


I haven't seen a specific proposal about how much functionality should be
encapsulated by a pluggable storage system.  But I wonder if that would be
the best place for specialized table-scan code to end up?


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-29 Thread Kohei KaiGai
2014-01-29 Christian Convey christian.con...@gmail.com:

 On Mon, Jan 27, 2014 at 7:14 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 FDW's join pushing down is one of the valuable use-cases of this
 interface,
 but not all. As you might know, my motivation is to implement GPU
 acceleration
 feature on top of this interface, that offers alternative way to scan or
 join
 relations or potentially sort or aggregate.


 I'm curious how this relates to the pluggable storage idea discussed here
 https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting and here
 http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html

 I haven't seen a specific proposal about how much functionality should be
 encapsulated by a pluggable storage system.  But I wonder if that would be
 the best place for specialized table-scan code to end up?

If you are interested in designing your own storage layer (which thus needs
its own scan/writer implementation), FDW is the option currently available.
It defines a set of interfaces that allow extensions to generate things to be
there on the fly. It does not force you to act as a client of a remote
database, even though it was originally designed for the dblink purpose.
In other words, FDW is a feature to translate a particular data source into
something visible according to the table definition. As long as the driver can
mediate between the table definition and the data format of your own storage
layer, it shall work.
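
A driver for such a storage layer has roughly this shape (9.3-era FDW API;
the my* callback names are ours, implemented elsewhere in the module):

#include "postgres.h"
#include "fmgr.h"
#include "foreign/fdwapi.h"

PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(my_storage_handler);

extern void myGetForeignRelSize(PlannerInfo *root, RelOptInfo *baserel,
                                Oid foreigntableid);
extern void myGetForeignPaths(PlannerInfo *root, RelOptInfo *baserel,
                              Oid foreigntableid);
extern ForeignScan *myGetForeignPlan(PlannerInfo *root, RelOptInfo *baserel,
                                     Oid foreigntableid,
                                     ForeignPath *best_path,
                                     List *tlist, List *scan_clauses);
extern void myBeginForeignScan(ForeignScanState *node, int eflags);
extern TupleTableSlot *myIterateForeignScan(ForeignScanState *node);
extern void myReScanForeignScan(ForeignScanState *node);
extern void myEndForeignScan(ForeignScanState *node);

Datum
my_storage_handler(PG_FUNCTION_ARGS)
{
    FdwRoutine *routine = makeNode(FdwRoutine);

    /* planner side: estimate size, then offer paths and a plan */
    routine->GetForeignRelSize = myGetForeignRelSize;
    routine->GetForeignPaths = myGetForeignPaths;
    routine->GetForeignPlan = myGetForeignPlan;
    /* executor side: produce tuples that match the table definition */
    routine->BeginForeignScan = myBeginForeignScan;
    routine->IterateForeignScan = myIterateForeignScan;
    routine->ReScanForeignScan = myReScanForeignScan;
    routine->EndForeignScan = myEndForeignScan;

    PG_RETURN_POINTER(routine);
}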

On the other hand, the custom-scan interface basically allows extensions to
implement alternative ways to access the data. If we have a wiser way to
scan or join relations than the built-in logic (yep, it will be wiser to scan
the result set of a remote join than to run a local join on a couple of
remote scan results), this interface tells the backend that such a strategy
exists, and the planner will then choose one of them, either built-in or
additional, according to the cost.

Let's get back to your question. This interface is, right now, not designed
to implement a pluggable storage layer. FDW is an option now, and maybe
a development item in the v9.5 cycle if we want regular tables to be pluggable.
Because I'm motivated to implement my GPU acceleration feature
on regular relations, I put my higher priority on the interface that
allows extensions to suggest how to scan them.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-27 Thread Kouhei Kaigai
Hackers,

Is somebody available to volunteer to review the custom-scan patch?

Even though Hanada-san acknowledged it before, it seems to me this patch
has potentially arguable implementations. Even if you don't have enough time
to review the whole of the code, it helps me if you can comment on the
following topics.


(1) Interface to add alternative paths in addition to built-in join paths

This patch adds add_join_path_hook in add_paths_to_joinrel to allow
extensions to provide alternative scan paths in addition to the built-in
join paths. A custom-scan path being added is assumed to perform a scan
on a (virtual) relation that is the result set of joining relations.
My concern is which arguments should be passed. This hook is declared as follows:

/* Hook for plugins to add custom join path, in addition to default ones */
typedef void (*add_join_path_hook_type)(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
SpecialJoinInfo *sjinfo,
List *restrictlist,
List *mergeclause_list,
SemiAntiJoinFactors *semifactors,
Relids param_source_rels,
Relids extra_lateral_rels);
extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

The arguments above restrictlist should likely be passed to extensions,
because these are also arguments of add_paths_to_joinrel().
However, I'm not 100% certain whether the other arguments should be passed too.
Probably, it makes sense to pass param_source_rels and extra_lateral_rels
so extensions can check whether a path is sensible as a parameterized path.
On the other hand, I doubt whether mergeclause_list is useful to deliver.
(It may make sense if someone tries to implement their own merge-join
implementation??)

I'd like to see ideas to improve the current interface specification.
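
For reference, a minimal sketch of an extension latching onto this hook;
only the hook declaration comes from the patch, while
my_build_custom_join_path() and the chaining idiom are illustrative
assumptions:

static add_join_path_hook_type prev_add_join_path_hook = NULL;

/* hypothetical constructor for the extension's CustomPath */
extern Path *my_build_custom_join_path(PlannerInfo *root,
                                       RelOptInfo *joinrel,
                                       RelOptInfo *outerrel,
                                       RelOptInfo *innerrel,
                                       List *restrictlist);

static void
my_add_join_path(PlannerInfo *root, RelOptInfo *joinrel,
                 RelOptInfo *outerrel, RelOptInfo *innerrel,
                 JoinType jointype, SpecialJoinInfo *sjinfo,
                 List *restrictlist, List *mergeclause_list,
                 SemiAntiJoinFactors *semifactors,
                 Relids param_source_rels, Relids extra_lateral_rels)
{
    /* keep any previously installed provider working */
    if (prev_add_join_path_hook)
        prev_add_join_path_hook(root, joinrel, outerrel, innerrel,
                                jointype, sjinfo, restrictlist,
                                mergeclause_list, semifactors,
                                param_source_rels, extra_lateral_rels);

    /* this sketch only offers a path for plain inner joins */
    if (jointype != JOIN_INNER)
        return;

    /* offer the path; the planner keeps it only if its cost wins */
    add_path(joinrel, my_build_custom_join_path(root, joinrel,
                                                outerrel, innerrel,
                                                restrictlist));
}

void
_PG_init(void)
{
    prev_add_join_path_hook = add_join_path_hook;
    add_join_path_hook = my_add_join_path;
}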


(2) CUSTOM_VAR for special Var reference

@@ -134,6 +134,7 @@ typedef struct Expr
 #define INNER_VAR   65000   /* reference to inner subplan */
 #define OUTER_VAR   65001   /* reference to outer subplan */
 #define INDEX_VAR   65002   /* reference to index column */
+#define CUSTOM_VAR  65003   /* reference to custom column */

I newly added CUSTOM_VAR to handle the case when a custom scan overrides
join relations.
Var nodes within a join plan are adjusted to refer to either ecxt_innertuple
or ecxt_outertuple of ExprContext. That causes trouble if a custom scan runs
instead of the built-in joins, because the tuples it fetches are usually
stored in ecxt_scantuple, so Var nodes also need the right varno, which is
neither inner nor outer.

The SetPlanRefCustomScan callback, invoked from set_plan_refs, allows
extensions to rewrite Var nodes within the custom-scan node to indicate
ecxt_scantuple using CUSTOM_VAR, instead of inner or outer.
For example, a Var node with varno=CUSTOM_VAR and varattno=3 means
the node references the third attribute of the tuple in ecxt_scantuple.
I think it is a reasonable solution, but I'm not 100% certain
whether people have a more graceful idea to implement it.
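
A sketch of the rewrite itself, using the ordinary expression mutator
(the callback wiring around it is omitted; the assumption is that
varattno has already been remapped to the column's position in the
custom scan's result tuple):

#include "postgres.h"
#include "nodes/nodeFuncs.h"
#include "nodes/primnodes.h"

static Node *
fixup_vars_mutator(Node *node, void *context)
{
    if (node == NULL)
        return NULL;
    if (IsA(node, Var))
    {
        Var *newvar = (Var *) copyObject(node);

        /* make the executor look the column up in ecxt_scantuple */
        newvar->varno = CUSTOM_VAR;
        return (Node *) newvar;
    }
    return expression_tree_mutator(node, fixup_vars_mutator, context);
}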

If you have comments on the above two topics, or others, please give
your ideas.

Thanks,

 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kohei KaiGai
 Sent: Tuesday, January 14, 2014 11:20 PM
 To: Shigeru Hanada
 Cc: Kaigai, Kouhei(海外, 浩平); Jim Mlodgenski; Robert Haas; Tom Lane;
 PgHacker; Peter Eisentraut
 Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 Hello,
 
 The attached patches are the ones rebased onto the latest git tree, with no
 functional changes from the previous revision in the commit fest of November.
 Hanada-san volunteered to review the series of patches, including the
 portion for postgres_fdw, and then marked it as ready for committer in the
 last commit fest.
 So, I hope a committer will also volunteer to review the patches for
 final checking.
 
 * Part-1 - CustomScan APIs
 This patch provides a set of interfaces for extensions to interact with the
 query optimizer and executor. The new add_scan_path_hook or add_join_path_hook
 allows extensions to offer alternative ways to scan a particular relation or
 to join particular relations.
 Then, once an alternative way is chosen by the optimizer, the associated
 callbacks are kicked from the executor. In this case, the extension has the
 responsibility to return a slot that holds a tuple (or an empty slot for end
 of scan) being scanned from the underlying relation, roughly as below.
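 
 A minimal sketch of that per-tuple contract (the callback name, the
 CustomScanState layout, and my_fetch_next_tuple() are assumptions):
 
 extern HeapTuple my_fetch_next_tuple(CustomScanState *node);
 
 static TupleTableSlot *
 my_exec_custom_scan(CustomScanState *node)
 {
     TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
     HeapTuple   tuple = my_fetch_next_tuple(node);
 
     if (tuple == NULL)
         return ExecClearTuple(slot);    /* empty slot ends the scan */
 
     /* InvalidBuffer: tuple is not in a shared buffer; false: caller
      * keeps ownership of the palloc'd tuple */
     return ExecStoreTuple(tuple, slot, InvalidBuffer, false);
 }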
 
 * Part-2 - contrib/ctidscan
 This patch provides a simple example implementation of the CustomScan API.
 It enables skipping pages when inequality operators are given on the ctid
 system
Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-27 Thread Stephen Frost
KaiGai Kohei,

* Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 Is somebody available to volunteer to review the custom-scan patch?

I looked through it a bit and my first take away from it was that the
patches to actually use the new hooks were also making more changes to
the backend code, leaving me with the impression that the proposed
interface isn't terribly stable.  Perhaps those changes should have just
been in the first patch, but they weren't and that certainly gave me
pause.

I'm also not entirely convinced that this is the direction to go in when
it comes to pushing down joins to FDWs.  While that's certainly a goal
that I think we all share, this seems to be intending to add a
completely different feature which happens to be able to be used for
that.  For FDWs, wouldn't we only present the FDW with the paths where
the foreign tables for that FDW, or perhaps just a given foreign server,
are being joined?

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-27 Thread Kouhei Kaigai
Hi Stephen,

Thanks for your comments.

 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
  Is somebody available to volunteer to review the custom-scan patch?
 
 I looked through it a bit and my first take away from it was that the patches
 to actually use the new hooks were also making more changes to the backend
 code, leaving me with the impression that the proposed interface isn't
 terribly stable.  Perhaps those changes should have just been in the first
 patch, but they weren't and that certainly gave me pause.
 
Yes, the part-1 patch provides the interface portion that interacts
between the backend code and extension code. The remaining part-2 and part-3
portions are contrib modules that implement their features on top of the
custom-scan API.

 I'm also not entirely convinced that this is the direction to go in when
 it comes to pushing down joins to FDWs.  While that's certainly a goal that
 I think we all share, this seems to be intending to add a completely different
 feature which happens to be able to be used for that.  For FDWs, wouldn't
 we only present the FDW with the paths where the foreign tables for that
 FDW, or perhaps just a given foreign server, are being joined?

FDW's join push-down is one of the valuable use cases of this interface,
but not the only one. As you might know, my motivation is to implement a GPU
acceleration feature on top of this interface, which offers alternative ways
to scan or join relations, or potentially to sort or aggregate.
It would probably be too much of a stretch to implement radix sort on top of
FDW. I'd like you to understand that the part-3 patch (FDW's join push-down)
is a demonstration of the custom-scan interface for an application, not
something designed for a single special purpose.

Right now, I put all the logic to interface between CSI and the FDW driver on
the postgres_fdw side; it might be an idea to have common code (like the
logic to check whether both relations to be joined belong to the same foreign
server) on the backend side as something like a gateway for them.


As an aside, what should be the scope of the FDW interface?
In my understanding, it allows an extension to implement something on behalf
of a particular data structure declared with CREATE FOREIGN TABLE.
In other words, the extension's responsibility is to generate a view of
something according to PostgreSQL's internal data structure, instead of the
object itself.
On the other hand, the custom-scan interface allows extensions to implement
alternative methods to scan or join particular relations, but it does not
have the role of a target being referenced in queries. In other words, it
provides methods to access objects.
It is natural that both features are similar, because both of them let
extensions hook the planner and executor; however, their purposes differ.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-16 Thread Shigeru Hanada
KaiGai-san,

2013/12/16 KaiGai Kohei kai...@ak.jp.nec.com:
 (2013/12/16 14:15), Shigeru Hanada wrote:
 (1) ctidscan
 Is session_preload_libraries available to enable the feature, like
 shared_*** and local_***?  According to my trial it works fine like
 two similar GUCs.

 It shall be available; nothing is different from the two parameters that
 we have supported for a long time. Sorry, I missed mentioning the new
 feature.

Check.

 (2) postgres_fdw
 JOIN push-down is a killer application of the Custom Scan Provider
 feature, so I think it's good to mention it in the Remote Query
 Optimization section.

 I added an explanation about remote join execution to that section.
 Probably, it helps users understand why a Custom Scan node appears here
 instead of a Join node. Thanks for your suggestion.

Check.

I think that these patches have been considered enough to be marked Ready
for Committer.

Regards,
-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-15 Thread Shigeru Hanada
Hi Kaigai-san,

2013/12/11 Kohei KaiGai kai...@kaigai.gr.jp:
 2013/12/10 Shigeru Hanada shigeru.han...@gmail.com:
 The patches could be applied cleanly, but I saw a compiler warning
 about get_rel_relkind() in foreign.c; it's a minor issue.  Please
 just add an #include of utils/lsyscache.h there.

 Fixed,

Check.

 I have some more random comments about EXPLAIN.

 1) You use Operation as the label of Custom Scan nodes in non-text
 formats, but it seems to me to be rather a provider name.  What is the
 string shown there?

 I tried free-riding on the existing properties, but indeed it does not make
 sense, as you pointed out.
 I adjusted explain.c to show a Custom-Provider property for the Custom-
 Scan node, as follows.

The new name seems better; it is what the node expresses.

 2) It would be nice if we could see information about what the
 Custom Scan node replaced in the EXPLAIN output (even if only in verbose
 mode).  I know that we can't show the plan tree below custom scan nodes,
 because the CS Provider can't obtain other candidates.  But even just the
 relation names used in the join or the scan would help users to
 understand what is going on in a Custom Scan.

 Even though I agree that it would help users understand the plan,
 it is also a headache to implement, because the CustomScan node
 (and its superclass) does not have information about which relations
 are underlying it. Probably, this functionality needs to show
 the underlying relations in ExplainTargetRel() if the CustomScan node
 represents a scan instead of a join. What data source can produce
 the list of underlying relations here?
 So, if it is not a significant restriction for users, I'd like to work on
 this feature later.

Agreed.  It would be enough that Custom Scan Providers can add
arbitrary information, such as the Remote SQL of postgres_fdw, to
EXPLAIN results via a core API.  Some kind of framework that helps
authors of Custom Scan Providers would be nice, but it should be
considered after the first cut.

 The attached patch fixes up a minor warning around get_rel_relkind
 and name of the property for custom-provider. Please check it.

The patch can be applied onto the 2013-12-16 HEAD cleanly, and gives no
unexpected errors/warnings.

I'm sorry to post separately, but I have some comments on document.

(1) ctidscan
Is session_preload_libraries available to enable the feature, like
shared_*** and local_***?  According to my trial it works fine like
two similar GUCs.

(2) postgres_fdw
JOIN push-down is a killer application of the Custom Scan Provider
feature, so I think it's good to mention it in the Remote Query
Optimization section.

The code for core and contrib seems fine, so I'll mark the patches Ready
for Committer after the document enhancement.

Regards,
-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-10 Thread Shigeru Hanada
Hi KaiGai-san,

2013/12/8 Kohei KaiGai kai...@kaigai.gr.jp:
 The attached patches include the documentation fixup by Hanada-san,
 and the relocation of is_managed_relation (the portion that checks whether
 the relation is a foreign table managed by a particular FDW) and
 has_wholerow_reference.
 I didn't touch the EXPLAIN logic because I'm uncertain whether the
 cost of a remote join is reasonable as the cost of an alternative
 path to local joins.

 Please check it. Thanks,

The patches could be applied cleanly, but I saw a compiler warning
about get_rel_relkind() in foreign.c; it's a minor issue.  Please
just add an #include of utils/lsyscache.h there.

I have some more random comments about EXPLAIN.

1) You use Operation as the label of Custom Scan nodes in non-text
formats, but it seems to me to be rather a provider name.  What is the
string shown there?

2) It would be nice if we could see information about what the
Custom Scan node replaced in the EXPLAIN output (even if only in verbose
mode).  I know that we can't show the plan tree below custom scan nodes,
because the CS Provider can't obtain other candidates.  But even just the
relation names used in the join or the scan would help users to
understand what is going on in a Custom Scan.

Regards,
-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-04 Thread Kohei KaiGai
Hanada-san,

Thanks for your reviewing,

2013/12/4 Shigeru Hanada shigeru.han...@gmail.com:
 I first reviewed the postgres_fdw portion of the patches to learn the
 outline of Custom Plan.  The wiki page is also a good textbook for the
 feature.  I have some random comments about the basic design of Custom
 Plan:

 (1) IIUC add_join_path and add_scan_path are added to allow extensions
 to plug their code into the planner.

Almost yes. To be more precise, these hooks allow extensions to
plug the paths they can provide into a particular join or scan. Then the
planner will choose the cheapest one according to the cost value.

 (2) The FDW framework has executor callbacks based on existing executor
 nodes.  Is there any plan to integrate them into one framework, or to wrap
 one with another?  I'm not sure that we should have two similar frameworks
 side by side.
 # I'm sorry if I've missed the past discussion about this issue.

Probably, FDW has a different role from the CustomScan API.
As the name says, FDW performs as a bridge between a relation form and
an opaque external data source, intermediating between two different worlds
on behalf of a foreign table.
On the other hand, CustomScan allows providing alternative logic
to scan or join particular relations, in addition to the built-in ones,
but does not perform on behalf of foreign tables.

The existing FDW is designed to implement a scan on an intangible
relation, so it can assume some things: a tuple returned
from the FDW has a TupleDesc equivalent to the table definition, and it can
always use ExecScan() for all cases.
So, I don't think these two frameworks should be consolidated,
because that would confuse existing extensions that
assume FDW callbacks always have a particular foreign table
definition.
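
For reference, this is roughly the whole of ExecForeignScan() in
nodeForeignscan.c; the TupleDesc guarantee is what lets it reuse the
generic scan loop unconditionally:

static TupleTableSlot *
ExecForeignScan(ForeignScanState *node)
{
    /*
     * ExecScan() handles qual checking and projection; ForeignNext
     * just invokes the FDW's IterateForeignScan callback.
     */
    return ExecScan((ScanState *) node,
                    (ExecScanAccessMtd) ForeignNext,
                    (ExecScanRecheckMtd) ForeignRecheck);
}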

 (3) Internal routines such as is_self_managed_relation and
 has_wholerow_reference seem to be useful for other FDWs.  Is it possible
 to move them into core?

Probably, src/backend/foreign/foreign.c is a good host for them.

 (4) postgres_fdw estimates the costs of a join by calculating local numbers.
  How about supporting remote estimation by throwing an EXPLAIN query
 when use_remote_estimates = true?

I'm uncertain whether the cost value from a remote EXPLAIN represents the
right difficulty on the local side, because it indeed represents the
difficulty of joining two relations on the remote side, but it does not
represent the local job, which just fetches tuples from the result set of
the remote query with table joining.
What is your opinion? Is the remote cost estimation value comparable
with the local value?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-04 Thread Kohei KaiGai
Thanks for fixing many of my careless mistakes.
I didn't know seek was an irregular verb...

Best regards,

2013/12/4 Shigeru Hanada shigeru.han...@gmail.com:
 2013/11/29 Kohei KaiGai kai...@kaigai.gr.jp:
 I merged all the propositions from Jim. Thanks, it made the documentation
 quality better. Also, I fixed up cosmetic stuff around whitespace vs. tabs.

 I found some typos in documents and comments.  Please see attached
 patch for detail.

 --
 Shigeru HANADA



-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-03 Thread Shigeru Hanada
Hi KaiGai-san,


2013/11/29 Kohei KaiGai kai...@kaigai.gr.jp:
 The attached ones are the revised patches.

 I merged all the propositions from Jim. Thanks, it made the documentation
 quality better. Also, I fixed up cosmetic stuff around whitespace vs. tabs.

 The actual code changes follow the changes in FunctionScan, for when
 CustomScan replaces a FunctionScan. It now puts a List * object instead of
 a single expression tree, to allow multiple functions.

 Nothing were changed from the previous version.

I first reviewed the postgres_fdw portion of the patches to learn the
outline of Custom Plan.  The wiki page is also a good textbook for the
feature.  I have some random comments about the basic design of Custom
Plan:

(1) IIUC add_join_path and add_scan_path are added to allow extensions
to plug their code into the planner.

(2) The FDW framework has executor callbacks based on existing executor
nodes.  Is there any plan to integrate them into one framework, or to wrap
one with another?  I'm not sure that we should have two similar frameworks
side by side.
# I'm sorry if I've missed the past discussion about this issue.

(3) Internal routines such as is_self_managed_relation and
has_wholerow_reference seem to be useful for other FDWs.  Is it possible
to move them into core?

(4) postgres_fdw estimates the costs of a join by calculating local numbers.
 How about supporting remote estimation by throwing an EXPLAIN query
when use_remote_estimates = true?

-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-12-03 Thread Shigeru Hanada
2013/11/29 Kohei KaiGai kai...@kaigai.gr.jp:
 I merged all the propositions from Jim. Thanks, it made the documentation
 quality better. Also, I fixed up cosmetic stuff around whitespace vs. tabs.

I found some typos in documents and comments.  Please see attached
patch for detail.

-- 
Shigeru HANADA


fix_typo.patch
Description: Binary data



Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-11-26 Thread Peter Eisentraut
contrib/ctidscan/ctidscan.c:44: indent with spaces.
contrib/ctidscan/ctidscan.c:250: indent with spaces.
contrib/ctidscan/ctidscan.c:266: trailing whitespace.
contrib/postgres_fdw/deparse.c:1044: indent with spaces.
contrib/postgres_fdw/postgres_fdw.c:940: indent with spaces.
src/backend/commands/explain.c:2097: indent with spaces.
src/backend/optimizer/plan/createplan.c:2097: trailing whitespace.






Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-11-26 Thread Shigeru Hanada
2013/11/19 Kohei KaiGai kai...@kaigai.gr.jp:
 OK, I split them off. The part-1 is custom-scan API itself, the part-2 is
 ctidscan portion, and the part-3 is remote join on postgres_fdw.

These three patches can be applied with no conflicts onto the 2013-11-27
HEAD, but some fixes are necessary to build, because commit
784e762e886e6f72f548da86a27cd2ead87dbd1c (committed on 2013-11-21)
allows a FunctionScan node to have multiple function expressions, so Node
*funcexpr in CustomScan should be List *functions now.

I'll continue the review by applying the patches onto the 2013-11-19 HEAD.

-- 
Shigeru HANADA




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-11-22 Thread Jim Mlodgenski
KaiGai


On Tue, Nov 19, 2013 at 9:41 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 Thanks for your review.

 2013/11/19 Jim Mlodgenski jimm...@gmail.com:
  My initial review on this feature:
  - The patches apply and build, but it produces a warning:
  ctidscan.c: In function ‘CTidInitCustomScanPlan’:
  ctidscan.c:362:9: warning: unused variable ‘scan_relid’
 [-Wunused-variable]
 
 This variable was only used in an Assert() macro, so it causes a warning
 if you don't pass --enable-cassert to the configure script.
 Anyway, I adjusted the code to check the relid of RelOptInfo directly.



The warning is now gone.


  I'd recommend that you split the part1 patch containing the ctidscan
 contrib
  into its own patch. It is more than half of the patch and it certainly
  stands on its own. IMO, I think ctidscan fits a very specific use case
 and
  would be better off being an extension instead of in contrib.
 
 OK, I split them off. The part-1 is custom-scan API itself, the part-2 is
 ctidscan portion, and the part-3 is remote join on postgres_fdw.


Attached is a patch for the documentation. I think the documentation still
needs a little more work, but it is pretty close. I can add some more
detail to it once I finish adapting the hadoop_fdw to use the custom scan
API and have a better understanding of all of the calls.


 Thanks,
 --
 KaiGai Kohei kai...@kaigai.gr.jp

*** a/doc/src/sgml/custom-scan.sgml	2013-11-18 17:50:02.652039003 -0500
--- b/doc/src/sgml/custom-scan.sgml	2013-11-22 09:09:13.624254649 -0500
***
*** 8,47 
secondaryhandler for/secondary
   /indexterm
   para
!   Custom-scan API enables extension to provide alternative ways to scan or
!   join relations, being fully integrated with cost based optimizer,
!   in addition to the built-in implementation.
!   It consists of a set of callbacks, with a unique name, to be invoked during
!   query planning and execution. Custom-scan provider should implement these
!   callback functions according to the expectation of API.
   /para
   para
!   Overall, here is four major jobs that custom-scan provider should implement.
!   The first one is registration of custom-scan provider itself. Usually, it
!   shall be done once at literal_PG_init()/literal entrypoint on module
!   loading.
!   The other three jobs shall be done for each query planning and execution.
!   The second one is submission of candidate paths to scan or join relations,
!   with an adequate cost, for the core planner.
!   Then, planner shall chooses a cheapest path from all the candidates.
!   If custom path survived, the planner kicks the third job; construction of
!   literalCustomScan/literal plan node, being located within query plan
!   tree instead of the built-in plan node.
!   The last one is execution of its implementation in answer to invocations
!   by the core executor.
   /para
   para
!   Some of contrib module utilize the custom-scan API. It may be able to
!   provide a good example for new development.
variablelist
 varlistentry
  termxref linkend=ctidscan/term
  listitem
   para
!   Its logic enables to skip earlier pages or terminate scan prior to
!   end of the relation, if inequality operator on literalctid/literal
!   system column can narrow down the scope to be scanned, instead of
!   the sequential scan that reads a relation from the head to the end.
   /para
  /listitem
 /varlistentry
--- 8,46 
secondaryhandler for/secondary
   /indexterm
   para
!   The custom-scan API enables an extension to provide alternative ways to scan
!   or join relations leveraging the cost based optimizer. The API consists of a
!   set of callbacks, with unique names, to be invoked during query planning 
!   and execution. A custom-scan provider should implement these callback 
!   functions according to the expectation of the API.
   /para
   para
!   Overall, there are four major tasks that a custom-scan provider should 
!   implement. The first task is the registration of custom-scan provider itself.
!   Usually, this needs to be done once at the literal_PG_init()/literal 
!   entrypoint when the module is loading. The remaining three tasks are all done
!   when a query is planning and executing. The second task is the submission of
!   candidate paths to either scan or join relations with an adequate cost for
!   the core planner. Then, the planner will choose the cheapest path from all of
!   the candidates. If the custom path survived, the planner starts the third 
!   task; construction of a literalCustomScan/literal plan node, located
!   within the query plan tree instead of the built-in plan node. The last task
!   is the execution of its implementation in answer to invocations by the core
!   executor.
   /para
   para
!   Some of contrib modules utilize the custom-scan API. They may provide a good
!   example for new development.
variablelist
 varlistentry
  termxref linkend=ctidscan/term
 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-11-18 Thread Jim Mlodgenski
On Mon, Nov 18, 2013 at 7:25 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 The attached patches are the revised custom-scan APIs.


My initial review on this feature:
- The patches apply and build, but it produces a warning:
ctidscan.c: In function ‘CTidInitCustomScanPlan’:
ctidscan.c:362:9: warning: unused variable ‘scan_relid’ [-Wunused-variable]

I'd recommend that you split the part1 patch containing the ctidscan
contrib into its own patch. It is more than half of the patch and its
certainly stands on its own. IMO, I think ctidscan fits a very specific use
case and would be better off being an extension instead of in contrib.




  - custom-scan.sgml was added to introduce the way to write a custom-scan
    provider in the official documentation.
  - Much code duplication in postgres_fdw.c was eliminated. I split some fdw-
    handlers into two parts: a common portion and an fdw-specific one.
    The executor callbacks of the custom-scan code utilize the common portion
    above, because most of their implementations are equivalent.

  I'd like to see comments regarding the way to handle Var references onto
  a custom-scan that replaced a relations join.
  The varno of a Var that references a join relation is the rtindex of
  either the right or left relation; setrefs.c adjusts it well, setting
  INNER_VAR or OUTER_VAR instead.
  However, it does not work well if a custom-scan that just references the
  result of a remote join query was chosen instead of a local join, because
  its result is usually set in the ps_ResultTupleSlot of PlanState, so
  ExecEvalScalarVar references neither the inner nor the outer slot.
  Instead of the existing solution, I added one more special varno,
  CUSTOM_VARNO, that just references the result-tuple-slot of the target
  relation.
  If CUSTOM_VARNO is given, EXPLAIN(verbose) generates column names from
  the TupleDesc of the underlying ps_ResultTupleSlot.
  I'm not 100% certain whether it is the best approach for us, but it works
  well.

  Also, I'm uncertain about the usage of param_info in the Path structure,
  even though I followed the manner of other portions. So, please point out
  if my usage is not appropriate.

 Thanks,

 2013/11/11 Kohei KaiGai kai...@kaigai.gr.jp:
  Hi,
 
  I tried to write up a wikipage to introduce how custom-scan works.
 
  https://wiki.postgresql.org/wiki/CustomScanAPI
 
  Any comments please.
 
  2013/11/6 Kohei KaiGai kai...@kaigai.gr.jp:
  The attached patches provide a feature to implement a custom scan node
  that allows an extension to replace a part of the plan tree with its own
  code instead of the built-in logic.
  In addition to the previous proposition, it enables us to integrate
  custom scans as part of the candidate paths to be chosen by the optimizer.
  Here are two patches. The first one (pgsql-v9.4-custom-scan-apis) offers
  a set of API stuff and a simple demonstration module that implements a
  regular table scan using an inequality operator on the ctid system column.
  The second one (pgsql-v9.4-custom-scan-remote-join) enhances
  postgres_fdw to support remote join capability.
 
  Below is an example to show how custom-scan works.
 
  We usually run a sequential scan even if the clause has an inequality
  operator that references the ctid system column.
 
  postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid;
                          QUERY PLAN
  ----------------------------------------------------------
   Seq Scan on t1  (cost=0.00..209.00 rows= width=43)
     Filter: (ctid < '(10,0)'::tid)
  (2 rows)
 
  An extension that acts as a custom-scan provider suggests
  an alternative path, and if its cost is less than the sequential scan,
  the optimizer chooses it.
 
  postgres=# LOAD 'ctidscan';
  LOAD
  postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid;
                             QUERY PLAN
  --------------------------------------------------------------------
   Custom Scan (ctidscan) on t1  (cost=0.00..100.00 rows= width=43)
     Filter: (ctid < '(10,0)'::tid)
  (2 rows)
 
  Of course, a more cost-effective plan will win if one exists.
 
  postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid AND
  a = 200;
                              QUERY PLAN
  -------------------------------------------------------------------
   Index Scan using t1_pkey on t1  (cost=0.29..8.30 rows=1 width=43)
     Index Cond: (a = 200)
     Filter: (ctid < '(10,0)'::tid)
  (3 rows)
 
  One other worthwhile example is the remote-join enhancement on
  postgres_fdw as follows. Both ft1 and ft2 are foreign tables
  managed by the same foreign server.
 
  postgres=# EXPLAIN (verbose) SELECT * FROM ft1 JOIN ft2 ON a = x
               WHERE f_leak(b) AND y like '%aaa%';
                              QUERY PLAN
  --------------------------------------------------------------------
   Custom Scan (postgres-fdw)  (cost=100.00..100.01 rows=0 width=72)
     Output: a, b, x, y
     Filter: 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2013-11-11 Thread Kohei KaiGai
Hi,

I tried to write up a wikipage to introduce how custom-scan works.

https://wiki.postgresql.org/wiki/CustomScanAPI

Any comments please.

2013/11/6 Kohei KaiGai kai...@kaigai.gr.jp:
 The attached patches provide a feature to implement custom scan node
 that allows extension to replace a part of plan tree with its own code
 instead of the built-in logic.
 In addition to the previous proposition, it enables us to integrate custom
 scan as a part of candidate paths to be chosen by optimizer.
 Here is two patches. The first one (pgsql-v9.4-custom-scan-apis) offers
 a set of API stuff and a simple demonstration module that implement
 regular table scan using inequality operator on ctid system column.
 The second one (pgsql-v9.4-custom-scan-remote-join) enhances
 postgres_fdw to support remote join capability.

 Below is an example to show how custom-scan works.

 We usually run a sequential scan even if the clause has an inequality
 operator that references the ctid system column.

 postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid;
                         QUERY PLAN
 ----------------------------------------------------------
  Seq Scan on t1  (cost=0.00..209.00 rows= width=43)
    Filter: (ctid < '(10,0)'::tid)
 (2 rows)

 An extension that acts as a custom-scan provider suggests
 an alternative path, and if its cost is less than the sequential scan,
 the optimizer chooses it.

 postgres=# LOAD 'ctidscan';
 LOAD
 postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid;
                            QUERY PLAN
 --------------------------------------------------------------------
  Custom Scan (ctidscan) on t1  (cost=0.00..100.00 rows= width=43)
    Filter: (ctid < '(10,0)'::tid)
 (2 rows)

 Of course, a more cost-effective plan will win if one exists.

 postgres=# EXPLAIN SELECT ctid,* FROM t1 WHERE ctid < '(10,0)'::tid AND a =
 200;
                             QUERY PLAN
 -------------------------------------------------------------------
  Index Scan using t1_pkey on t1  (cost=0.29..8.30 rows=1 width=43)
    Index Cond: (a = 200)
    Filter: (ctid < '(10,0)'::tid)
 (3 rows)

 One other worthwhile example is the remote-join enhancement on
 postgres_fdw as follows. Both ft1 and ft2 are foreign tables
 managed by the same foreign server.

 postgres=# EXPLAIN (verbose) SELECT * FROM ft1 JOIN ft2 ON a = x
              WHERE f_leak(b) AND y like '%aaa%';
                             QUERY PLAN
 --------------------------------------------------------------------
  Custom Scan (postgres-fdw)  (cost=100.00..100.01 rows=0 width=72)
    Output: a, b, x, y
    Filter: f_leak(b)
    Remote SQL: SELECT r1.a, r1.b, r2.x, r2.y FROM (public.ft1 r1 JOIN
 public.ft2 r2 ON ((r1.a = r2.x))) WHERE ((r2.y ~~ '%aaa%'::text))
 (4 rows)

 ---
 How it works
 ---
 This patch adds two hooks (for base and join relations) around allpaths.c
 and joinpaths.c. It allows extensions to add alternative paths to handle
 scanning a base relation or joining two relations.

 Its callback routine can add a CustomPath using add_path() to inform the
 optimizer of this alternative scan path. Every custom-scan provider is
 identified by its name, registered preliminarily using the following
 function.

   void register_custom_provider(const CustomProvider *provider);

 CustomProvider is a set of a name string and function pointers for callbacks.
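 
 For illustration, registration at load time might look like this sketch
 (the CustomProvider field names below are guesses; only
 register_custom_provider() itself is quoted above):
 
 void
 _PG_init(void)
 {
     static CustomProvider provider;
 
     memset(&provider, 0, sizeof(provider));
     snprintf(provider.name, sizeof(provider.name), "my_provider");
     /* hypothetical callback slots */
     provider.InitCustomScanPlan = my_init_custom_scan_plan;
     provider.BeginCustomScan = my_begin_custom_scan;
     provider.ExecCustomScan = my_exec_custom_scan;
     provider.EndCustomScan = my_end_custom_scan;
 
     register_custom_provider(&provider);
 }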

 Once a CustomPath gets chosen, create_scan_plan() constructs a custom-
 scan plan and calls back into the extension to initialize the node.
 The rest of the portions are similar to a foreign scan; however, some
 detailed portions differ. For example, a foreign scan is assumed to return
 a tuple formed according to the table definition. On the other hand,
 a custom scan does not have such an assumption, so the extension needs to
 set the tuple descriptor on the scan tuple slot of the ScanState structure
 by itself, roughly as below.
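 
 For instance, such initialization might look like the following sketch
 (the callback shape and node layout are assumptions;
 CreateTemplateTupleDesc(), TupleDescInitEntry() and ExecAssignScanType()
 are stock backend helpers):
 
 #include "access/tupdesc.h"
 #include "catalog/pg_type.h"
 #include "executor/executor.h"
 
 static void
 my_begin_custom_scan(CustomScanState *node, int eflags)
 {
     TupleDesc   tupdesc;
 
     /* describe the single int4 column this provider returns */
     tupdesc = CreateTemplateTupleDesc(1, false);
     TupleDescInitEntry(tupdesc, (AttrNumber) 1, "val", INT4OID, -1, 0);
 
     /* attach the descriptor to the scan tuple slot */
     ExecAssignScanType(&node->ss, tupdesc);
 }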

 In the case of a join, a custom scan performs like a regular scan, but it
 returns tuples already joined from the underlying relations.
 The patched postgres_fdw utilizes a hook in joinpaths.c to run
 remote joins.

 
 Issues
 
 I'm not 100% certain whether arguments of add_join_path_hook is
 reasonable. I guess the first 7 arguments are minimum necessity.
 The mergeclause_list and semifactors might be useful if someone
 tries to implement its own mergejoin or semijoin. Also, I'm not
 good at usage of path parameterization, but the last two arguments
 are related to. Where is the best code to learn about its usage?

 +/* Hook for plugins to add custom join path, in addition to default ones */
 +typedef void (*add_join_path_hook_type)(PlannerInfo *root,
 +   RelOptInfo *joinrel,
 +   RelOptInfo *outerrel,
 +   RelOptInfo