Re: [HACKERS] review: FDW API

2011-02-22 Thread Tom Lane
I wrote:
 I did a bit of poking around here, and came to the following
 conclusions:

 1. We don't want to add another RTEKind.  RTE_RELATION basically has the
 semantics of anything with a pg_class OID, so it ought to include
 foreign tables.  Therefore the fix ought to be to add relkind to
 RangeTblEntry.  (BTW, so far as I can tell, RTE_SPECIAL is obsolete:
 there are assorted switch cases that handle it, but no place that can
 generate the value.  I'm inclined to delete it while we are messing
 with this.)

 2. In the current code layout, to make sense of relkind you need to
 #include catalog/pg_class.h where the values for relkind are #defined.
 I dislike the idea of that being true for a field of such a widely-known
 struct as RangeTblEntry.  Accordingly, I suggest that we move those
 values into parsenodes.h.  (Perhaps we could convert them to an enum,
 too, though still keeping the same ASCII values.)

 3. We can have the rewriter update an RTE's stored value of relkind
 during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry
 anyway, so copying over the relkind is essentially free.  While it's not
 instantly obvious that that is soon enough, I think that it is, since
 up to the point of acquiring a lock there we can't assume that the rel
 isn't being changed or dropped undeneath us --- that is, any earlier
 test on an RTE's relkind might be testing just-obsoleted state anyway.

 4. I had hoped that we might be able to get rid of some pre-existing
 syscache lookups, but at least so far as the parse/plan/execute chain
 is concerned, there don't seem to be any other places that need to
 fetch a relkind based on just an RTE entry.

 So point #4 is a bit discouraging, but other that, this seems like
 a fairly straightforward exercise.  I'm inclined to go ahead and do it,
 unless there are objections.

Applied, except I ended up not moving the RELKIND #defines as suggested
in point #2.  Those #defines are used by pg_dump, and having pg_dump.c
#include parsenodes.h seemed like a bad idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-20 Thread Tom Lane
I wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The main downside of that is that relation relkinds would have
 to become fixed (because there would be no practical way of updating
 RTEs in stored rules), which means the convert relation to view hack
 would have to go away.

 That actually sounds like a possible problem, because it's possible to
 create views with circular dependencies using CORV, and they won't
 dump-and-reload properly without that hack.

 Urgh.  That's problematic, because even if we changed pg_dump (which
 would not be that hard I think), we'd still have to cope with dump files
 emitted by existing versions of pg_dump.

 [ thinks a bit ... ]  But we can probably hack our way around that:
 teach the rule rewriter to update relkind in any RTE it brings in from a
 stored rule.  We already do something similar in some other cases where
 a stored parsetree node contains information that could become obsolete.

I did a bit of poking around here, and came to the following
conclusions:

1. We don't want to add another RTEKind.  RTE_RELATION basically has the
semantics of anything with a pg_class OID, so it ought to include
foreign tables.  Therefore the fix ought to be to add relkind to
RangeTblEntry.  (BTW, so far as I can tell, RTE_SPECIAL is obsolete:
there are assorted switch cases that handle it, but no place that can
generate the value.  I'm inclined to delete it while we are messing
with this.)

2. In the current code layout, to make sense of relkind you need to
#include catalog/pg_class.h where the values for relkind are #defined.
I dislike the idea of that being true for a field of such a widely-known
struct as RangeTblEntry.  Accordingly, I suggest that we move those
values into parsenodes.h.  (Perhaps we could convert them to an enum,
too, though still keeping the same ASCII values.)

3. We can have the rewriter update an RTE's stored value of relkind
during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry
anyway, so copying over the relkind is essentially free.  While it's not
instantly obvious that that is soon enough, I think that it is, since
up to the point of acquiring a lock there we can't assume that the rel
isn't being changed or dropped undeneath us --- that is, any earlier
test on an RTE's relkind might be testing just-obsoleted state anyway.

4. I had hoped that we might be able to get rid of some pre-existing
syscache lookups, but at least so far as the parse/plan/execute chain
is concerned, there don't seem to be any other places that need to
fetch a relkind based on just an RTE entry.

So point #4 is a bit discouraging, but other that, this seems like
a fairly straightforward exercise.  I'm inclined to go ahead and do it,
unless there are objections.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-19 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 11.02.2011 22:50, Heikki Linnakangas wrote:
 I spent some more time reviewing this, and working on the PostgreSQL FDW
 in tandem. Here's an updated API patch, with a bunch of cosmetic
 changes, and a bug fix for FOR SHARE/UPDATE.

 Another version, rebased against master branch and with a bunch of small 
 cosmetic fixes.

I've applied this after a moderate amount of editorialization.

The question of avoiding extra relkind lookups is still open.  We talked
about adding relkind to RangeTblEntry, but I wonder whether adding a new
RTEKind would be a better idea.  Haven't researched it yet.

I have a hacked-up version of contrib/file_fdw that I've been using to
test it with.  That needs some more cleanup before committing, but I
think it should not take too long.  The one thing that is kind of
annoying is that the regression tests need generated files (to insert
absolute paths) and it seems like the PGXS infrastructure doesn't know
how to clean up the generated files afterwards.  Anybody have any
thoughts about fixing that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 15.02.2011 23:00, Tom Lane wrote:
 I think moving the error check downstream would be a good thing.

 Ok, I tried moving the error checks to preprocess_rowmarks(). 
 Unfortunately RelOptInfos haven't been built at that stage yet, so you 
 still have to do the catalog lookup to get the relkind. That defeats the 
 purpose.

Mph.  It seems like the right fix here is to add relkind to
RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
example.  The main downside of that is that relation relkinds would have
to become fixed (because there would be no practical way of updating
RTEs in stored rules), which means the convert relation to view hack
would have to go away.  Offhand I think no one cares about that anymore,
but ...

In any case, this is looking like a performance optimization that should
be dealt with in a separate patch.  I'd suggest leaving the code in the
form with the extra relkind lookups for the initial commit.  It's
possible that no one would notice the extra lookups anyway --- have you
benchmarked it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 15.02.2011 23:00, Tom Lane wrote:
 I think moving the error check downstream would be a good thing.

 Ok, I tried moving the error checks to preprocess_rowmarks().
 Unfortunately RelOptInfos haven't been built at that stage yet, so you
 still have to do the catalog lookup to get the relkind. That defeats the
 purpose.

 Mph.  It seems like the right fix here is to add relkind to
 RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
 example.

Heikki and I came to the same conclusion yesterday while chatting
about this on IM.

 The main downside of that is that relation relkinds would have
 to become fixed (because there would be no practical way of updating
 RTEs in stored rules), which means the convert relation to view hack
 would have to go away.  Offhand I think no one cares about that anymore,
 but ...

That actually sounds like a possible problem, because it's possible to
create views with circular dependencies using CORV, and they won't
dump-and-reload properly without that hack.  It's not a particularly
useful thing to do, of course, and I think we could reengineer pg_dump
to not need the hack even if someone does do it, but that sounds like
more work than we want to tackle right now.

 In any case, this is looking like a performance optimization that should
 be dealt with in a separate patch.  I'd suggest leaving the code in the
 form with the extra relkind lookups for the initial commit.  It's
 possible that no one would notice the extra lookups anyway --- have you
 benchmarked it?

This is a good point... although I think this results in at least one
extra syscache lookup per table per SELECT, even when no foreign
tables are involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The main downside of that is that relation relkinds would have
 to become fixed (because there would be no practical way of updating
 RTEs in stored rules), which means the convert relation to view hack
 would have to go away.  Offhand I think no one cares about that anymore,
 but ...

 That actually sounds like a possible problem, because it's possible to
 create views with circular dependencies using CORV, and they won't
 dump-and-reload properly without that hack.  It's not a particularly
 useful thing to do, of course, and I think we could reengineer pg_dump
 to not need the hack even if someone does do it, but that sounds like
 more work than we want to tackle right now.

Urgh.  That's problematic, because even if we changed pg_dump (which
would not be that hard I think), we'd still have to cope with dump files
emitted by existing versions of pg_dump.  The time constant before that
stops being an issue is measured in years.  I'm not at all sure whether
the circular dependency case is infrequent enough that we could get away
with saying tough luck to people who hit the case.

[ thinks a bit ... ]  But we can probably hack our way around that:
teach the rule rewriter to update relkind in any RTE it brings in from a
stored rule.  We already do something similar in some other cases where
a stored parsetree node contains information that could become obsolete.

But that conclusion just makes it even clearer that fixing this
performance problem, if it even is real, should be a separate patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Another version, rebased against master branch and with a bunch of small 
 cosmetic fixes.

 I guess this is as good as this is going to get for 9.1.

This is *badly* in need of another cleanup pass; it's full of typos,
contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
documentation is really inadequate.  If you're out of energy to go
over it, I guess I should step up.

Question after first look: what is the motivation for passing
estate-es_param_list_info to BeginScan?  AFAICS, even if there is a
reason for that function to need that, it isn't receiving any info that
would be sufficient to let it know what's in there.  What seems more
likely to be useful is to pass in the EState pointer, as for example
being able to look at es_query_cxt seems like a good idea.

BTW, I see no particularly good reason to let the FDW omit ReScan.
If it wants to implement that as end-and-begin, it can do so internally.
It would be a lot clearer to just insist that all the function pointers
be valid, as indeed some (not all) of the comments say already.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Heikki Linnakangas

On 18.02.2011 22:16, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

Another version, rebased against master branch and with a bunch of small
cosmetic fixes.



I guess this is as good as this is going to get for 9.1.


This is *badly* in need of another cleanup pass; it's full of typos,
contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
documentation is really inadequate.  If you're out of energy to go
over it, I guess I should step up.


If you have the energy, by all means, thanks.


Question after first look: what is the motivation for passing
estate-es_param_list_info to BeginScan?  AFAICS, even if there is a
reason for that function to need that, it isn't receiving any info that
would be sufficient to let it know what's in there.


The idea is that when the query is planned, the FDW can choose to push 
down a qual that contains a parameter marker, like WHERE remotecol = 
$1. At execution time, it needs the value of the parameter to send it 
to the remote server. The PostgreSQL FDW does that, although I didn't 
test it so it might well be broken.



 What seems more
likely to be useful is to pass in the EState pointer, as for example
being able to look at es_query_cxt seems like a good idea.


By look at, you mean allocate stuff in it?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 18.02.2011 22:16, Tom Lane wrote:
 Question after first look: what is the motivation for passing
 estate-es_param_list_info to BeginScan?  AFAICS, even if there is a
 reason for that function to need that, it isn't receiving any info that
 would be sufficient to let it know what's in there.

 The idea is that when the query is planned, the FDW can choose to push 
 down a qual that contains a parameter marker, like WHERE remotecol = 
 $1. At execution time, it needs the value of the parameter to send it 
 to the remote server. The PostgreSQL FDW does that, although I didn't 
 test it so it might well be broken.

s/might well be/is/ --- there's no guarantee that parameters are valid
at executor setup time.  The place that needs to be grabbing the
parameter value for that purpose is BeginScan.

 What seems more
 likely to be useful is to pass in the EState pointer, as for example
 being able to look at es_query_cxt seems like a good idea.

 By look at, you mean allocate stuff in it?

Right.  I suppose you're going to comment that CurrentMemoryContext is
probably the same thing, but in general it's not going to pay to make
this API run with blinders on.  My feeling is it'd be best to pass down
all the information the executor node has got --- probably we should
just pass the ForeignScanState node itself, and leave a void * in that
for FDW-private data, and be done with it.  Otherwise we're going to be
adding missed stuff back to the API every time somebody notices that
their FDW can't do X because they don't have access to the necessary
information.  That definitional instability will trump any ABI stability
that might be gained from not relying on executor node types.  (And it's
not like changing ScanState in a released version is an entirely safe
thing to do even today --- there are lots of loadable modules that know
about that struct.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
I wrote:
 ... My feeling is it'd be best to pass down
 all the information the executor node has got --- probably we should
 just pass the ForeignScanState node itself, and leave a void * in that
 for FDW-private data, and be done with it.  Otherwise we're going to be
 adding missed stuff back to the API every time somebody notices that
 their FDW can't do X because they don't have access to the necessary
 information.

Attached is a rewritten version of fdwhandler.sgml that specifies what I
think is a more future-proof API for the callback functions.  Barring
objections, I'll push ahead with editing the code to match.

regards, tom lane


!-- doc/src/sgml/fdwhandler.sgml --

 chapter id=fdwhandler
   titleWriting A Foreign Data Wrapper/title

   indexterm zone=fdwhandler
primaryforeign data wrapper/primary
secondaryhandler for/secondary
   /indexterm

   para
All operations on a foreign table are handled through its foreign data
wrapper, which consists of a set of functions that the planner and
executor call. The foreign data wrapper is responsible for fetching
data from the remote data source and returning it to the
productnamePostgreSQL/productname executor. This chapter outlines how
to write a new foreign data wrapper.
   /para

   para
The FDW author needs to implement a handler function, and optionally
a validator function. Both functions must be written in a compiled
language such as C, using the version-1 interface.
For details on C language calling conventions and dynamic loading,
see xref linkend=xfunc-c.
   /para

   para
The handler function simply returns a struct of function pointers to
callback functions that will be called by the planner and executor.
Most of the effort in writing an FDW is in implementing these callback
functions.
The handler function must be registered with
productnamePostgreSQL/productname as taking no arguments and returning
the special pseudo-type typefdw_handler/type.
The callback functions are plain C functions and are not visible or
callable at the SQL level.
   /para

   para
The validator function is responsible for validating options given in the
commandCREATE FOREIGN DATA WRAPPER/command, commandCREATE
SERVER/command and commandCREATE FOREIGN TABLE/command commands.
The validator function must be registered as taking two arguments, a text
array containing the options to be validated, and an OID representing the
type of object the options are associated with (in the form of the OID
of the system catalog the object would be stored in).  If no validator
function is supplied, the options are not checked at object creation time.
   /para

   para
The foreign data wrappers included in the standard distribution are good
references when trying to write your own.  Look into the
filenamecontrib/file_fdw/ subdirectory of the source tree.
The xref linkend=sql-createforeigndatawrapper reference page also has
some useful details.
   /para

   note
para
 The SQL standard specifies an interface for writing foreign data wrappers.
 However, PostgreSQL does not implement that API, because the effort to
 accommodate it into PostgreSQL would be large, and the standard API hasn't
 gained wide adoption anyway.
/para
   /note

   sect1 id=fdw-routines
titleForeign Data Wrapper Callback Routines/title

para
 The FDW handler function returns a palloc'd structnameFdwRoutine/
 struct containing pointers to the following callback functions:
/para

para
programlisting
FdwPlan *
PlanForeignScan (Oid foreigntableid,
 PlannerInfo *root,
 RelOptInfo *baserel);
/programlisting

 Plan a scan on a foreign table. This is called when a query is planned.
 literalforeigntableid/ is the structnamepg_class/ OID of the
 foreign table.  literalroot/ is the planner's global information
 about the query, and literalbaserel/ is the planner's information
 about this table.
 The function must return a palloc'd struct that contains cost estimates,
 a string to show for this scan in commandEXPLAIN/, and any
 FDW-private information that is needed to execute the foreign scan at a
 later time.  (Note that the private information must be represented in
 a form that functioncopyObject/ knows how to copy.)
/para

para
 The information in literalroot/ and literalbaserel/ can be used
 to reduce the amount of information that has to be fetched from the
 foreign table (and therefore reduce the cost estimate).
 literalbaserel-gt;baserestrictinfo/ is particularly interesting, as
 it contains restriction quals (literalWHERE/ clauses) that can be
 used to filter the rows to be fetched.  (The FDW is not required to
 enforce these quals, as the finished plan will recheck them anyway.)
 

Re: [HACKERS] review: FDW API

2011-02-16 Thread Heikki Linnakangas

On 15.02.2011 23:00, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

On 15.02.2011 21:13, Tom Lane wrote:

Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.



Possibly. We throw the existing errors, for example if you try to do
FOR UPDATE OF foo where foo is a set-returning function, in
transformLockingClause(), so it seemed like the logical place to check
for foreign tables too.



Hmm, one approach would be to go ahead and create the RowMarkClauses for
all relations in the parse analysis phase, foreign or not, and throw the
error later, in preprocess_rowmarks().


I think moving the error check downstream would be a good thing.


Ok, I tried moving the error checks to preprocess_rowmarks(). 
Unfortunately RelOptInfos haven't been built at that stage yet, so you 
still have to do the catalog lookup to get the relkind. That defeats the 
purpose.


We could delay the error checking further, but preprocess_rowmarks() 
would need to distinguish foreign tables anyway, so that it can mark 
them with ROW_MARK_COPY instead of ROW_MARK_REFERENCE.



IIRC, at the moment we're basically duplicating the tests between parse
analysis and the planner, but it's not clear what the value of that is.


There's duplicate logic in parse analysis and rewriter, to be precise. 
And then there's this one check in make_outerjoininfo:


/*
 	 * Presently the executor cannot support FOR UPDATE/SHARE marking of 
rels
 	 * appearing on the nullable side of an outer join. (It's somewhat 
unclear
 	 * what that would mean, anyway: what should we mark when a result 
row is

 * generated from no element of the nullable relation?)  So, complain if
 * any nullable rel is FOR UPDATE/SHARE.
 *
 * You might be wondering why this test isn't made far upstream in the
 * parser.  It's because the parser hasn't got enough info --- consider
 * FOR UPDATE applied to a view.  Only after rewriting and flattening do
 * we know whether the view contains an outer join.
 *
 	 * We use the original RowMarkClause list here; the PlanRowMark list 
would

 * list everything.
 */
foreach(l, root-parse-rowMarks)
{
RowMarkClause *rc = (RowMarkClause *) lfirst(l);

if (bms_is_member(rc-rti, right_rels) ||
(jointype == JOIN_FULL  bms_is_member(rc-rti, 
left_rels)))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 	 errmsg(SELECT FOR UPDATE/SHARE cannot be applied to the 
nullable side of an outer join)));

}

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Robert Haas
On Tue, Feb 15, 2011 at 1:40 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I'm actually surprised we don't need to distinguish them in more places, but
 nevertheless it feels like we should have that info available more
 conveniently, and without requiring a catalog lookup like get_rel_relkind()
 does. At first I thought we should add a field to RelOptInfo, but that
 doesn't help transformLockingClause. Adding the field to RangeTblEntry seems
 quite invasive, and it doesn't feel like the right place to cache that kind
 of information anyway. Thoughts on that?

Maybe it should be a new RTEKind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 As the patch stands, we have to do get_rel_relkind() in a couple of 
 places in parse analysis and the planner to distinguish a foreign table 
 from a regular one. As the patch stands, there's nothing in 
 RangeTblEntry (which is what we have in transformLockingClause) or 
 RelOptInfo (for set_plain_rel_pathlist) to directly distinguish them.

Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.

(No, I haven't read the patch...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Heikki Linnakangas

On 15.02.2011 21:13, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

As the patch stands, we have to do get_rel_relkind() in a couple of
places in parse analysis and the planner to distinguish a foreign table
from a regular one. As the patch stands, there's nothing in
RangeTblEntry (which is what we have in transformLockingClause) or
RelOptInfo (for set_plain_rel_pathlist) to directly distinguish them.


Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.


Possibly. We throw the existing errors, for example if you try to do 
FOR UPDATE OF foo where foo is a set-returning function, in 
transformLockingClause(), so it seemed like the logical place to check 
for foreign tables too.


Hmm, one approach would be to go ahead and create the RowMarkClauses for 
all relations in the parse analysis phase, foreign or not, and throw the 
error later, in preprocess_rowmarks(). preprocess_rowmarks() doesn't 
currently know if each RowMarkClause was created by ... FOR UPDATE or 
FOR UPDATE OF foo, but to be consistent with the logic in 
transformLockingClause() it would need to. For the former case, it would 
need to just ignore foreign tables but for the latter it would need to 
throw an error. I guess we could add an explicit flag to RowMarkClause 
for that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Heikki Linnakangas

On 15.02.2011 21:00, Robert Haas wrote:

On Tue, Feb 15, 2011 at 1:40 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

I'm actually surprised we don't need to distinguish them in more places, but
nevertheless it feels like we should have that info available more
conveniently, and without requiring a catalog lookup like get_rel_relkind()
does. At first I thought we should add a field to RelOptInfo, but that
doesn't help transformLockingClause. Adding the field to RangeTblEntry seems
quite invasive, and it doesn't feel like the right place to cache that kind
of information anyway. Thoughts on that?


Maybe it should be a new RTEKind.


That would be even more invasive.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 15.02.2011 21:13, Tom Lane wrote:
 Hmm.  I don't have a problem with adding relkind to the planner's
 RelOptInfo, but it seems to me that if parse analysis needs to know
 this, you have put functionality into parse analysis that does not
 belong there.

 Possibly. We throw the existing errors, for example if you try to do 
 FOR UPDATE OF foo where foo is a set-returning function, in 
 transformLockingClause(), so it seemed like the logical place to check 
 for foreign tables too.

 Hmm, one approach would be to go ahead and create the RowMarkClauses for 
 all relations in the parse analysis phase, foreign or not, and throw the 
 error later, in preprocess_rowmarks().

I think moving the error check downstream would be a good thing.
Consider for example the case of applying FOR UPDATE to a view.  You
can't know what that entails until after the rewriter expands the view.
IIRC, at the moment we're basically duplicating the tests between parse
analysis and the planner, but it's not clear what the value of that is.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-08 Thread Shigeru HANADA

On Mon, 07 Feb 2011 09:37:37 +0100
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

 On 07.02.2011 08:00, Shigeru HANADA wrote:
  Sorry for late, attached are revised version of FDW API patches which
  reflect Heikki's comments except removing catalog lookup via
  IsForeignTable().  ISTM that the point is avoiding catalog lookup
  during planning, but I have not found when we can set foreign table
  flag without catalog lookup during RelOptInfo generation.
 
 In get_relation_info(), you do the catalog lookup anyway and have the 
 Relation object at hand. Add a flag to RelOptInfo indicating if it's a 
 foreign table or not, and set that in get_relation_info().

Thanks a lot.

Attached is a revised version of foreign_scan patch.  This still
requires fdw_handler patch which was attached to the orginal post.

Avoid_catalog_lookup.patch is attached for review purpose.
This patch includes changes for this fix.

Regards,
--
Shigeru Hanada


avoid_catalog_lookup.patch
Description: Binary data


20110208-foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-07 Thread Heikki Linnakangas

On 07.02.2011 08:00, Shigeru HANADA wrote:

Sorry for late, attached are revised version of FDW API patches which
reflect Heikki's comments except removing catalog lookup via
IsForeignTable().  ISTM that the point is avoiding catalog lookup
during planning, but I have not found when we can set foreign table
flag without catalog lookup during RelOptInfo generation.


In get_relation_info(), you do the catalog lookup anyway and have the 
Relation object at hand. Add a flag to RelOptInfo indicating if it's a 
foreign table or not, and set that in get_relation_info().


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-06 Thread Shigeru HANADA
On Mon, 31 Jan 2011 22:00:55 +0900
Shigeru HANADA han...@metrosystems.co.jp wrote:
 I'll post FDW API patches which reflect comments first, and then I'll
 rebase postgresql_fdw against them.

Sorry for late, attached are revised version of FDW API patches which
reflect Heikki's comments except removing catalog lookup via
IsForeignTable().  ISTM that the point is avoiding catalog lookup
during planning, but I have not found when we can set foreign table
flag without catalog lookup during RelOptInfo generation.

Please apply attached patches in this order.

1) fdw_catalog_lookup.patch
2) fdw_handler.patch
3) foreign_scan.patch

To execute SELECT quereis for foreign tables, you need a FDW which has
valid fdwhandler function.  The file_fdw which is posted in another
thread SQL/MED file_fdw would help.

Changes from last patches are:

1) Now SELECT FOR UPDATE check for foreign tables are done properly in
executor phase, in ExecLockTuple().  Or such check should be done in
parser or planner?

2) Server version is checked in pg_dump (= 90100).

3) ReScan is not required now.  If ReScan is not supplied, ForeignScan
uses EndScan + BeginSacn instead.

4) FDW-Info in EXPLAIN is shown always, except FDW set NULL to
explainInfo.

Regards,
--
Shigeru Hanada


fdw_catalog_lookup.patch.gz
Description: Binary data


fdw_handler.patch.gz
Description: Binary data


foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-31 Thread Shigeru HANADA
On Mon, 24 Jan 2011 15:08:11 +0200
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 I've gone through the code in a bit more detail now. I did a bunch of 
 cosmetic changes along the way, patch attached. I also added a few 
 paragraphs in the docs. We need more extensive documentation, but this 
 at least marks the places where I think the docs need to go.
 
 Comments:

Thanks for the comments!

 * How can a FDW fetch only the columns required by the scan? The file 
 FDW has to read the whole file anyhow, but it could perhaps skip calling 
 the input function for unnecessary columns. But more importantly, with 
 something like the postgresql_fdw you don't want to fetch any extra 
 columns across the wire. I gather the way to do it is to copy 
 RelOptInfo-attr_needed to private storage at plan stage, and fill the 
 not-needed attributes with NULLs in Iterate. That gets a bit awkward, 
 you need to transform attr_needed to something that can be copied for 
 starters. Or is that information somehow available at execution phase 
 otherwise?

I thought that RelOptInfo-reltargetlist, a list of Var, can be used
for that purpose.  FdwPlan can copy it with copyObject(), and pass
it to Iterate through FdwPlan-fdw_private.

Then, postgresql_fdw would be able to retrieve only necessary columns,
or just use NULL for unnecessary columns in the SELECT clause to
avoid mapping values to columns.  Each way would be decrease amount of
data transfer.

 * I think we need something in RelOptInfo to mark foreign tables. At the 
 moment, you need to call IsForeignTable() which does a catalog lookup. 
 Maybe a new RTEKind, or a boolean flag.

We can avoid catalog lookup with checking table type in get_relation_info()
and updating RelOptInfo-is_foreign_table if the target was a foreign
table.

 * Can/should we make ReScan optional? Could the executor just do 
 EndScan+BeginScan if there's no ReScan function?

Right, we have enough information to call BeginScan again.  Will fix.

 * Is there any point in allowing a FDW without a handler? It's totally 
 useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in 
 previous versions, and it allowed it, but it has always been totally 
 useless so I don't think we need to worry much about 
 backwards-compatibility here.

dblink (and possibly other external modules) uses FDW without a
handler.

 * Is there any use case for changing the handler or validator function 
 of an existign FDW with ALTER? To me it just seems like an unnecessary 
 complication.

AFAICS, the only case for that is upgrading FDW to new one without
re-creating foreign tables.  I don't have strong opinion for this
issue, and it seems reasonable to remove ALTER feature in first
version.

 * IMHO the FDW-info should always be displayed, without VERBOSE. In my 
 experience with another DBMS that had this feature, the SQL being sent 
 to the remote server was almost always the key piece of information that 
 I was looking for in the query plans.

Agreed, will fix to show FDW-info always.  Is it reasonable to show
FDW-info row even if a FDW set explainInfo to NULL?

 * this check in expand_inherited_rtentry seems misplaced:
 
  /*
   * SELECT FOR UPDATE/SHARE is not allowed on foreign tables 
  because
   * they are read-only.
   */
  if (newrelation-rd_rel-relkind == RELKIND_FOREIGN_TABLE 
  lockmode != AccessShareLock)
  ereport(ERROR,
  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
   errmsg(SELECT FOR UPDATE/SHARE is not 
  allowed with foreign tables)));
 
 I don't understand why we'd need to do that for inherited tables in 
 particular. And it's not working for regular non-inherited foreign tables:
 
 postgres=# SELECT * FROM filetbl2 FOR UPDATE;
 ERROR:  could not open file base/11933/16397: No such file or directory

It's a remnants of table inheritance support for foreign tables.  This
check should be removed from here, and another check should be added
to avoid above error.

 * Need to document how the FDW interacts with transaction 
 commit/rollback. In particular, I believe EndScan is never called if the 
 transaction is aborted. That needs to be noted explicitly, and need to 
 suggest how to clean up any external resources in that case. For 
 example, postgresql_fdw will need to close any open cursors or result sets.

I agree that resource cleanup is an important issue when writing FDW. 
FDW should use transaction-safe resources like VirtualFile, or use
ResourceOwner callback mechanism.  Is it reasonable to add new page
under Chapter 35. Extending SQL?

 In general, I think the design is sound. What we need is more 
 documentation. It'd also be nice to see the postgresql_fdw brought back 
 to shape so that it works against this latest version of the api patch.

I'll post FDW API patches which reflect comments first, and 

Re: [HACKERS] review: FDW API

2011-01-31 Thread Robert Haas
On Mon, Jan 31, 2011 at 8:00 AM, Shigeru HANADA
han...@metrosystems.co.jp wrote:
 * Is there any use case for changing the handler or validator function
 of an existign FDW with ALTER? To me it just seems like an unnecessary
 complication.

 AFAICS, the only case for that is upgrading FDW to new one without
 re-creating foreign tables.  I don't have strong opinion for this
 issue, and it seems reasonable to remove ALTER feature in first
 version.

-1.  I don't think that removing the ability to change this is going
to save a measurable amount of complexity, and it certainly will suck
if you need it and don't have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-30 Thread Robert Haas
On Mon, Jan 24, 2011 at 8:08 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 How much review have you done of parts (3) and (4)?  The key issue for
 all of the FDW work in progress seems to be what the handler API is
 going to look like, and so once we get that committed it will unblock
 a lot of other things.

 I've gone through the code in a bit more detail now. I did a bunch of
 cosmetic changes along the way, patch attached. I also added a few
 paragraphs in the docs. We need more extensive documentation, but this at
 least marks the places where I think the docs need to go.

 Comments:

I haven't seen any responses to these comments.  Time grows short to
get this committed to PostgreSQL 9.1.  We need responses to these
comments and an updated patch ASAP.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-30 Thread Shigeru HANADA
On Fri, 21 Jan 2011 18:28:19 +0200
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 18.01.2011 17:26, Shigeru HANADA wrote:
  3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
  option to FOREIGN DATA WRAPPER object.
 
 Some quick comments on that:

Thanks for the comments.
I'll post revised version of patches soon.

 * The elogs in parse_func_options() should be ereports.
 
 * pg_dump should check the version number and only try to select 
 fdwhandler column if = 9.1. See the other functions there for example 
 of that.

Fixed.

 * dumpForeignDataWrapper() in pg_dump checks if fdwhandler field is -. 
 I don't think we use that as magic value there, do we? Same with validator.

That magic value, -,  is used as no-function-was-set in
dumpForeignDataWrapper() and dumpForeignServer(), and I followed them.
Agreed that magic value should be removed, but it would be a refactoring
issue about pg_dump.

 * Please check that the HANDLER and VALIDATOR options that pg_dump 
 creates properly quoted.

I checked quoting for HANDLER and VALIDATOR with file_fdw functions,
and it seems works fine.  The pg_dump generats:


CREATE FOREIGN DATA WRAPPER dummy_fdw VALIDATOR public.File_Fdw_Validator
HANDLER public.FILE_FDW_HANDLER;


from these DDLs:


CREATE OR REPLACE FUNCTION File_Fdw_Validator (text[], oid)
RETURNS bool
AS '$libdir/file_fdw','file_fdw_validator'
LANGUAGE C STRICT;

CREATE OR REPLACE FUNCTION FILE_FDW_HANDLER ()
RETURNS fdw_handler
AS '$libdir/file_fdw','file_fdw_handler'
LANGUAGE C STRICT;

CREATE FOREIGN DATA WRAPPER dummy_fdw
VALIDATOR File_Fdw_Validator HANDLER FILE_FDW_HANDLER;


Regard,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-24 Thread Itagaki Takahiro
On Sat, Jan 22, 2011 at 07:20, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create
 the handler function, if it doesn't exist yet.

 Doing that would require the equivalent of pg_pltemplate for FDWs, no?
 I think we're a long way from wanting to do that.  Also, it seems to me
 that add-on FDWs are likely to end up getting packaged as extensions,

The proposed file_fdw.sql actually creates a default FDW on installation.
So I think the installation scripts work as a template even if we don't
have FDW template catalogs.

+ /* contrib/file_fdw/file_fdw.sql.in */
+ -- create wrapper with validator and handler
+ CREATE OR REPLACE FUNCTION file_fdw_validator (text[], oid)
+ CREATE OR REPLACE FUNCTION file_fdw_handler ()
+ CREATE FOREIGN DATA WRAPPER file_fdw
+ VALIDATOR file_fdw_validator HANDLER file_fdw_handler;

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-24 Thread Heikki Linnakangas

On 21.01.2011 17:57, Robert Haas wrote:

On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 18.01.2011 17:26, Shigeru HANADA wrote:


1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.

http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp


Committed this part.


How much review have you done of parts (3) and (4)?  The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.


I've gone through the code in a bit more detail now. I did a bunch of 
cosmetic changes along the way, patch attached. I also added a few 
paragraphs in the docs. We need more extensive documentation, but this 
at least marks the places where I think the docs need to go.


Comments:

* How can a FDW fetch only the columns required by the scan? The file 
FDW has to read the whole file anyhow, but it could perhaps skip calling 
the input function for unnecessary columns. But more importantly, with 
something like the postgresql_fdw you don't want to fetch any extra 
columns across the wire. I gather the way to do it is to copy 
RelOptInfo-attr_needed to private storage at plan stage, and fill the 
not-needed attributes with NULLs in Iterate. That gets a bit awkward, 
you need to transform attr_needed to something that can be copied for 
starters. Or is that information somehow available at execution phase 
otherwise?


* I think we need something in RelOptInfo to mark foreign tables. At the 
moment, you need to call IsForeignTable() which does a catalog lookup. 
Maybe a new RTEKind, or a boolean flag.


* Can/should we make ReScan optional? Could the executor just do 
EndScan+BeginScan if there's no ReScan function?


* Is there any point in allowing a FDW without a handler? It's totally 
useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in 
previous versions, and it allowed it, but it has always been totally 
useless so I don't think we need to worry much about 
backwards-compatibility here.


* Is there any use case for changing the handler or validator function 
of an existign FDW with ALTER? To me it just seems like an unnecessary 
complication.


* IMHO the FDW-info should always be displayed, without VERBOSE. In my 
experience with another DBMS that had this feature, the SQL being sent 
to the remote server was almost always the key piece of information that 
I was looking for in the query plans.


* this check in expand_inherited_rtentry seems misplaced:


/*
 * SELECT FOR UPDATE/SHARE is not allowed on foreign tables 
because
 * they are read-only.
 */
if (newrelation-rd_rel-relkind == RELKIND_FOREIGN_TABLE 
lockmode != AccessShareLock)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg(SELECT FOR UPDATE/SHARE is not 
allowed with foreign tables)));


I don't understand why we'd need to do that for inherited tables in 
particular. And it's not working for regular non-inherited foreign tables:


postgres=# SELECT * FROM filetbl2 FOR UPDATE;
ERROR:  could not open file base/11933/16397: No such file or directory

* Need to document how the FDW interacts with transaction 
commit/rollback. In particular, I believe EndScan is never called if the 
transaction is aborted. That needs to be noted explicitly, and need to 
suggest how to clean up any external resources in that case. For 
example, postgresql_fdw will need to close any open cursors or result sets.


In general, I think the design is sound. What we need is more 
documentation. It'd also be nice to see the postgresql_fdw brought back 
to shape so that it works against this latest version of the api patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index a65b4bc..06a82a4 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2986,6 +2986,41 @@ ANALYZE measurement;
   /sect2
  /sect1
 
+ sect1 id=ddl-foreign-data
+  titleForeign Data/title
+   indexterm
+primaryforeign data/primary
+   /indexterm
+   para
+productnamePostgreSQL/productname implements parts of the SQL/MED
+specification, which allows you to access external data that resides
+outside PostgreSQL, using normal SQL queries.
+   /para
+
+   para
+Foreign data is accessed with help from a
+firsttermforeign data wrapper/firstterm. A foreign data wrapper is a

Re: [HACKERS] review: FDW API

2011-01-24 Thread Robert Haas
On Mon, Jan 24, 2011 at 8:08 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 * Is there any point in allowing a FDW without a handler? It's totally
 useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in previous
 versions, and it allowed it, but it has always been totally useless so I
 don't think we need to worry much about backwards-compatibility here.

Aren't things like dblink using this in its existing form?

 * Is there any use case for changing the handler or validator function of an
 existign FDW with ALTER? To me it just seems like an unnecessary
 complication.

+1.

 * IMHO the FDW-info should always be displayed, without VERBOSE. In my
 experience with another DBMS that had this feature, the SQL being sent to
 the remote server was almost always the key piece of information that I was
 looking for in the query plans.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.
http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp


Committed this part.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 18.01.2011 17:26, Shigeru HANADA wrote:

 1) 20110118-no_fdw_perm_check.patch - This patch is not included in
 last post.  This had been proposed on 2011-01-05 first, but maybe has
 not been reviewd yet.  I re-propose this patch for SQL standard
 conformance.  This patch removes permission check that requires USAGE
 on the foreign-data wrapper at CREATE FOREIGN TABLE.
 Please see original post for details.

 http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp

 Committed this part.

How much review have you done of parts (3) and (4)?  The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 21.01.2011 17:57, Robert Haas wrote:

How much review have you done of parts (3) and (4)?


Not much. I'm getting there..


The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.


Yep. The API that's there now was originally suggested by me, so I 
probably won't have big complaints about it. I'll have to also look at 
the PostgreSQL and file implementations of it to see that it really fits 
the bill.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
option to FOREIGN DATA WRAPPER object.


Some quick comments on that:

* I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
the handler function, if it doesn't exist yet. That's what CREATE 
LANGUAGE does, which is similar. Although it doesn't seem to be 
documented for CREATE LANGUAGE either, is it deprecated?


* The elogs in parse_func_options() should be ereports.

* pg_dump should check the version number and only try to select 
fdwhandler column if = 9.1. See the other functions there for example 
of that.


* dumpForeignDataWrapper() in pg_dump checks if fdwhandler field is -. 
I don't think we use that as magic value there, do we? Same with validator.


* Please check that the HANDLER and VALIDATOR options that pg_dump 
creates properly quoted.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Some quick comments on that:

 * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
 the handler function, if it doesn't exist yet. That's what CREATE 
 LANGUAGE does, which is similar. Although it doesn't seem to be 
 documented for CREATE LANGUAGE either, is it deprecated?

Doing that would require the equivalent of pg_pltemplate for FDWs, no?
I think we're a long way from wanting to do that.  Also, it seems to me
that add-on FDWs are likely to end up getting packaged as extensions,
so the extension machinery will probably render the question moot pretty
soon.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-18 Thread Shigeru HANADA
On Sun, 16 Jan 2011 01:55:19 +0100
Jan Urbański wulc...@wulczer.org wrote:
snip
 In general, the feature looks great and I hope it'll make it into 9.1.
 And it we'd get the possibility to write FDW handlers in other PLs than
 C, it would rock so hard...
 
 I'm going to mark this a Waiting for Author because of the typos, the
 BeginScan/EndScan issue, and the nested loop stopping halfway issue. The
 rest are suggestions or just thoughts, and if you don't see them as
 justified, I'll mark the next patch Ready for Committer.

Thanks a lot for the comments.  I've (hopefully) fixed issues above. 
Please find attached patches.

== patch list ==

1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.
http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp

2) 20110118-fdw_catalog_lookup.patch - This patch adds GetForeignTables. 
Fixed lack of pg_foreign_table.h inclusion.

3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
option to FOREIGN DATA WRAPPER object.

4) 20110118-foreign_scan.patch - This patch adds ForeignScan executor
node and FDW API hooks based on Heikki's proposal.  As Itagaki-san
suggested on 2010-12-21, FDW must generate information for EXPLAIN
VERBOSE every PlanRelScan() call.  It's better to avoid such overhead,
so new EXPLAIN hook would be needed.  I'll try to make it cleaner.



And I'll reply to the rest of your comments.

 We could use comments about how to return tuples from Iterate and how to
 finish returning them. I had to look at the example to figure out that
 you need ExecClearTuple(slot) in your last Iterate. If Iterate's
 interface were to change, we could just return NULL instead of a tuple
 to say that we're done.

I've added some comments for FDW-developer to fdwapi.h, though they
wouldn't be enough.

 We could be a bit more explicit about how to allocate objects, for
 instance if I'm allocating a FdwPlan in my PlanRelScan with a palloc,
 will it not go away too soon, or too late (IOW leak)?

For that example, the answer is no.  Objects are allocated in
MessageContext if you don't switch context and released when the query
has been finished.  I agree that more documentation or comments for
FDW-developers should be added.

 Maybe PlanNative should get the foreign table OID, not the server OID,
 to resemble PlanRelScan more. The server OID is just a syscache lookup
 away, anyway.

You would missed a case that multiple foreign tables are used in a
query.  Main purpose of PlanNative is to support pass-through
execution of remote query.  In pass-through mode, you can use syntax
as if you have directly connected to external server, so can't use
PostgreSQL's parser.

 If you do EXPLAIN SELECT * FROM foregintable, BeginScan is not called,
 but EndScan is. That's weird, and I noticed because I got a segfault
 when EndScan tried to free things that BeginScan allocated. Maybe just
 don't call EndScan for EXPLAIN?

Fixed to not call EndScan if it was EXPLAIN execution.

Regards,
--
Shigeru Hanada


20110118-no_fdw_perm_check.patch.gz
Description: Binary data


20110118-fdw_catalog_lookup.patch.gz
Description: Binary data


20110118-fdw_handler.patch.gz
Description: Binary data


20110118-foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-17 Thread Shigeru HANADA
On Sun, 16 Jan 2011 01:55:19 +0100
Jan Urbański wulc...@wulczer.org wrote:
 what follows is a review of the FDW API patch from
 http://archives.postgresql.org/message-id/20110114212358.82c7.69899...@metrosystems.co.jp

Thanks for the comments!
For now, I answer to the first half of your comments. I'll answer to
the rest soon.

 All three patches apply cleanly and compile without warnings. Regression
 tests pass.
 
 Let me go patch by patch, starting with the first one that adds the
 HANDLER option.

Sure.

 It adds one useless hunk in src/backend/commands/foreigncmds.c (changes
 order if #includes).
 
 There's a typo in a C commnent (determin which validator to be used).
 
 Other than that, it looks OK.

Fixed in attached patch.

 The third patch just adds a GetForeignTable helper function and it looks OK.

Thanks.  This patch might be able to committed separately because it
would be small enough, and similar to existing lookup functions such
as GetForeignDataWrapper() and GetForeignServer().

 The second patch actually adds the API. First of all, I'd like say that
 it's a really cool piece of code, allowing all kinds of awesome
 funcionality to be added. I'm already excited by the things that this
 will make possible. Congratulations!
 
 To get a feel of the API I wrote a simple FDW wrapper that presents data
 from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi
 Harada. Please treat my thoughts as someone's who doesn't really know
 *why* the API looks like it does, but has some observations about what
 was missing or what felt strange when using it. I guess that's the
 position a typical FDW wrapper writer will be in.

Sure, I think your point of view is very important.

 First of all, the C comments mention that PlanRelScan should put a tuple
 descriptor in FdwPlan, but there's no such field in it. Also, comments
 for PlanRelScan talk about the 'attnos' argument, which is not in the
 function's signature. I guess the comments are just obsolete and should
 be updated. I think it's actually a good thing you don't have to put a
 TupleDesc in FdwPlan.

Removed comments about 'attnos' and tuple descriptor.

 There are two API methods, PlanNative and PlanQuery that are ifdef'd out
 using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also,
 the comments say you can implement either PlanRelScan or PlanNative, and
 only the former is available for now. If we keep these methods
 commented, they should be moved to the end of the struct, so that
 uncommenting them will not break compatibility with existing FDWs.

Agreed.  Moved ifdef'd part to the end of struct.

 The only documentation a FDW writer has is fdwapi.h, so comments there
 need to be top-notch. We might contemplate writing a documentation
 chapter explaining how FDW handlers should be written, like we do for C
 SRFs and libpq programs, but I guess for now just the headers file would
 be enough.

file_fdw and postgresql_fdw would be good samples  for wrapper
developer if we could ship them as contrib modules.

 FdwExecutionState is just a struct around a void pointer, can we imagine
 adding more fields there? If not, maybe we could just remove the
 structure and pass void pointers around? OTOH that gives us some
 compiler checking and possibility of extending the struct, so I guess we
 could also just leave it like that.

ISTM that using a struct as a interface is better than void*, as you
mentioned.

 The Iterate method gets passed a TupleTableSlot. Do we really need such
 a low-level structure? It makes returning the result easy, because you
 just form your tuple and call ExecStoreTuple, but perhaps we could
 abstract that away a bit and add a helper method that will take a tuple
 and call ExecStoreTuple for us, passing InvalidBuffer and false as the
 remaining arguments. Or maybe make Iterate return the tuple and call
 ExecStoreTuple internally? I don't have strong opinions, but
 TupleTableSlot feels a bit too gutty - I'm having a hard time imagining
 what fields from it would be useful for a FDW writer, and so perhaps you
 don't need to expose it.

This would be debatable issue.  Currently Iterate() is expected to
return materialized HeapTuple through TupleTableSlot.

I think an advantage to use TupleTableSlot is that FDW can set shoudFree
flag for the tuple.

 Why does BeginScan accept a ParamListInfo argument? First of all, it
 feels like a parsing thing, not a relation scan thing, so perhaps it
 should be available at some other, earlier stage. Second of all, what
 would it be useful for anyway? Neither file_fdw nor my commitfest_fdw
 does anything with it.

ParamListInfo is added to pass parameters of PREPARE/EXECUTE statement
to FDWs.

Plan for a prepared query is generated at PREPARE with placeholders,
and executed at EXECUTE with actual values.  With  ParamListInfo
parameter for BeginScan(), FDWs would be able to optimize their remote
query with actual parameter values.

Regard,
--
Shigeru Hanada



-- 
Sent via 

Re: [HACKERS] review: FDW API

2011-01-17 Thread Shigeru HANADA
On Mon, 17 Jan 2011 22:13:19 +0900
Shigeru HANADA han...@metrosystems.co.jp wrote:
 Fixed in attached patch.

Sorry, I have not attached patch to last message.
I'll post revised patch in next message soon.

Regards,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] review: FDW API

2011-01-15 Thread Jan Urbański
Hi,

what follows is a review of the FDW API patch from
http://archives.postgresql.org/message-id/20110114212358.82c7.69899...@metrosystems.co.jp

All three patches apply cleanly and compile without warnings. Regression
tests pass.

Let me go patch by patch, starting with the first one that adds the
HANDLER option.

It adds one useless hunk in src/backend/commands/foreigncmds.c (changes
order if #includes).

There's a typo in a C commnent (determin which validator to be used).

Other than that, it looks OK.

The third patch just adds a GetForeignTable helper function and it looks OK.

The second patch actually adds the API. First of all, I'd like say that
it's a really cool piece of code, allowing all kinds of awesome
funcionality to be added. I'm already excited by the things that this
will make possible. Congratulations!

To get a feel of the API I wrote a simple FDW wrapper that presents data
from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi
Harada. Please treat my thoughts as someone's who doesn't really know
*why* the API looks like it does, but has some observations about what
was missing or what felt strange when using it. I guess that's the
position a typical FDW wrapper writer will be in.

First of all, the C comments mention that PlanRelScan should put a tuple
descriptor in FdwPlan, but there's no such field in it. Also, comments
for PlanRelScan talk about the 'attnos' argument, which is not in the
function's signature. I guess the comments are just obsolete and should
be updated. I think it's actually a good thing you don't have to put a
TupleDesc in FdwPlan.

There are two API methods, PlanNative and PlanQuery that are ifdef'd out
using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also,
the comments say you can implement either PlanRelScan or PlanNative, and
only the former is available for now. If we keep these methods
commented, they should be moved to the end of the struct, so that
uncommenting them will not break compatibility with existing FDWs.

The only documentation a FDW writer has is fdwapi.h, so comments there
need to be top-notch. We might contemplate writing a documentation
chapter explaining how FDW handlers should be written, like we do for C
SRFs and libpq programs, but I guess for now just the headers file would
be enough.

FdwExecutionState is just a struct around a void pointer, can we imagine
adding more fields there? If not, maybe we could just remove the
structure and pass void pointers around? OTOH that gives us some
compiler checking and possibility of extending the struct, so I guess we
could also just leave it like that.

The Iterate method gets passed a TupleTableSlot. Do we really need such
a low-level structure? It makes returning the result easy, because you
just form your tuple and call ExecStoreTuple, but perhaps we could
abstract that away a bit and add a helper method that will take a tuple
and call ExecStoreTuple for us, passing InvalidBuffer and false as the
remaining arguments. Or maybe make Iterate return the tuple and call
ExecStoreTuple internally? I don't have strong opinions, but
TupleTableSlot feels a bit too gutty - I'm having a hard time imagining
what fields from it would be useful for a FDW writer, and so perhaps you
don't need to expose it.

Why does BeginScan accept a ParamListInfo argument? First of all, it
feels like a parsing thing, not a relation scan thing, so perhaps it
should be available at some other, earlier stage. Second of all, what
would it be useful for anyway? Neither file_fdw nor my commitfest_fdw
does anything with it.

We could use comments about how to return tuples from Iterate and how to
finish returning them. I had to look at the example to figure out that
you need ExecClearTuple(slot) in your last Iterate. If Iterate's
interface were to change, we could just return NULL instead of a tuple
to say that we're done.

We could be a bit more explicit about how to allocate objects, for
instance if I'm allocating a FdwPlan in my PlanRelScan with a palloc,
will it not go away too soon, or too late (IOW leak)?

I ran into a problem when doing:

select i from generate_series(1, 10) as g(i), pgcommitfest;

when I was trying to check for leaks. It returned four rows
(pgcommitfest is the foreign table that returns four rows). Explain
analyze shows a nested loop with a foreign scan as inner loop. Maybe
it's because I didn't implement ReScan, but the API says I don't have to.

If you don't implement Iterate you get a elog(ERROR). But if you don't
implement one of the other required methods, you segfault. Feels
inconsistent.

PlanRelScan looks like something that could use all kinds of information
to come up with a good plan. Maybe we could change its input argument to
a single struct that would contain all the current arguments, so it'll
be easier to extend when people will start writing FDWs and will find
out that they'd like more information available. Doing that would mean
that adding a field in a