Re: [HACKERS] review: FDW API

2011-02-22 Thread Tom Lane
I wrote:
> I did a bit of poking around here, and came to the following
> conclusions:

> 1. We don't want to add another RTEKind.  RTE_RELATION basically has the
> semantics of "anything with a pg_class OID", so it ought to include
> foreign tables.  Therefore the fix ought to be to add relkind to
> RangeTblEntry.  (BTW, so far as I can tell, RTE_SPECIAL is obsolete:
> there are assorted switch cases that handle it, but no place that can
> generate the value.  I'm inclined to delete it while we are messing
> with this.)

> 2. In the current code layout, to make sense of relkind you need to
> #include catalog/pg_class.h where the values for relkind are #defined.
> I dislike the idea of that being true for a field of such a widely-known
> struct as RangeTblEntry.  Accordingly, I suggest that we move those
> values into parsenodes.h.  (Perhaps we could convert them to an enum,
> too, though still keeping the same ASCII values.)

> 3. We can have the rewriter update an RTE's stored value of relkind
> during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry
> anyway, so copying over the relkind is essentially free.  While it's not
> instantly obvious that that is "soon enough", I think that it is, since
> up to the point of acquiring a lock there we can't assume that the rel
> isn't being changed or dropped undeneath us --- that is, any earlier
> test on an RTE's relkind might be testing just-obsoleted state anyway.

> 4. I had hoped that we might be able to get rid of some pre-existing
> syscache lookups, but at least so far as the parse/plan/execute chain
> is concerned, there don't seem to be any other places that need to
> fetch a relkind based on just an RTE entry.

> So point #4 is a bit discouraging, but other that, this seems like
> a fairly straightforward exercise.  I'm inclined to go ahead and do it,
> unless there are objections.

Applied, except I ended up not moving the RELKIND #defines as suggested
in point #2.  Those #defines are used by pg_dump, and having pg_dump.c
#include parsenodes.h seemed like a bad idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-20 Thread Tom Lane
I wrote:
> Robert Haas  writes:
>> On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane  wrote:
>>> The main downside of that is that relation relkinds would have
>>> to become fixed (because there would be no practical way of updating
>>> RTEs in stored rules), which means the "convert relation to view" hack
>>> would have to go away.

>> That actually sounds like a possible problem, because it's possible to
>> create views with circular dependencies using CORV, and they won't
>> dump-and-reload properly without that hack.

> Urgh.  That's problematic, because even if we changed pg_dump (which
> would not be that hard I think), we'd still have to cope with dump files
> emitted by existing versions of pg_dump.

> [ thinks a bit ... ]  But we can probably hack our way around that:
> teach the rule rewriter to update relkind in any RTE it brings in from a
> stored rule.  We already do something similar in some other cases where
> a stored parsetree node contains information that could become obsolete.

I did a bit of poking around here, and came to the following
conclusions:

1. We don't want to add another RTEKind.  RTE_RELATION basically has the
semantics of "anything with a pg_class OID", so it ought to include
foreign tables.  Therefore the fix ought to be to add relkind to
RangeTblEntry.  (BTW, so far as I can tell, RTE_SPECIAL is obsolete:
there are assorted switch cases that handle it, but no place that can
generate the value.  I'm inclined to delete it while we are messing
with this.)

2. In the current code layout, to make sense of relkind you need to
#include catalog/pg_class.h where the values for relkind are #defined.
I dislike the idea of that being true for a field of such a widely-known
struct as RangeTblEntry.  Accordingly, I suggest that we move those
values into parsenodes.h.  (Perhaps we could convert them to an enum,
too, though still keeping the same ASCII values.)

3. We can have the rewriter update an RTE's stored value of relkind
during AcquireRewriteLocks: it opens the rel for each RTE_RELATION entry
anyway, so copying over the relkind is essentially free.  While it's not
instantly obvious that that is "soon enough", I think that it is, since
up to the point of acquiring a lock there we can't assume that the rel
isn't being changed or dropped undeneath us --- that is, any earlier
test on an RTE's relkind might be testing just-obsoleted state anyway.

4. I had hoped that we might be able to get rid of some pre-existing
syscache lookups, but at least so far as the parse/plan/execute chain
is concerned, there don't seem to be any other places that need to
fetch a relkind based on just an RTE entry.

So point #4 is a bit discouraging, but other that, this seems like
a fairly straightforward exercise.  I'm inclined to go ahead and do it,
unless there are objections.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-19 Thread Tom Lane
Heikki Linnakangas  writes:
> On 11.02.2011 22:50, Heikki Linnakangas wrote:
>> I spent some more time reviewing this, and working on the PostgreSQL FDW
>> in tandem. Here's an updated API patch, with a bunch of cosmetic
>> changes, and a bug fix for FOR SHARE/UPDATE.

> Another version, rebased against master branch and with a bunch of small 
> cosmetic fixes.

I've applied this after a moderate amount of editorialization.

The question of avoiding extra relkind lookups is still open.  We talked
about adding relkind to RangeTblEntry, but I wonder whether adding a new
RTEKind would be a better idea.  Haven't researched it yet.

I have a hacked-up version of contrib/file_fdw that I've been using to
test it with.  That needs some more cleanup before committing, but I
think it should not take too long.  The one thing that is kind of
annoying is that the regression tests need generated files (to insert
absolute paths) and it seems like the PGXS infrastructure doesn't know
how to clean up the generated files afterwards.  Anybody have any
thoughts about fixing that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
I wrote:
> ... My feeling is it'd be best to pass down
> all the information the executor node has got --- probably we should
> just pass the ForeignScanState node itself, and leave a void * in that
> for FDW-private data, and be done with it.  Otherwise we're going to be
> adding missed stuff back to the API every time somebody notices that
> their FDW can't do X because they don't have access to the necessary
> information.

Attached is a rewritten version of fdwhandler.sgml that specifies what I
think is a more future-proof API for the callback functions.  Barring
objections, I'll push ahead with editing the code to match.

regards, tom lane




 
   Writing A Foreign Data Wrapper

   
foreign data wrapper
handler for
   

   
All operations on a foreign table are handled through its foreign data
wrapper, which consists of a set of functions that the planner and
executor call. The foreign data wrapper is responsible for fetching
data from the remote data source and returning it to the
PostgreSQL executor. This chapter outlines how
to write a new foreign data wrapper.
   

   
The FDW author needs to implement a handler function, and optionally
a validator function. Both functions must be written in a compiled
language such as C, using the version-1 interface.
For details on C language calling conventions and dynamic loading,
see .
   

   
The handler function simply returns a struct of function pointers to
callback functions that will be called by the planner and executor.
Most of the effort in writing an FDW is in implementing these callback
functions.
The handler function must be registered with
PostgreSQL as taking no arguments and returning
the special pseudo-type fdw_handler.
The callback functions are plain C functions and are not visible or
callable at the SQL level.
   

   
The validator function is responsible for validating options given in the
CREATE FOREIGN DATA WRAPPER, CREATE
SERVER and CREATE FOREIGN TABLE commands.
The validator function must be registered as taking two arguments, a text
array containing the options to be validated, and an OID representing the
type of object the options are associated with (in the form of the OID
of the system catalog the object would be stored in).  If no validator
function is supplied, the options are not checked at object creation time.
   

   
The foreign data wrappers included in the standard distribution are good
references when trying to write your own.  Look into the
contrib/file_fdw subdirectory of the source tree.
The  reference page also has
some useful details.
   

   

 The SQL standard specifies an interface for writing foreign data wrappers.
 However, PostgreSQL does not implement that API, because the effort to
 accommodate it into PostgreSQL would be large, and the standard API hasn't
 gained wide adoption anyway.

   

   
Foreign Data Wrapper Callback Routines


 The FDW handler function returns a palloc'd FdwRoutine
 struct containing pointers to the following callback functions:




FdwPlan *
PlanForeignScan (Oid foreigntableid,
 PlannerInfo *root,
 RelOptInfo *baserel);


 Plan a scan on a foreign table. This is called when a query is planned.
 foreigntableid is the pg_class OID of the
 foreign table.  root is the planner's global information
 about the query, and baserel is the planner's information
 about this table.
 The function must return a palloc'd struct that contains cost estimates,
 a string to show for this scan in EXPLAIN, and any
 FDW-private information that is needed to execute the foreign scan at a
 later time.  (Note that the private information must be represented in
 a form that copyObject knows how to copy.)



 The information in root and baserel can be used
 to reduce the amount of information that has to be fetched from the
 foreign table (and therefore reduce the cost estimate).
 baserel->baserestrictinfo is particularly interesting, as
 it contains restriction quals (WHERE clauses) that can be
 used to filter the rows to be fetched.  (The FDW is not required to
 enforce these quals, as the finished plan will recheck them anyway.)
 baserel->reltargetlist can be used to determine which
 columns need to be fetched.




void
BeginForeignScan (ForeignScanState *node,
  int eflags);


 Begin executing a foreign scan. This is called during executor startup.
 It should perform any initialization needed before the scan can start.
 The ForeignScanState node has already been created, but
 its fdw_private field is still NULL.  Information about
 the table to scan is accessible through the
 ForeignScanState node (in particular, from the underlying

Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> On 18.02.2011 22:16, Tom Lane wrote:
>> Question after first look: what is the motivation for passing
>> estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
>> reason for that function to need that, it isn't receiving any info that
>> would be sufficient to let it know what's in there.

> The idea is that when the query is planned, the FDW can choose to push 
> down a qual that contains a parameter marker, like "WHERE remotecol = 
> $1". At execution time, it needs the value of the parameter to send it 
> to the remote server. The PostgreSQL FDW does that, although I didn't 
> test it so it might well be broken.

s/might well be/is/ --- there's no guarantee that parameters are valid
at executor setup time.  The place that needs to be grabbing the
parameter value for that purpose is BeginScan.

>> What seems more
>> likely to be useful is to pass in the EState pointer, as for example
>> being able to look at es_query_cxt seems like a good idea.

> By "look at", you mean allocate stuff in it?

Right.  I suppose you're going to comment that CurrentMemoryContext is
probably the same thing, but in general it's not going to pay to make
this API run with blinders on.  My feeling is it'd be best to pass down
all the information the executor node has got --- probably we should
just pass the ForeignScanState node itself, and leave a void * in that
for FDW-private data, and be done with it.  Otherwise we're going to be
adding missed stuff back to the API every time somebody notices that
their FDW can't do X because they don't have access to the necessary
information.  That definitional instability will trump any ABI stability
that might be gained from not relying on executor node types.  (And it's
not like changing ScanState in a released version is an entirely safe
thing to do even today --- there are lots of loadable modules that know
about that struct.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Heikki Linnakangas

On 18.02.2011 22:16, Tom Lane wrote:

Heikki Linnakangas  writes:

Another version, rebased against master branch and with a bunch of small
cosmetic fixes.



I guess this is as good as this is going to get for 9.1.


This is *badly* in need of another cleanup pass; it's full of typos,
contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
documentation is really inadequate.  If you're out of energy to go
over it, I guess I should step up.


If you have the energy, by all means, thanks.


Question after first look: what is the motivation for passing
estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
reason for that function to need that, it isn't receiving any info that
would be sufficient to let it know what's in there.


The idea is that when the query is planned, the FDW can choose to push 
down a qual that contains a parameter marker, like "WHERE remotecol = 
$1". At execution time, it needs the value of the parameter to send it 
to the remote server. The PostgreSQL FDW does that, although I didn't 
test it so it might well be broken.



 What seems more
likely to be useful is to pass in the EState pointer, as for example
being able to look at es_query_cxt seems like a good idea.


By "look at", you mean allocate stuff in it?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> Another version, rebased against master branch and with a bunch of small 
> cosmetic fixes.

> I guess this is as good as this is going to get for 9.1.

This is *badly* in need of another cleanup pass; it's full of typos,
contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
documentation is really inadequate.  If you're out of energy to go
over it, I guess I should step up.

Question after first look: what is the motivation for passing
estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
reason for that function to need that, it isn't receiving any info that
would be sufficient to let it know what's in there.  What seems more
likely to be useful is to pass in the EState pointer, as for example
being able to look at es_query_cxt seems like a good idea.

BTW, I see no particularly good reason to let the FDW omit ReScan.
If it wants to implement that as end-and-begin, it can do so internally.
It would be a lot clearer to just insist that all the function pointers
be valid, as indeed some (not all) of the comments say already.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane  wrote:
>> The main downside of that is that relation relkinds would have
>> to become fixed (because there would be no practical way of updating
>> RTEs in stored rules), which means the "convert relation to view" hack
>> would have to go away.  Offhand I think no one cares about that anymore,
>> but ...

> That actually sounds like a possible problem, because it's possible to
> create views with circular dependencies using CORV, and they won't
> dump-and-reload properly without that hack.  It's not a particularly
> useful thing to do, of course, and I think we could reengineer pg_dump
> to not need the hack even if someone does do it, but that sounds like
> more work than we want to tackle right now.

Urgh.  That's problematic, because even if we changed pg_dump (which
would not be that hard I think), we'd still have to cope with dump files
emitted by existing versions of pg_dump.  The time constant before that
stops being an issue is measured in years.  I'm not at all sure whether
the circular dependency case is infrequent enough that we could get away
with saying "tough luck" to people who hit the case.

[ thinks a bit ... ]  But we can probably hack our way around that:
teach the rule rewriter to update relkind in any RTE it brings in from a
stored rule.  We already do something similar in some other cases where
a stored parsetree node contains information that could become obsolete.

But that conclusion just makes it even clearer that fixing this
performance problem, if it even is real, should be a separate patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane  wrote:
> Heikki Linnakangas  writes:
>> On 15.02.2011 23:00, Tom Lane wrote:
>>> I think moving the error check downstream would be a good thing.
>
>> Ok, I tried moving the error checks to preprocess_rowmarks().
>> Unfortunately RelOptInfos haven't been built at that stage yet, so you
>> still have to do the catalog lookup to get the relkind. That defeats the
>> purpose.
>
> Mph.  It seems like the right fix here is to add relkind to
> RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
> example.

Heikki and I came to the same conclusion yesterday while chatting
about this on IM.

> The main downside of that is that relation relkinds would have
> to become fixed (because there would be no practical way of updating
> RTEs in stored rules), which means the "convert relation to view" hack
> would have to go away.  Offhand I think no one cares about that anymore,
> but ...

That actually sounds like a possible problem, because it's possible to
create views with circular dependencies using CORV, and they won't
dump-and-reload properly without that hack.  It's not a particularly
useful thing to do, of course, and I think we could reengineer pg_dump
to not need the hack even if someone does do it, but that sounds like
more work than we want to tackle right now.

> In any case, this is looking like a performance optimization that should
> be dealt with in a separate patch.  I'd suggest leaving the code in the
> form with the extra relkind lookups for the initial commit.  It's
> possible that no one would notice the extra lookups anyway --- have you
> benchmarked it?

This is a good point... although I think this results in at least one
extra syscache lookup per table per SELECT, even when no foreign
tables are involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> On 15.02.2011 23:00, Tom Lane wrote:
>> I think moving the error check downstream would be a good thing.

> Ok, I tried moving the error checks to preprocess_rowmarks(). 
> Unfortunately RelOptInfos haven't been built at that stage yet, so you 
> still have to do the catalog lookup to get the relkind. That defeats the 
> purpose.

Mph.  It seems like the right fix here is to add relkind to
RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
example.  The main downside of that is that relation relkinds would have
to become fixed (because there would be no practical way of updating
RTEs in stored rules), which means the "convert relation to view" hack
would have to go away.  Offhand I think no one cares about that anymore,
but ...

In any case, this is looking like a performance optimization that should
be dealt with in a separate patch.  I'd suggest leaving the code in the
form with the extra relkind lookups for the initial commit.  It's
possible that no one would notice the extra lookups anyway --- have you
benchmarked it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-16 Thread Heikki Linnakangas

On 15.02.2011 23:00, Tom Lane wrote:

Heikki Linnakangas  writes:

On 15.02.2011 21:13, Tom Lane wrote:

Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.



Possibly. We throw the existing errors, for example if you try to do
"FOR UPDATE OF foo" where foo is a set-returning function, in
transformLockingClause(), so it seemed like the logical place to check
for foreign tables too.



Hmm, one approach would be to go ahead and create the RowMarkClauses for
all relations in the parse analysis phase, foreign or not, and throw the
error later, in preprocess_rowmarks().


I think moving the error check downstream would be a good thing.


Ok, I tried moving the error checks to preprocess_rowmarks(). 
Unfortunately RelOptInfos haven't been built at that stage yet, so you 
still have to do the catalog lookup to get the relkind. That defeats the 
purpose.


We could delay the error checking further, but preprocess_rowmarks() 
would need to distinguish foreign tables anyway, so that it can mark 
them with ROW_MARK_COPY instead of ROW_MARK_REFERENCE.



IIRC, at the moment we're basically duplicating the tests between parse
analysis and the planner, but it's not clear what the value of that is.


There's duplicate logic in parse analysis and rewriter, to be precise. 
And then there's this one check in make_outerjoininfo:


>/*
> 	 * Presently the executor cannot support FOR UPDATE/SHARE marking of 
rels
> 	 * appearing on the nullable side of an outer join. (It's somewhat 
unclear
> 	 * what that would mean, anyway: what should we mark when a result 
row is

> * generated from no element of the nullable relation?)  So, complain if
> * any nullable rel is FOR UPDATE/SHARE.
> *
> * You might be wondering why this test isn't made far upstream in the
> * parser.  It's because the parser hasn't got enough info --- consider
> * FOR UPDATE applied to a view.  Only after rewriting and flattening do
> * we know whether the view contains an outer join.
> *
> 	 * We use the original RowMarkClause list here; the PlanRowMark list 
would

> * list everything.
> */
>foreach(l, root->parse->rowMarks)
>{
>RowMarkClause *rc = (RowMarkClause *) lfirst(l);
>
>if (bms_is_member(rc->rti, right_rels) ||
>(jointype == JOIN_FULL && bms_is_member(rc->rti, 
left_rels)))
>ereport(ERROR,
>(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> 	 errmsg("SELECT FOR UPDATE/SHARE cannot be applied to the 
nullable side of an outer join")));

>}

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Tom Lane
Heikki Linnakangas  writes:
> On 15.02.2011 21:13, Tom Lane wrote:
>> Hmm.  I don't have a problem with adding relkind to the planner's
>> RelOptInfo, but it seems to me that if parse analysis needs to know
>> this, you have put functionality into parse analysis that does not
>> belong there.

> Possibly. We throw the existing errors, for example if you try to do 
> "FOR UPDATE OF foo" where foo is a set-returning function, in 
> transformLockingClause(), so it seemed like the logical place to check 
> for foreign tables too.

> Hmm, one approach would be to go ahead and create the RowMarkClauses for 
> all relations in the parse analysis phase, foreign or not, and throw the 
> error later, in preprocess_rowmarks().

I think moving the error check downstream would be a good thing.
Consider for example the case of applying FOR UPDATE to a view.  You
can't know what that entails until after the rewriter expands the view.
IIRC, at the moment we're basically duplicating the tests between parse
analysis and the planner, but it's not clear what the value of that is.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Heikki Linnakangas

On 15.02.2011 21:00, Robert Haas wrote:

On Tue, Feb 15, 2011 at 1:40 PM, Heikki Linnakangas
  wrote:

I'm actually surprised we don't need to distinguish them in more places, but
nevertheless it feels like we should have that info available more
conveniently, and without requiring a catalog lookup like get_rel_relkind()
does. At first I thought we should add a field to RelOptInfo, but that
doesn't help transformLockingClause. Adding the field to RangeTblEntry seems
quite invasive, and it doesn't feel like the right place to cache that kind
of information anyway. Thoughts on that?


Maybe it should be a new RTEKind.


That would be even more invasive.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Heikki Linnakangas

On 15.02.2011 21:13, Tom Lane wrote:

Heikki Linnakangas  writes:

As the patch stands, we have to do get_rel_relkind() in a couple of
places in parse analysis and the planner to distinguish a foreign table
from a regular one. As the patch stands, there's nothing in
RangeTblEntry (which is what we have in transformLockingClause) or
RelOptInfo (for set_plain_rel_pathlist) to directly distinguish them.


Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.


Possibly. We throw the existing errors, for example if you try to do 
"FOR UPDATE OF foo" where foo is a set-returning function, in 
transformLockingClause(), so it seemed like the logical place to check 
for foreign tables too.


Hmm, one approach would be to go ahead and create the RowMarkClauses for 
all relations in the parse analysis phase, foreign or not, and throw the 
error later, in preprocess_rowmarks(). preprocess_rowmarks() doesn't 
currently know if each RowMarkClause was created by "... FOR UPDATE" or 
"FOR UPDATE OF foo", but to be consistent with the logic in 
transformLockingClause() it would need to. For the former case, it would 
need to just ignore foreign tables but for the latter it would need to 
throw an error. I guess we could add an "explicit" flag to RowMarkClause 
for that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Tom Lane
Heikki Linnakangas  writes:
> As the patch stands, we have to do get_rel_relkind() in a couple of 
> places in parse analysis and the planner to distinguish a foreign table 
> from a regular one. As the patch stands, there's nothing in 
> RangeTblEntry (which is what we have in transformLockingClause) or 
> RelOptInfo (for set_plain_rel_pathlist) to directly distinguish them.

Hmm.  I don't have a problem with adding relkind to the planner's
RelOptInfo, but it seems to me that if parse analysis needs to know
this, you have put functionality into parse analysis that does not
belong there.

(No, I haven't read the patch...)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-15 Thread Robert Haas
On Tue, Feb 15, 2011 at 1:40 PM, Heikki Linnakangas
 wrote:
> I'm actually surprised we don't need to distinguish them in more places, but
> nevertheless it feels like we should have that info available more
> conveniently, and without requiring a catalog lookup like get_rel_relkind()
> does. At first I thought we should add a field to RelOptInfo, but that
> doesn't help transformLockingClause. Adding the field to RangeTblEntry seems
> quite invasive, and it doesn't feel like the right place to cache that kind
> of information anyway. Thoughts on that?

Maybe it should be a new RTEKind.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-08 Thread Shigeru HANADA

On Mon, 07 Feb 2011 09:37:37 +0100
Heikki Linnakangas  wrote:

> On 07.02.2011 08:00, Shigeru HANADA wrote:
> > Sorry for late, attached are revised version of FDW API patches which
> > reflect Heikki's comments except removing catalog lookup via
> > IsForeignTable().  ISTM that the point is avoiding catalog lookup
> > during planning, but I have not found when we can set "foreign table
> > flag" without catalog lookup during RelOptInfo generation.
> 
> In get_relation_info(), you do the catalog lookup anyway and have the 
> Relation object at hand. Add a flag to RelOptInfo indicating if it's a 
> foreign table or not, and set that in get_relation_info().

Thanks a lot.

Attached is a revised version of foreign_scan patch.  This still
requires fdw_handler patch which was attached to the orginal post.

Avoid_catalog_lookup.patch is attached for review purpose.
This patch includes changes for this fix.

Regards,
--
Shigeru Hanada


avoid_catalog_lookup.patch
Description: Binary data


20110208-foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-07 Thread Heikki Linnakangas

On 07.02.2011 08:00, Shigeru HANADA wrote:

Sorry for late, attached are revised version of FDW API patches which
reflect Heikki's comments except removing catalog lookup via
IsForeignTable().  ISTM that the point is avoiding catalog lookup
during planning, but I have not found when we can set "foreign table
flag" without catalog lookup during RelOptInfo generation.


In get_relation_info(), you do the catalog lookup anyway and have the 
Relation object at hand. Add a flag to RelOptInfo indicating if it's a 
foreign table or not, and set that in get_relation_info().


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-02-06 Thread Shigeru HANADA
On Mon, 31 Jan 2011 22:00:55 +0900
Shigeru HANADA  wrote:
> I'll post FDW API patches which reflect comments first, and then I'll
> rebase postgresql_fdw against them.

Sorry for late, attached are revised version of FDW API patches which
reflect Heikki's comments except removing catalog lookup via
IsForeignTable().  ISTM that the point is avoiding catalog lookup
during planning, but I have not found when we can set "foreign table
flag" without catalog lookup during RelOptInfo generation.

Please apply attached patches in this order.

1) fdw_catalog_lookup.patch
2) fdw_handler.patch
3) foreign_scan.patch

To execute SELECT quereis for foreign tables, you need a FDW which has
valid fdwhandler function.  The file_fdw which is posted in another
thread "SQL/MED file_fdw" would help.

Changes from last patches are:

1) Now SELECT FOR UPDATE check for foreign tables are done properly in
executor phase, in ExecLockTuple().  Or such check should be done in
parser or planner?

2) Server version is checked in pg_dump (>= 90100).

3) ReScan is not required now.  If ReScan is not supplied, ForeignScan
uses EndScan + BeginSacn instead.

4) FDW-Info in EXPLAIN is shown always, except FDW set NULL to
explainInfo.

Regards,
--
Shigeru Hanada


fdw_catalog_lookup.patch.gz
Description: Binary data


fdw_handler.patch.gz
Description: Binary data


foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-31 Thread Robert Haas
On Mon, Jan 31, 2011 at 8:00 AM, Shigeru HANADA
 wrote:
>> * Is there any use case for changing the handler or validator function
>> of an existign FDW with ALTER? To me it just seems like an unnecessary
>> complication.
>
> AFAICS, the only case for that is upgrading FDW to new one without
> re-creating foreign tables.  I don't have strong opinion for this
> issue, and it seems reasonable to remove ALTER feature in first
> version.

-1.  I don't think that removing the ability to change this is going
to save a measurable amount of complexity, and it certainly will suck
if you need it and don't have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-31 Thread Shigeru HANADA
On Mon, 24 Jan 2011 15:08:11 +0200
Heikki Linnakangas  wrote:
> I've gone through the code in a bit more detail now. I did a bunch of 
> cosmetic changes along the way, patch attached. I also added a few 
> paragraphs in the docs. We need more extensive documentation, but this 
> at least marks the places where I think the docs need to go.
> 
> Comments:

Thanks for the comments!

> * How can a FDW fetch only the columns required by the scan? The file 
> FDW has to read the whole file anyhow, but it could perhaps skip calling 
> the input function for unnecessary columns. But more importantly, with 
> something like the postgresql_fdw you don't want to fetch any extra 
> columns across the wire. I gather the way to do it is to copy 
> RelOptInfo->attr_needed to private storage at plan stage, and fill the 
> not-needed attributes with NULLs in Iterate. That gets a bit awkward, 
> you need to transform attr_needed to something that can be copied for 
> starters. Or is that information somehow available at execution phase 
> otherwise?

I thought that RelOptInfo->reltargetlist, a list of Var, can be used
for that purpose.  FdwPlan can copy it with copyObject(), and pass
it to Iterate through FdwPlan->fdw_private.

Then, postgresql_fdw would be able to retrieve only necessary columns,
or just use "NULL" for unnecessary columns in the SELECT clause to
avoid mapping values to columns.  Each way would be decrease amount of
data transfer.

> * I think we need something in RelOptInfo to mark foreign tables. At the 
> moment, you need to call IsForeignTable() which does a catalog lookup. 
> Maybe a new RTEKind, or a boolean flag.

We can avoid catalog lookup with checking table type in get_relation_info()
and updating RelOptInfo->is_foreign_table if the target was a foreign
table.

> * Can/should we make ReScan optional? Could the executor just do 
> EndScan+BeginScan if there's no ReScan function?

Right, we have enough information to call BeginScan again.  Will fix.

> * Is there any point in allowing a FDW without a handler? It's totally 
> useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in 
> previous versions, and it allowed it, but it has always been totally 
> useless so I don't think we need to worry much about 
> backwards-compatibility here.

dblink (and possibly other external modules) uses FDW without a
handler.

> * Is there any use case for changing the handler or validator function 
> of an existign FDW with ALTER? To me it just seems like an unnecessary 
> complication.

AFAICS, the only case for that is upgrading FDW to new one without
re-creating foreign tables.  I don't have strong opinion for this
issue, and it seems reasonable to remove ALTER feature in first
version.

> * IMHO the "FDW-info" should always be displayed, without VERBOSE. In my 
> experience with another DBMS that had this feature, the SQL being sent 
> to the remote server was almost always the key piece of information that 
> I was looking for in the query plans.

Agreed, will fix to show FDW-info always.  Is it reasonable to show
"FDW-info" row even if a FDW set explainInfo to NULL?

> * this check in expand_inherited_rtentry seems misplaced:
> 
> > /*
> >  * SELECT FOR UPDATE/SHARE is not allowed on foreign tables 
> > because
> >  * they are read-only.
> >  */
> > if (newrelation->rd_rel->relkind == RELKIND_FOREIGN_TABLE &&
> > lockmode != AccessShareLock)
> > ereport(ERROR,
> > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >  errmsg("SELECT FOR UPDATE/SHARE is not 
> > allowed with foreign tables")));
> 
> I don't understand why we'd need to do that for inherited tables in 
> particular. And it's not working for regular non-inherited foreign tables:
> 
> postgres=# SELECT * FROM filetbl2 FOR UPDATE;
> ERROR:  could not open file "base/11933/16397": No such file or directory

It's a remnants of table inheritance support for foreign tables.  This
check should be removed from here, and another check should be added
to avoid above error.

> * Need to document how the FDW interacts with transaction 
> commit/rollback. In particular, I believe EndScan is never called if the 
> transaction is aborted. That needs to be noted explicitly, and need to 
> suggest how to clean up any external resources in that case. For 
> example, postgresql_fdw will need to close any open cursors or result sets.

I agree that resource cleanup is an important issue when writing FDW. 
FDW should use transaction-safe resources like VirtualFile, or use
ResourceOwner callback mechanism.  Is it reasonable to add new page
under "Chapter 35. Extending SQL"?

> In general, I think the design is sound. What we need is more 
> documentation. It'd also be nice to see the postgresql_fdw brought back 
> to shape so that it works against this latest version of the api patch.

I'll pos

Re: [HACKERS] review: FDW API

2011-01-30 Thread Shigeru HANADA
On Fri, 21 Jan 2011 18:28:19 +0200
Heikki Linnakangas  wrote:
> On 18.01.2011 17:26, Shigeru HANADA wrote:
> > 3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
> > option to FOREIGN DATA WRAPPER object.
> 
> Some quick comments on that:

Thanks for the comments.
I'll post revised version of patches soon.

> * The elogs in parse_func_options() should be ereports.
> 
> * pg_dump should check the version number and only try to select 
> fdwhandler column if >= 9.1. See the other functions there for example 
> of that.

Fixed.

> * dumpForeignDataWrapper() in pg_dump checks if fdwhandler field is "-". 
> I don't think we use that as magic value there, do we? Same with validator.

That magic value, "-",  is used as "no-function-was-set" in
dumpForeignDataWrapper() and dumpForeignServer(), and I followed them.
Agreed that magic value should be removed, but it would be a refactoring
issue about pg_dump.

> * Please check that the HANDLER and VALIDATOR options that pg_dump 
> creates properly quoted.

I checked quoting for HANDLER and VALIDATOR with file_fdw functions,
and it seems works fine.  The pg_dump generats:


CREATE FOREIGN DATA WRAPPER dummy_fdw VALIDATOR public."File_Fdw_Validator"
HANDLER public."FILE_FDW_HANDLER";


from these DDLs:


CREATE OR REPLACE FUNCTION "File_Fdw_Validator" (text[], oid)
RETURNS bool
AS '$libdir/file_fdw','file_fdw_validator'
LANGUAGE C STRICT;

CREATE OR REPLACE FUNCTION "FILE_FDW_HANDLER" ()
RETURNS fdw_handler
AS '$libdir/file_fdw','file_fdw_handler'
LANGUAGE C STRICT;

CREATE FOREIGN DATA WRAPPER dummy_fdw
VALIDATOR "File_Fdw_Validator" HANDLER "FILE_FDW_HANDLER";


Regard,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-30 Thread Robert Haas
On Mon, Jan 24, 2011 at 8:08 AM, Heikki Linnakangas
 wrote:
>> How much review have you done of parts (3) and (4)?  The key issue for
>> all of the FDW work in progress seems to be what the handler API is
>> going to look like, and so once we get that committed it will unblock
>> a lot of other things.
>
> I've gone through the code in a bit more detail now. I did a bunch of
> cosmetic changes along the way, patch attached. I also added a few
> paragraphs in the docs. We need more extensive documentation, but this at
> least marks the places where I think the docs need to go.
>
> Comments:

I haven't seen any responses to these comments.  Time grows short to
get this committed to PostgreSQL 9.1.  We need responses to these
comments and an updated patch ASAP.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-24 Thread Robert Haas
On Mon, Jan 24, 2011 at 8:08 AM, Heikki Linnakangas
 wrote:
> * Is there any point in allowing a FDW without a handler? It's totally
> useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in previous
> versions, and it allowed it, but it has always been totally useless so I
> don't think we need to worry much about backwards-compatibility here.

Aren't things like dblink using this in its existing form?

> * Is there any use case for changing the handler or validator function of an
> existign FDW with ALTER? To me it just seems like an unnecessary
> complication.

+1.

> * IMHO the "FDW-info" should always be displayed, without VERBOSE. In my
> experience with another DBMS that had this feature, the SQL being sent to
> the remote server was almost always the key piece of information that I was
> looking for in the query plans.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-24 Thread Heikki Linnakangas

On 21.01.2011 17:57, Robert Haas wrote:

On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
  wrote:

On 18.01.2011 17:26, Shigeru HANADA wrote:


1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.

http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp


Committed this part.


How much review have you done of parts (3) and (4)?  The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.


I've gone through the code in a bit more detail now. I did a bunch of 
cosmetic changes along the way, patch attached. I also added a few 
paragraphs in the docs. We need more extensive documentation, but this 
at least marks the places where I think the docs need to go.


Comments:

* How can a FDW fetch only the columns required by the scan? The file 
FDW has to read the whole file anyhow, but it could perhaps skip calling 
the input function for unnecessary columns. But more importantly, with 
something like the postgresql_fdw you don't want to fetch any extra 
columns across the wire. I gather the way to do it is to copy 
RelOptInfo->attr_needed to private storage at plan stage, and fill the 
not-needed attributes with NULLs in Iterate. That gets a bit awkward, 
you need to transform attr_needed to something that can be copied for 
starters. Or is that information somehow available at execution phase 
otherwise?


* I think we need something in RelOptInfo to mark foreign tables. At the 
moment, you need to call IsForeignTable() which does a catalog lookup. 
Maybe a new RTEKind, or a boolean flag.


* Can/should we make ReScan optional? Could the executor just do 
EndScan+BeginScan if there's no ReScan function?


* Is there any point in allowing a FDW without a handler? It's totally 
useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in 
previous versions, and it allowed it, but it has always been totally 
useless so I don't think we need to worry much about 
backwards-compatibility here.


* Is there any use case for changing the handler or validator function 
of an existign FDW with ALTER? To me it just seems like an unnecessary 
complication.


* IMHO the "FDW-info" should always be displayed, without VERBOSE. In my 
experience with another DBMS that had this feature, the SQL being sent 
to the remote server was almost always the key piece of information that 
I was looking for in the query plans.


* this check in expand_inherited_rtentry seems misplaced:


/*
 * SELECT FOR UPDATE/SHARE is not allowed on foreign tables 
because
 * they are read-only.
 */
if (newrelation->rd_rel->relkind == RELKIND_FOREIGN_TABLE &&
lockmode != AccessShareLock)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("SELECT FOR UPDATE/SHARE is not 
allowed with foreign tables")));


I don't understand why we'd need to do that for inherited tables in 
particular. And it's not working for regular non-inherited foreign tables:


postgres=# SELECT * FROM filetbl2 FOR UPDATE;
ERROR:  could not open file "base/11933/16397": No such file or directory

* Need to document how the FDW interacts with transaction 
commit/rollback. In particular, I believe EndScan is never called if the 
transaction is aborted. That needs to be noted explicitly, and need to 
suggest how to clean up any external resources in that case. For 
example, postgresql_fdw will need to close any open cursors or result sets.


In general, I think the design is sound. What we need is more 
documentation. It'd also be nice to see the postgresql_fdw brought back 
to shape so that it works against this latest version of the api patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index a65b4bc..06a82a4 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2986,6 +2986,41 @@ ANALYZE measurement;
   
  
 
+ 
+  Foreign Data
+   
+foreign data
+   
+   
+PostgreSQL implements parts of the SQL/MED
+specification, which allows you to access external data that resides
+outside PostgreSQL, using normal SQL queries.
+   
+
+   
+Foreign data is accessed with help from a
+foreign data wrapper. A foreign data wrapper is a
+library that can communicate with an external data source, hiding the
+details of connecting to the data source and fetching data from it. There
+ar

Re: [HACKERS] review: FDW API

2011-01-24 Thread Itagaki Takahiro
On Sat, Jan 22, 2011 at 07:20, Tom Lane  wrote:
> Heikki Linnakangas  writes:
>> * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create
>> the handler function, if it doesn't exist yet.
>
> Doing that would require the equivalent of pg_pltemplate for FDWs, no?
> I think we're a long way from wanting to do that.  Also, it seems to me
> that add-on FDWs are likely to end up getting packaged as extensions,

The proposed file_fdw.sql actually creates a default FDW on installation.
So I think the installation scripts work as a template even if we don't
have FDW template catalogs.

+ /* contrib/file_fdw/file_fdw.sql.in */
+ -- create wrapper with validator and handler
+ CREATE OR REPLACE FUNCTION file_fdw_validator (text[], oid)
+ CREATE OR REPLACE FUNCTION file_fdw_handler ()
+ CREATE FOREIGN DATA WRAPPER file_fdw
+ VALIDATOR file_fdw_validator HANDLER file_fdw_handler;

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Tom Lane
Heikki Linnakangas  writes:
> Some quick comments on that:

> * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
> the handler function, if it doesn't exist yet. That's what CREATE 
> LANGUAGE does, which is similar. Although it doesn't seem to be 
> documented for CREATE LANGUAGE either, is it deprecated?

Doing that would require the equivalent of pg_pltemplate for FDWs, no?
I think we're a long way from wanting to do that.  Also, it seems to me
that add-on FDWs are likely to end up getting packaged as extensions,
so the extension machinery will probably render the question moot pretty
soon.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
option to FOREIGN DATA WRAPPER object.


Some quick comments on that:

* I wonder if CREATE FOREIGN DATA WRAPPER should automatically create 
the handler function, if it doesn't exist yet. That's what CREATE 
LANGUAGE does, which is similar. Although it doesn't seem to be 
documented for CREATE LANGUAGE either, is it deprecated?


* The elogs in parse_func_options() should be ereports.

* pg_dump should check the version number and only try to select 
fdwhandler column if >= 9.1. See the other functions there for example 
of that.


* dumpForeignDataWrapper() in pg_dump checks if fdwhandler field is "-". 
I don't think we use that as magic value there, do we? Same with validator.


* Please check that the HANDLER and VALIDATOR options that pg_dump 
creates properly quoted.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 21.01.2011 17:57, Robert Haas wrote:

How much review have you done of parts (3) and (4)?


Not much. I'm getting there..


The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.


Yep. The API that's there now was originally suggested by me, so I 
probably won't have big complaints about it. I'll have to also look at 
the PostgreSQL and file implementations of it to see that it really fits 
the bill.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Robert Haas
On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
 wrote:
> On 18.01.2011 17:26, Shigeru HANADA wrote:
>>
>> 1) 20110118-no_fdw_perm_check.patch - This patch is not included in
>> last post.  This had been proposed on 2011-01-05 first, but maybe has
>> not been reviewd yet.  I re-propose this patch for SQL standard
>> conformance.  This patch removes permission check that requires USAGE
>> on the foreign-data wrapper at CREATE FOREIGN TABLE.
>> Please see original post for details.
>>
>> http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp
>
> Committed this part.

How much review have you done of parts (3) and (4)?  The key issue for
all of the FDW work in progress seems to be what the handler API is
going to look like, and so once we get that committed it will unblock
a lot of other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-21 Thread Heikki Linnakangas

On 18.01.2011 17:26, Shigeru HANADA wrote:

1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.
http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp


Committed this part.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-18 Thread Shigeru HANADA
On Sun, 16 Jan 2011 01:55:19 +0100
Jan Urbański  wrote:

> In general, the feature looks great and I hope it'll make it into 9.1.
> And it we'd get the possibility to write FDW handlers in other PLs than
> C, it would rock so hard...
> 
> I'm going to mark this a Waiting for Author because of the typos, the
> BeginScan/EndScan issue, and the nested loop stopping halfway issue. The
> rest are suggestions or just thoughts, and if you don't see them as
> justified, I'll mark the next patch Ready for Committer.

Thanks a lot for the comments.  I've (hopefully) fixed issues above. 
Please find attached patches.

== patch list ==

1) 20110118-no_fdw_perm_check.patch - This patch is not included in
last post.  This had been proposed on 2011-01-05 first, but maybe has
not been reviewd yet.  I re-propose this patch for SQL standard
conformance.  This patch removes permission check that requires USAGE
on the foreign-data wrapper at CREATE FOREIGN TABLE.
Please see original post for details.
http://archives.postgresql.org/message-id/20110105145206.30fd.69899...@metrosystems.co.jp

2) 20110118-fdw_catalog_lookup.patch - This patch adds GetForeignTables. 
Fixed lack of pg_foreign_table.h inclusion.

3) 20110118-fdw_handler.patch - This patch adds support for HANDLER
option to FOREIGN DATA WRAPPER object.

4) 20110118-foreign_scan.patch - This patch adds ForeignScan executor
node and FDW API hooks based on Heikki's proposal.  As Itagaki-san
suggested on 2010-12-21, FDW must generate information for EXPLAIN
VERBOSE every PlanRelScan() call.  It's better to avoid such overhead,
so new EXPLAIN hook would be needed.  I'll try to make it cleaner.



And I'll reply to the rest of your comments.

> We could use comments about how to return tuples from Iterate and how to
> finish returning them. I had to look at the example to figure out that
> you need ExecClearTuple(slot) in your last Iterate. If Iterate's
> interface were to change, we could just return NULL instead of a tuple
> to say that we're done.

I've added some comments for FDW-developer to fdwapi.h, though they
wouldn't be enough.

> We could be a bit more explicit about how to allocate objects, for
> instance if I'm allocating a FdwPlan in my PlanRelScan with a palloc,
> will it not go away too soon, or too late (IOW leak)?

For that example, the answer is no.  Objects are allocated in
MessageContext if you don't switch context and released when the query
has been finished.  I agree that more documentation or comments for
FDW-developers should be added.

> Maybe PlanNative should get the foreign table OID, not the server OID,
> to resemble PlanRelScan more. The server OID is just a syscache lookup
> away, anyway.

You would missed a case that multiple foreign tables are used in a
query.  Main purpose of PlanNative is to support pass-through
execution of remote query.  In pass-through mode, you can use syntax
as if you have directly connected to external server, so can't use
PostgreSQL's parser.

> If you do EXPLAIN SELECT * FROM foregintable, BeginScan is not called,
> but EndScan is. That's weird, and I noticed because I got a segfault
> when EndScan tried to free things that BeginScan allocated. Maybe just
> don't call EndScan for EXPLAIN?

Fixed to not call EndScan if it was EXPLAIN execution.

Regards,
--
Shigeru Hanada


20110118-no_fdw_perm_check.patch.gz
Description: Binary data


20110118-fdw_catalog_lookup.patch.gz
Description: Binary data


20110118-fdw_handler.patch.gz
Description: Binary data


20110118-foreign_scan.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-17 Thread Shigeru HANADA
On Mon, 17 Jan 2011 22:13:19 +0900
Shigeru HANADA  wrote:
> Fixed in attached patch.

Sorry, I have not attached patch to last message.
I'll post revised patch in next message soon.

Regards,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: FDW API

2011-01-17 Thread Shigeru HANADA
On Sun, 16 Jan 2011 01:55:19 +0100
Jan Urbański  wrote:
> what follows is a review of the FDW API patch from
> http://archives.postgresql.org/message-id/20110114212358.82c7.69899...@metrosystems.co.jp

Thanks for the comments!
For now, I answer to the first half of your comments. I'll answer to
the rest soon.

> All three patches apply cleanly and compile without warnings. Regression
> tests pass.
> 
> Let me go patch by patch, starting with the first one that adds the
> HANDLER option.

Sure.

> It adds one useless hunk in src/backend/commands/foreigncmds.c (changes
> order if #includes).
> 
> There's a typo in a C commnent ("determin which validator to be used").
> 
> Other than that, it looks OK.

Fixed in attached patch.

> The third patch just adds a GetForeignTable helper function and it looks OK.

Thanks.  This patch might be able to committed separately because it
would be small enough, and similar to existing lookup functions such
as GetForeignDataWrapper() and GetForeignServer().

> The second patch actually adds the API. First of all, I'd like say that
> it's a really cool piece of code, allowing all kinds of awesome
> funcionality to be added. I'm already excited by the things that this
> will make possible. Congratulations!
> 
> To get a feel of the API I wrote a simple FDW wrapper that presents data
> from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi
> Harada. Please treat my thoughts as someone's who doesn't really know
> *why* the API looks like it does, but has some observations about what
> was missing or what felt strange when using it. I guess that's the
> position a typical FDW wrapper writer will be in.

Sure, I think your point of view is very important.

> First of all, the C comments mention that PlanRelScan should put a tuple
> descriptor in FdwPlan, but there's no such field in it. Also, comments
> for PlanRelScan talk about the 'attnos' argument, which is not in the
> function's signature. I guess the comments are just obsolete and should
> be updated. I think it's actually a good thing you don't have to put a
> TupleDesc in FdwPlan.

Removed comments about 'attnos' and tuple descriptor.

> There are two API methods, PlanNative and PlanQuery that are ifdef'd out
> using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also,
> the comments say you can implement either PlanRelScan or PlanNative, and
> only the former is available for now. If we keep these methods
> commented, they should be moved to the end of the struct, so that
> uncommenting them will not break compatibility with existing FDWs.

Agreed.  Moved ifdef'd part to the end of struct.

> The only documentation a FDW writer has is fdwapi.h, so comments there
> need to be top-notch. We might contemplate writing a documentation
> chapter explaining how FDW handlers should be written, like we do for C
> SRFs and libpq programs, but I guess for now just the headers file would
> be enough.

file_fdw and postgresql_fdw would be good samples  for wrapper
developer if we could ship them as contrib modules.

> FdwExecutionState is just a struct around a void pointer, can we imagine
> adding more fields there? If not, maybe we could just remove the
> structure and pass void pointers around? OTOH that gives us some
> compiler checking and possibility of extending the struct, so I guess we
> could also just leave it like that.

ISTM that using a struct as a interface is better than void*, as you
mentioned.

> The Iterate method gets passed a TupleTableSlot. Do we really need such
> a low-level structure? It makes returning the result easy, because you
> just form your tuple and call ExecStoreTuple, but perhaps we could
> abstract that away a bit and add a helper method that will take a tuple
> and call ExecStoreTuple for us, passing InvalidBuffer and false as the
> remaining arguments. Or maybe make Iterate return the tuple and call
> ExecStoreTuple internally? I don't have strong opinions, but
> TupleTableSlot feels a bit too gutty - I'm having a hard time imagining
> what fields from it would be useful for a FDW writer, and so perhaps you
> don't need to expose it.

This would be debatable issue.  Currently Iterate() is expected to
return materialized HeapTuple through TupleTableSlot.

I think an advantage to use TupleTableSlot is that FDW can set shoudFree
flag for the tuple.

> Why does BeginScan accept a ParamListInfo argument? First of all, it
> feels like a parsing thing, not a relation scan thing, so perhaps it
> should be available at some other, earlier stage. Second of all, what
> would it be useful for anyway? Neither file_fdw nor my commitfest_fdw
> does anything with it.

ParamListInfo is added to pass parameters of PREPARE/EXECUTE statement
to FDWs.

Plan for a prepared query is generated at PREPARE with placeholders,
and executed at EXECUTE with actual values.  With  ParamListInfo
parameter for BeginScan(), FDWs would be able to optimize their remote
query with actual parameter values.

[HACKERS] review: FDW API

2011-01-15 Thread Jan Urbański
Hi,

what follows is a review of the FDW API patch from
http://archives.postgresql.org/message-id/20110114212358.82c7.69899...@metrosystems.co.jp

All three patches apply cleanly and compile without warnings. Regression
tests pass.

Let me go patch by patch, starting with the first one that adds the
HANDLER option.

It adds one useless hunk in src/backend/commands/foreigncmds.c (changes
order if #includes).

There's a typo in a C commnent ("determin which validator to be used").

Other than that, it looks OK.

The third patch just adds a GetForeignTable helper function and it looks OK.

The second patch actually adds the API. First of all, I'd like say that
it's a really cool piece of code, allowing all kinds of awesome
funcionality to be added. I'm already excited by the things that this
will make possible. Congratulations!

To get a feel of the API I wrote a simple FDW wrapper that presents data
from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi
Harada. Please treat my thoughts as someone's who doesn't really know
*why* the API looks like it does, but has some observations about what
was missing or what felt strange when using it. I guess that's the
position a typical FDW wrapper writer will be in.

First of all, the C comments mention that PlanRelScan should put a tuple
descriptor in FdwPlan, but there's no such field in it. Also, comments
for PlanRelScan talk about the 'attnos' argument, which is not in the
function's signature. I guess the comments are just obsolete and should
be updated. I think it's actually a good thing you don't have to put a
TupleDesc in FdwPlan.

There are two API methods, PlanNative and PlanQuery that are ifdef'd out
using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also,
the comments say you can implement either PlanRelScan or PlanNative, and
only the former is available for now. If we keep these methods
commented, they should be moved to the end of the struct, so that
uncommenting them will not break compatibility with existing FDWs.

The only documentation a FDW writer has is fdwapi.h, so comments there
need to be top-notch. We might contemplate writing a documentation
chapter explaining how FDW handlers should be written, like we do for C
SRFs and libpq programs, but I guess for now just the headers file would
be enough.

FdwExecutionState is just a struct around a void pointer, can we imagine
adding more fields there? If not, maybe we could just remove the
structure and pass void pointers around? OTOH that gives us some
compiler checking and possibility of extending the struct, so I guess we
could also just leave it like that.

The Iterate method gets passed a TupleTableSlot. Do we really need such
a low-level structure? It makes returning the result easy, because you
just form your tuple and call ExecStoreTuple, but perhaps we could
abstract that away a bit and add a helper method that will take a tuple
and call ExecStoreTuple for us, passing InvalidBuffer and false as the
remaining arguments. Or maybe make Iterate return the tuple and call
ExecStoreTuple internally? I don't have strong opinions, but
TupleTableSlot feels a bit too gutty - I'm having a hard time imagining
what fields from it would be useful for a FDW writer, and so perhaps you
don't need to expose it.

Why does BeginScan accept a ParamListInfo argument? First of all, it
feels like a parsing thing, not a relation scan thing, so perhaps it
should be available at some other, earlier stage. Second of all, what
would it be useful for anyway? Neither file_fdw nor my commitfest_fdw
does anything with it.

We could use comments about how to return tuples from Iterate and how to
finish returning them. I had to look at the example to figure out that
you need ExecClearTuple(slot) in your last Iterate. If Iterate's
interface were to change, we could just return NULL instead of a tuple
to say that we're done.

We could be a bit more explicit about how to allocate objects, for
instance if I'm allocating a FdwPlan in my PlanRelScan with a palloc,
will it not go away too soon, or too late (IOW leak)?

I ran into a problem when doing:

select i from generate_series(1, 10) as g(i), pgcommitfest;

when I was trying to check for leaks. It returned four rows
(pgcommitfest is the foreign table that returns four rows). Explain
analyze shows a nested loop with a foreign scan as inner loop. Maybe
it's because I didn't implement ReScan, but the API says I don't have to.

If you don't implement Iterate you get a elog(ERROR). But if you don't
implement one of the other required methods, you segfault. Feels
inconsistent.

PlanRelScan looks like something that could use all kinds of information
to come up with a good plan. Maybe we could change its input argument to
a single struct that would contain all the current arguments, so it'll
be easier to extend when people will start writing FDWs and will find
out that they'd like more information available. Doing that would mean
that adding a field in a