Re: [HACKERS] Exposing quals

2008-07-14 Thread David Fetter
On Tue, Jul 08, 2008 at 02:41:45PM -0400, Jan Wieck wrote:
 Here,

 I talked to my supervisor here in Toronto (that's where I am this week)  
 and Afilias actually sees enough value in this for me to go and spend  
 time officially on it.

Yay!

 The ideas I have so far are as follows:

 Down in the exec nodes like SeqScan or IndexScan, there are several  
 parts available that are important.

 - Scanned relation
 - Targetlist
 - Filter (for SeqScan)
 - IndexQual (for IndexScan)

 These pieces are available at least in the scans Init function and  
 actually can be converted back into some SQL statement that effectively  
 represents this one single table scan. However, parsing it back at that  
 point is nonsense, as we cannot expect everything out there to actually  
 be an SQL database.

 Also, both the qualification as well as the targetlist can contain  
 things like user defined function calls. We neither want to deny nor  
 require that this sort of construct is actually handed over to the  
 external data source, so the interface needs to be more flexible.  
 Therefore it is best to divide the functionality into several user exit  
 functions.

Right :)

 The several functions that implement a scan type inside of the
 executor  very much resemble opening a cursor for a single table
 query, fetching  rows from it, eventually (in the case of a nested
 loop for example)  close and reopen the cursor with different key
 values from the outer  tuple, close the cursor. So it makes total
 sense to actually require an  implementation of an external data
 source to provide functions to open a  cursor, fetch rows, close the
 cursor.

That's not unreasonable, given that the simplest kind of external data
source would likely be along the lines of fopen().

 There will be some connection and transaction handling around all
 this  that I have in mind but think it would distract from the
 problem to be  solved right here, so more about that another time.

Maybe something like a way of setting, for each relation, what kind
(if any) of transactions are available?

 The C implementation for open cursor would be called with a scan
 handle,  containing the connection, the relname, the targetlist and
 the  qualification subtrees. These are modified from the real ones
 in the  scan node so that all Var's have varno=1 and that all OUTER
 Var's have  been replaced with a Const that reflects the current
 outer tuples  values. From here there are several support functions
 available to dumb  down each of those to whatever the external
 data source may support. In  case of the targetlist, this could mean
 to filter out a unique list of  Var nodes only, removing all
 expressions from it. In case of the  qualification, this could mean
 remove everything that isn't a standard  operator (=, , ...), or
 remove everything that isn't Postgres builtin.  Finally, there is a
 support function that will build a SQL statement  according to
 what's left inside that scan handle.

Interesting.

 The scan handle would track which modifications have been done to the  
 various pieces so that the outer support framework knows if it gets back  
 the originally requested targetlist, or if it has to run the projection  
 on the returned unique list of Var's. And if it has to recheck the  
 returned tuples for qualification, because some of the qual's had been  
 removed.

 In order to allow the user exits to be written in PL's, I can think of  
 makiing a complex data type containing the scan handle. The subtrees  
 could be accessed by the PL via support functions that return them in  
 nodeToString() or other formats.

 I'll try to write up a more complete proposal until end of next week.

Yay!

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: New relkind (was Re: [HACKERS] Exposing quals)

2008-07-08 Thread Hans-Juergen Schoenig

Simon Riggs wrote:

On Mon, 2008-07-07 at 16:26 -0700, David Fetter wrote:
  

On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:


For the record, I agree with Jan's suggestion of passing a pointer
to the parse tree, and offline gave David a suggestion verbally as
to how this could be handled for PL/PerlU.

I don't think we should be tied too closely to a string
representation, although possibly the first and simplest callback
function would simply stringify the quals.
  

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree.  This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far?  Do you have something in the way
of a rough patch, docs, etc. for this?



It sounds like we can make it happen as text for other DBMS and as plan
nodes for PostgreSQL, which is the best solution all round.

Personally not too worried which way we do this - as long as we do it
for 8.4 :-) It's obviously happening in the background, so I'll leave it
alone.

  


I think the concept involving the plan tree is gold. Hannu Krosing 
mentioned some idea like that recently as well.
If the function had the chance to tell the planner how it is gonna 
operate (e.g produces sorted output, etc.) it would be perfect.
The golden thing here would be if we could teach a function whether it 
is  STREAMABLE | NOT STREAMABLE. streamable would make sure that we 
don't have to materialize the output of a set returning function. this 
would allow google-like analysis in postgresql easily by allowing to 
fetch data from any amount of data from any data source.


   best regards,

  hans


--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-08 Thread Heikki Linnakangas

Simon Riggs wrote:

The notes say Heikki doesn't think this is a long term solution, but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?


How did you come to the conclusion that this is the only way that will 
work with non-PostgreSQL databases? I don't see any limitations like 
that in any of the proposed approaches.


I guess I should clarify my position on this:

We should start moving towards a full SQL:MED solution that will 
ultimately support pushing down joins, aggregates etc. to the remote 
database. Including support for transaction control, using 2PC, and cost 
estimation and intelligent planning.


This should be done in an extensible way, so that people can write their 
own plugins to connect to different RDBMSs, as well as simple data 
sources like flat files. The plugin needs to be able to control which 
parts of a plan tree can be pushed down to the remote source, estimate 
the cost of remote execution, and map remote data types to local ones. 
And it then needs to be able to construct and execute the remote parts 
of a plan.


We're obviously not going to get all that overnight, but whatever we 
implement now should be the first step towards that, rather than 
something that we need to deprecate and replace in the future. 
Unfortunately I don't see a way to extend the proposed exposing quals 
to functions patch to do more than just that.


The list of functionality a full-blown plugin will need is quite long. I 
don't think there's any hope of supporting all that without reaching 
into some PostgreSQL internal data structures, particularly the planner 
structures like RelOptInfo, Path and Plan. The plugins will be more 
tightly integrated into the system than say user defined data types. 
They will need to be written in C, and they will be somewhat version 
dependent. Simpler plugins, like one to read CSV files, with no pushing 
down and no update support, will need less access to internals, and 
thus will be less version dependent, so pgfoundry projects like that 
will be feasible.


Note that the dependency on internal data structures doesn't go away by 
saying that they're passed as text; the text representation of our data 
structures is version dependent as well.


So what would the plugin API look like? To hook into the planner, I'm 
envisioning the plugin would define these functions:


  /*
   * Generate a remote plan for executing a whole subquery remotely. For
   * example, if the query is an aggregate, we might be able to execute
   * the whole aggregate in the remote database. This will be called
   * from grouping_planner(), like optimize_minmax_aggregates().
   * Returns NULL if remote execution is not possible. (a dummy
   * implementation can always return NULL.
   */
  Plan *generate_remote_path(PlannerInfo *, List *tlist);

  /*
   * Generate a path for executing one relation in remote
   * database. The relation can be a base (non-join) remote relation,
   * or a join involving a remote relation. Can return NULL for join
   * relations if the join can't be executed remotely.
   */
  Path *generate_remote_path(PlannerInfo *, RelOptInfo *)

  /*
   * Create a Plan node from a Path. Called from create_plan, when
   * the planner chooses to use a remote path. A typical implementation
   * would create the SQL string to be executed in the remote database,
   * and return a RemotePlan node with that SQL string in it.
   */
  Plan *create_remote_plan(PlannerInfo *, RemotePath *)

On the execution side, the plugin needs to be able to execute a 
previously generated RemotePlan. There would be a new executor node 
type, a RemoteScan, that would be similar to a seq scan or index scan, 
but delegates the actual execution to the plugin. The execution part of 
the plugin API would reflect the API of executor nodes, something like:


  void *scan_open(RemotePlan *)
  HeapTuple *scan_getnext(void *scanstate)
  void scan_close(void *scanstate)


The presumption here is that you would define remote tables with the 
appropriate SQL:MED statements beforehand (CREATE FOREIGN TABLE). 
However, it is flexible enough that you could implement the exposing 
quals to functions functionality with this as well: 
generate_remote_path() would need to recognize the function scans that 
it can handle, and return a RemotePath struct with all the same 
information as create_functionscan_path does (the cost estimates could 
be adjusted for the pushed down quals at this point as well). 
create_remote_plan would return a FunctionScan node, but with the extra 
qualifiers passed into the function as arguments. In case of dblink, it 
could just add extra WHERE clauses to the query that's being passed as 
argument. I'm not proposing that we do the stuff described in this 
paragraph, just using it as an example of the flexibility.


BTW, I think the exposing quals to functions functionality could be 

Re: [HACKERS] Exposing quals

2008-07-08 Thread Simon Riggs

On Tue, 2008-07-08 at 17:51 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  The notes say Heikki doesn't think this is a long term solution, but
  in the following discussion it was the *only* way of doing this that
  will work with non-PostgreSQL databases. So it seems like the way we
  would want to go, yes?
 
 How did you come to the conclusion that this is the only way that will 
 work with non-PostgreSQL databases? 

SQL, in text form, is the interface to other databases. You can't pass
half a plan tree to Oracle, especially not a PostgreSQL plan tree. It
has to be text if you wish to send a query to another RDBMS, or another
version of PostgreSQL.

 We should start moving towards a full SQL:MED solution that will 
 ultimately support pushing down joins, aggregates etc. to the remote 
 database. Including support for transaction control, using 2PC, and
 cost estimation and intelligent planning.
 
 This should be done in an extensible way, so that people can write
 their own plugins to connect to different RDBMSs, as well as simple
 data sources like flat files. The plugin needs to be able to control
 which parts of a plan tree can be pushed down to the remote source,
 estimate the cost of remote execution, and map remote data types to
 local ones. And it then needs to be able to construct and execute the
 remote parts of a plan.

So if I understand you, you want to pass the partial plan tree and then
have a plugin construct the SQL text. Sounds like a great approach.

Maybe you thought I meant internal interfaces should be in text? No,
that would be bizarre. I meant we should not attempt to pass partial
plan trees outside of the database, since that would limit the feature
to only working with the same version of PostgreSQL database. I support
your wish to have something that can work with other types of database.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-08 Thread Heikki Linnakangas

Simon Riggs wrote:

SQL, in text form, is the interface to other databases. You can't pass
half a plan tree to Oracle, especially not a PostgreSQL plan tree. It
has to be text if you wish to send a query to another RDBMS, or another
version of PostgreSQL.


Oh, I see. Agreed.

Though note that there's big differences in SQL dialects, so a 
one-size-fits-all approach to generating SQL to be executed in the 
remote database won't work. (not that I think anyone has suggested that)



So if I understand you, you want to pass the partial plan tree and then
have a plugin construct the SQL text.


Exactly.


Maybe you thought I meant internal interfaces should be in text?


Yeah, that's exactly what I thought you meant.


No,
that would be bizarre. I meant we should not attempt to pass partial
plan trees outside of the database, since that would limit the feature
to only working with the same version of PostgreSQL database. 


Agreed. I'm glad we're on the same page now.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-08 Thread David Fetter
On Tue, Jul 08, 2008 at 06:22:23PM +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 SQL, in text form, is the interface to other databases. You can't
 pass half a plan tree to Oracle, especially not a PostgreSQL plan
 tree. It has to be text if you wish to send a query to another
 RDBMS, or another version of PostgreSQL.

 Oh, I see. Agreed.

 Though note that there's big differences in SQL dialects, so a
 one-size-fits-all approach to generating SQL to be executed in the
 remote database won't work. (not that I think anyone has suggested
 that)

 So if I understand you, you want to pass the partial plan tree and
 then have a plugin construct the SQL text.

 Exactly.

 Maybe you thought I meant internal interfaces should be in text?

 Yeah, that's exactly what I thought you meant.

 No, that would be bizarre. I meant we should not attempt to pass
 partial plan trees outside of the database, since that would limit
 the feature to only working with the same version of PostgreSQL
 database. 

 Agreed. I'm glad we're on the same page now.

Everybody's weighed in on this thread except the guy who's actually
doing the work.

Jan?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-08 Thread Jan Wieck

On 7/8/2008 11:38 AM, David Fetter wrote:

On Tue, Jul 08, 2008 at 06:22:23PM +0300, Heikki Linnakangas wrote:

Simon Riggs wrote:

SQL, in text form, is the interface to other databases. You can't
pass half a plan tree to Oracle, especially not a PostgreSQL plan
tree. It has to be text if you wish to send a query to another
RDBMS, or another version of PostgreSQL.


Oh, I see. Agreed.

Though note that there's big differences in SQL dialects, so a
one-size-fits-all approach to generating SQL to be executed in the
remote database won't work. (not that I think anyone has suggested
that)


So if I understand you, you want to pass the partial plan tree and
then have a plugin construct the SQL text.


Exactly.


Maybe you thought I meant internal interfaces should be in text?


Yeah, that's exactly what I thought you meant.


No, that would be bizarre. I meant we should not attempt to pass
partial plan trees outside of the database, since that would limit
the feature to only working with the same version of PostgreSQL
database. 


Agreed. I'm glad we're on the same page now.


Everybody's weighed in on this thread except the guy who's actually
doing the work.

Jan?


Here,

I talked to my supervisor here in Toronto (that's where I am this week) 
and Afilias actually sees enough value in this for me to go and spend 
time officially on it.


The ideas I have so far are as follows:

Down in the exec nodes like SeqScan or IndexScan, there are several 
parts available that are important.


- Scanned relation
- Targetlist
- Filter (for SeqScan)
- IndexQual (for IndexScan)

These pieces are available at least in the scans Init function and 
actually can be converted back into some SQL statement that effectively 
represents this one single table scan. However, parsing it back at that 
point is nonsense, as we cannot expect everything out there to actually 
be an SQL database.


Also, both the qualification as well as the targetlist can contain 
things like user defined function calls. We neither want to deny nor 
require that this sort of construct is actually handed over to the 
external data source, so the interface needs to be more flexible. 
Therefore it is best to divide the functionality into several user exit 
functions.


The several functions that implement a scan type inside of the executor 
very much resemble opening a cursor for a single table query, fetching 
rows from it, eventually (in the case of a nested loop for example) 
close and reopen the cursor with different key values from the outer 
tuple, close the cursor. So it makes total sense to actually require an 
implementation of an external data source to provide functions to open a 
cursor, fetch rows, close the cursor.


There will be some connection and transaction handling around all this 
that I have in mind but think it would distract from the problem to be 
solved right here, so more about that another time.


The C implementation for open cursor would be called with a scan handle, 
containing the connection, the relname, the targetlist and the 
qualification subtrees. These are modified from the real ones in the 
scan node so that all Var's have varno=1 and that all OUTER Var's have 
been replaced with a Const that reflects the current outer tuples 
values. From here there are several support functions available to dumb 
down each of those to whatever the external data source may support. In 
case of the targetlist, this could mean to filter out a unique list of 
Var nodes only, removing all expressions from it. In case of the 
qualification, this could mean remove everything that isn't a standard 
operator (=, , ...), or remove everything that isn't Postgres builtin. 
Finally, there is a support function that will build a SQL statement 
according to what's left inside that scan handle.


The scan handle would track which modifications have been done to the 
various pieces so that the outer support framework knows if it gets back 
the originally requested targetlist, or if it has to run the projection 
on the returned unique list of Var's. And if it has to recheck the 
returned tuples for qualification, because some of the qual's had been 
removed.


In order to allow the user exits to be written in PL's, I can think of 
makiing a complex data type containing the scan handle. The subtrees 
could be accessed by the PL via support functions that return them in 
nodeToString() or other formats.


I'll try to write up a more complete proposal until end of next week.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-07 Thread Simon Riggs

On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:

 Please find attached the patch, and thanks to Neil Conway and Korry
 Douglas for the code, and to Jan Wieck for helping me hammer out the
 scheme above.  Mistakes are all mine ;)

I see no negative comments to this patch on -hackers.

This was discussed here
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
and I had understood the consensus to be that we would go ahead with
this?

The notes say Heikki doesn't think this is a long term solution, but
in the following discussion it was the *only* way of doing this that
will work with non-PostgreSQL databases. So it seems like the way we
would want to go, yes?

So, can we add this to the CommitFest July page so it can receive some
substantial constructive/destructive comments?

This could be an important feature in conjunction with Hot Standby.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing quals

2008-07-07 Thread andrew

 On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:

 Please find attached the patch, and thanks to Neil Conway and Korry
 Douglas for the code, and to Jan Wieck for helping me hammer out the
 scheme above.  Mistakes are all mine ;)

 I see no negative comments to this patch on -hackers.

 This was discussed here
 http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
 and I had understood the consensus to be that we would go ahead with
 this?

 The notes say Heikki doesn't think this is a long term solution, but
 in the following discussion it was the *only* way of doing this that
 will work with non-PostgreSQL databases. So it seems like the way we
 would want to go, yes?

 So, can we add this to the CommitFest July page so it can receive some
 substantial constructive/destructive comments?

 This could be an important feature in conjunction with Hot Standby.

The notes say at the end:

Jan thinks that showing the node tree will work better. But others don't
agree with him -- it wouldn't work for PL/perlU. But Jan thinks it would
work to give it a pointer to the parse tree and the range, we'd need to
add an access function for the PL.

For the record, I agree with Jan's suggestion of passing a pointer to the
parse tree, and offline gave David a suggestion verbally as to how this
could be handled for PL/PerlU.


I don't think we should be tied too closely to a string representation,
although possibly the first and simplest callback function would simply
stringify the quals.

cheers

andrew


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


New relkind (was Re: [HACKERS] Exposing quals)

2008-07-07 Thread David Fetter
On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:
 
  On Mon, 2008-05-05 at 12:01 -0700, David Fetter wrote:
 
  Please find attached the patch, and thanks to Neil Conway and
  Korry Douglas for the code, and to Jan Wieck for helping me
  hammer out the scheme above.  Mistakes are all mine ;)
 
  I see no negative comments to this patch on -hackers.
 
  This was discussed here
  http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#SQL.2FMED
  and I had understood the consensus to be that we would go ahead
  with this?
 
  The notes say Heikki doesn't think this is a long term solution,
  but in the following discussion it was the *only* way of doing
  this that will work with non-PostgreSQL databases. So it seems
  like the way we would want to go, yes?
 
  So, can we add this to the CommitFest July page so it can receive
  some substantial constructive/destructive comments?
 
  This could be an important feature in conjunction with Hot
  Standby.
 
 The notes say at the end:
 
 Jan thinks that showing the node tree will work better. But others
 don't agree with him -- it wouldn't work for PL/perlU. But Jan
 thinks it would work to give it a pointer to the parse tree and the
 range, we'd need to add an access function for the PL.
 
 For the record, I agree with Jan's suggestion of passing a pointer
 to the parse tree, and offline gave David a suggestion verbally as
 to how this could be handled for PL/PerlU.
 
 I don't think we should be tied too closely to a string
 representation, although possibly the first and simplest callback
 function would simply stringify the quals.

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree.  This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far?  Do you have something in the way
of a rough patch, docs, etc. for this?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: New relkind (was Re: [HACKERS] Exposing quals)

2008-07-07 Thread Simon Riggs

On Mon, 2008-07-07 at 16:26 -0700, David Fetter wrote:
 On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:
  
  For the record, I agree with Jan's suggestion of passing a pointer
  to the parse tree, and offline gave David a suggestion verbally as
  to how this could be handled for PL/PerlU.
  
  I don't think we should be tied too closely to a string
  representation, although possibly the first and simplest callback
  function would simply stringify the quals.
 
 As I understand Jan's plan, the idea is to create a new relkind with
 an exit to user code at leaf nodes in the plan tree.  This would
 require an API design for both user C code and for each PL to use, but
 would then allow PostgreSQL's optimizer to work on JOINs, etc.
 
 Jan, have I got that right so far?  Do you have something in the way
 of a rough patch, docs, etc. for this?

It sounds like we can make it happen as text for other DBMS and as plan
nodes for PostgreSQL, which is the best solution all round.

Personally not too worried which way we do this - as long as we do it
for 8.4 :-) It's obviously happening in the background, so I'll leave it
alone.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Exposing quals

2008-05-05 Thread David Fetter
Folks,

I'd like to make it possible to see qualifiers from inside functions
in PostgreSQL.  For my particular case, and for dblink, it would
allow user-space code to some remove bottlenecks which make currently
make inter-DBMS communication extremely inefficient.

The current code returns one (potentially) big string with the quals
ANDed together.  I'm thinking that the new code would:

1.  Return just a pointer.

2.  Add some functions like rsinfo_get_node_tree_str() which can turn
same into (in this case, a string, but others might get tree
structures) on demand from client code.

3.  Add wrapper functions in each untrusted PL (trusted PLs shouldn't
be doing anything that needs this).

4.  Document the above.

Please find attached the patch, and thanks to Neil Conway and Korry
Douglas for the code, and to Jan Wieck for helping me hammer out the
scheme above.  Mistakes are all mine ;)

Comments?  Questions?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Index: contrib/dblink/dblink.c
===
RCS file: /projects/cvsroot/pgsql/contrib/dblink/dblink.c,v
retrieving revision 1.73
diff -c -c -r1.73 dblink.c
*** contrib/dblink/dblink.c 4 Apr 2008 17:02:56 -   1.73
--- contrib/dblink/dblink.c 8 Apr 2008 18:29:05 -
***
*** 752,757 
--- 752,758 
char   *conname = NULL;
remoteConn *rconn = NULL;
boolfail = true;/* default to backward 
compatible */
+   ReturnSetInfo   *rsi;
  
/* create a function context for cross-call persistence */
funcctx = SRF_FIRSTCALL_INIT();
***
*** 826,831 
--- 827,843 
elog(ERROR, wrong number of arguments);
}
  
+   if (sql  rsi-qual)
+   {
+   char *quals = rsinfo_get_qual_str(rsi);
+   char *qualifiedQuery = palloc(strlen(sql) + strlen( 
WHERE ) +
+   
  strlen(quals) + 1);
+ 
+   sprintf(qualifiedQuery, %s WHERE %s, sql, quals);
+ 
+   sql = qualifiedQuery;
+   }
+ 
if (!conn)
DBLINK_CONN_NOT_AVAIL;
  
Index: src/backend/executor/execQual.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execQual.c,v
retrieving revision 1.228
diff -c -c -r1.228 execQual.c
*** src/backend/executor/execQual.c 25 Mar 2008 22:42:43 -  1.228
--- src/backend/executor/execQual.c 8 Apr 2008 18:29:06 -
***
*** 45,50 
--- 45,51 
  #include funcapi.h
  #include miscadmin.h
  #include nodes/makefuncs.h
+ #include optimizer/clauses.h
  #include optimizer/planmain.h
  #include parser/parse_expr.h
  #include utils/acl.h
***
*** 1415,1420 
--- 1416,1440 
return result;
  }
  
+ /*
+  *
+  * Get either an empty string or a batch of qualifiers.
+  *
+  */
+ char *
+ rsinfo_get_qual_str(ReturnSetInfo *rsinfo)
+ {
+   Node*qual;
+   List*context;
+ 
+   if (rsinfo-qual == NIL)
+   return pstrdup();
+ 
+   qual = (Node *) make_ands_explicit(rsinfo-qual);
+   context = deparse_context_for_plan(NULL, NULL, rsinfo-rtable);
+ 
+   return deparse_expression(qual, context, false, false);
+ }
  
  /*
   *ExecMakeTableFunctionResult
***
*** 1426,1431 
--- 1446,1452 
  Tuplestorestate *
  ExecMakeTableFunctionResult(ExprState *funcexpr,
ExprContext *econtext,
+   List *qual, List 
*rtable,
TupleDesc expectedDesc,
TupleDesc *returnDesc)
  {
***
*** 1458,1463 
--- 1479,1486 
InitFunctionCallInfoData(fcinfo, NULL, 0, NULL, (Node *) rsinfo);
rsinfo.type = T_ReturnSetInfo;
rsinfo.econtext = econtext;
+   rsinfo.qual = qual;
+   rsinfo.rtable = rtable;
rsinfo.expectedDesc = expectedDesc;
rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
rsinfo.returnMode = SFRM_ValuePerCall;
Index: src/backend/executor/nodeFunctionscan.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeFunctionscan.c,v
retrieving revision 1.46
diff -c -c -r1.46 nodeFunctionscan.c
*** src/backend/executor/nodeFunctionscan.c 29 Feb 2008 02:49:39 -