Re: [HACKERS] minimizing the target list for foreign data wrappers

2013-04-25 Thread Tom Lane
David Gudeman dave.gude...@gmail.com writes:
 One of the problems I'm having is that in my application, the foreign
 tables typically have hundreds of columns while typical queries only access
 a dozen or so (the foreign server is a columnar SQL database). Furthermore,
 there is no size optimization for NULL values passed back from the foreign
 server, so if I return all of the columns from the table --even as NULLs--
 the returned data size will be several times the size that it needs to be.
 My application cannot tolerate this level of inefficiency, so I need to
 return minimal columns from the foreign table.

That's already possible; see contrib/postgres_fdw in HEAD for an
existence proof.

 The documentation doesn't say how to do this, but looking at the code I
 think it is possible. In GetForeignPlan() you have to pass on the tlist
 argument, which I presume means that the query plan will use the tlist that
 I pass in, right? If so, then it should be possible for me to write a
 function that takes tlist and baserel-reltargetlist and return a version
 of tlist that knows which foreign-table columns are actually used, and
 replaces the rest with a NULL constant.

You do not get to editorialize on the tlist that will be computed by the
ForeignScan node: in the case of a simple single-table SELECT, that's
going to be computing the expressions the user asked for.  Nor can you
alter the expectation about the rowtype of the scan tuple that's
returned by the FDW: that needs to match the declared rowtype of the
foreign table.  However, you can skip fetching unneeded columns and just
set those fields of the scan tuple to nulls.  That's cheap enough
(particularly if the scan tuple stays virtual) that I'm unconvinced we
should contort the APIs to the extent that would be needed to let the
FDW change the scan tuple rowtype dynamically.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] minimizing the target list for foreign data wrappers

2013-04-23 Thread David Gudeman
In case anyone is interested, I tried it and it doesn't seem to work.
It looks like some other plan element already has the target-list
tuple baked. Now I'm trying to decide whether to give up on FDW. It's
a shame because it's such a sweet facility, but at this point, I just
don't think that it's mature enough for what I need to do.
Regards,
David Gudeman


On Mon, Apr 22, 2013 at 11:27 AM, David Gudeman dave.gude...@gmail.com wrote:
 Re-reading my first email I thought it was a little confusing, so here
 is some clarification. In GetForeignPlan, tlist seems to be a target
 list for a basic select * from the foreign table. For the ith
 TargetEntry te in tlist, it seems that te-expr is a var with
 varattno=i. I was mis-remembering and calling varattno attrno in the
 original email.

 My assumption is that the plan elements that use the output of the FDW
 plan node will access columns indirectly using tlist. In other words,
 I'm assuming that if there is a reference to a column c of the foreign
 table, this column will be represented as a Var with varattno being an
 offset into tlist. So if c is column number 3, for example, you get
 its value by looking up TargetEntry number 3 in tlist and evaluate the
 expr column for that TargetEntry. So if I change the  Var in the expr
 column so the varattno points to a different column in the output
 tuple, then everything will work.

 The two risky assumptions I'm making are 1. that it actually uses this
 indirect way of looking up columns in a foreign table and 2. that it
 actually uses the tlist that I pass in when I call make_foreignscan().

 Can anyone confirm or deny these assumptions?

 Thanks.

 On Sun, Apr 21, 2013 at 6:57 PM, David Gudeman dave.gude...@gmail.com wrote:

 A few years ago I wrote a roll-your-own foreign-data-wrapper system for 
 Postgres because Postgres didn't have one at the time (some details here 
 (http://unobtainabol.blogspot.com/2013/04/dave-foreign-data-introuction.html)
  if anyone is interested). Now I'm being tasked to move it to Postgres 9.2.x 
 and I'd like to use FDW if possible.

 One of the problems I'm having is that in my application, the foreign tables 
 typically have hundreds of columns while typical queries only access a dozen 
 or so (the foreign server is a columnar SQL database). Furthermore, there is 
 no size optimization for NULL values passed back from the foreign server, so 
 if I return all of the columns from the table --even as NULLs-- the returned 
 data size will be several times the size that it needs to be. My application 
 cannot tolerate this level of inefficiency, so I need to return minimal 
 columns from the foreign table.

 The documentation doesn't say how to do this, but looking at the code I 
 think it is possible. In GetForeignPlan() you have to pass on the tlist 
 argument, which I presume means that the query plan will use the tlist that 
 I pass in, right? If so, then it should be possible for me to write a 
 function that takes tlist and baserel-reltargetlist and return a version of 
 tlist that knows which foreign-table columns are actually used, and replaces 
 the rest with a NULL constant.

 For example, suppose the original tlist is this: [VAR(attrno=1), 
 VAR(attrno=2), VAR(attrno=3)] and reltarget list says that I only need args 
 1 and 3. Then the new tlist would look like this: [VAR(attrno=1), 
 CONST(val=NULL), VAR(attrno=2)] where the attrno of the last VAR has been 
 reduced by one because the 2 column is no longer there.

 I did something very much like this in my roll-your-own version of FDW so I 
 know basically how to do it, but I did it at the pre-planning stage and I'm 
 not sure how much is already packed into the other plan nodes at this point. 
 Maybe it's too late to change the target list?

 Can anyone give me some advice or warnings on this? I'd hate to go to the 
 trouble of implementing and testing it only to find that I'm making some 
 bogus assumptions.

 Thanks,
 David Gudeman



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] minimizing the target list for foreign data wrappers

2013-04-23 Thread David Fetter
David,

Please post your patch(es) and some demo of how things broke so others
can improve future versions--possibly even 9.3 versions if it turns
out you've discovered a bug in the implementation.

Thanks very much for your hard work and insights into this.

Cheers,
David.
On Tue, Apr 23, 2013 at 11:08:04AM -0700, David Gudeman wrote:
 In case anyone is interested, I tried it and it doesn't seem to work.
 It looks like some other plan element already has the target-list
 tuple baked. Now I'm trying to decide whether to give up on FDW. It's
 a shame because it's such a sweet facility, but at this point, I just
 don't think that it's mature enough for what I need to do.
 Regards,
 David Gudeman
 
 
 On Mon, Apr 22, 2013 at 11:27 AM, David Gudeman dave.gude...@gmail.com 
 wrote:
  Re-reading my first email I thought it was a little confusing, so here
  is some clarification. In GetForeignPlan, tlist seems to be a target
  list for a basic select * from the foreign table. For the ith
  TargetEntry te in tlist, it seems that te-expr is a var with
  varattno=i. I was mis-remembering and calling varattno attrno in the
  original email.
 
  My assumption is that the plan elements that use the output of the FDW
  plan node will access columns indirectly using tlist. In other words,
  I'm assuming that if there is a reference to a column c of the foreign
  table, this column will be represented as a Var with varattno being an
  offset into tlist. So if c is column number 3, for example, you get
  its value by looking up TargetEntry number 3 in tlist and evaluate the
  expr column for that TargetEntry. So if I change the  Var in the expr
  column so the varattno points to a different column in the output
  tuple, then everything will work.
 
  The two risky assumptions I'm making are 1. that it actually uses this
  indirect way of looking up columns in a foreign table and 2. that it
  actually uses the tlist that I pass in when I call make_foreignscan().
 
  Can anyone confirm or deny these assumptions?
 
  Thanks.
 
  On Sun, Apr 21, 2013 at 6:57 PM, David Gudeman dave.gude...@gmail.com 
  wrote:
 
  A few years ago I wrote a roll-your-own foreign-data-wrapper system for 
  Postgres because Postgres didn't have one at the time (some details here 
  (http://unobtainabol.blogspot.com/2013/04/dave-foreign-data-introuction.html)
   if anyone is interested). Now I'm being tasked to move it to Postgres 
  9.2.x and I'd like to use FDW if possible.
 
  One of the problems I'm having is that in my application, the foreign 
  tables typically have hundreds of columns while typical queries only 
  access a dozen or so (the foreign server is a columnar SQL database). 
  Furthermore, there is no size optimization for NULL values passed back 
  from the foreign server, so if I return all of the columns from the table 
  --even as NULLs-- the returned data size will be several times the size 
  that it needs to be. My application cannot tolerate this level of 
  inefficiency, so I need to return minimal columns from the foreign table.
 
  The documentation doesn't say how to do this, but looking at the code I 
  think it is possible. In GetForeignPlan() you have to pass on the tlist 
  argument, which I presume means that the query plan will use the tlist 
  that I pass in, right? If so, then it should be possible for me to write a 
  function that takes tlist and baserel-reltargetlist and return a version 
  of tlist that knows which foreign-table columns are actually used, and 
  replaces the rest with a NULL constant.
 
  For example, suppose the original tlist is this: [VAR(attrno=1), 
  VAR(attrno=2), VAR(attrno=3)] and reltarget list says that I only need 
  args 1 and 3. Then the new tlist would look like this: [VAR(attrno=1), 
  CONST(val=NULL), VAR(attrno=2)] where the attrno of the last VAR has been 
  reduced by one because the 2 column is no longer there.
 
  I did something very much like this in my roll-your-own version of FDW so 
  I know basically how to do it, but I did it at the pre-planning stage and 
  I'm not sure how much is already packed into the other plan nodes at this 
  point. Maybe it's too late to change the target list?
 
  Can anyone give me some advice or warnings on this? I'd hate to go to the 
  trouble of implementing and testing it only to find that I'm making some 
  bogus assumptions.
 
  Thanks,
  David Gudeman
 
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your 

Re: [HACKERS] minimizing the target list for foreign data wrappers

2013-04-22 Thread David Gudeman
Re-reading my first email I thought it was a little confusing, so here
is some clarification. In GetForeignPlan, tlist seems to be a target
list for a basic select * from the foreign table. For the ith
TargetEntry te in tlist, it seems that te-expr is a var with
varattno=i. I was mis-remembering and calling varattno attrno in the
original email.

My assumption is that the plan elements that use the output of the FDW
plan node will access columns indirectly using tlist. In other words,
I'm assuming that if there is a reference to a column c of the foreign
table, this column will be represented as a Var with varattno being an
offset into tlist. So if c is column number 3, for example, you get
its value by looking up TargetEntry number 3 in tlist and evaluate the
expr column for that TargetEntry. So if I change the  Var in the expr
column so the varattno points to a different column in the output
tuple, then everything will work.

The two risky assumptions I'm making are 1. that it actually uses this
indirect way of looking up columns in a foreign table and 2. that it
actually uses the tlist that I pass in when I call make_foreignscan().

Can anyone confirm or deny these assumptions?

Thanks.

On Sun, Apr 21, 2013 at 6:57 PM, David Gudeman dave.gude...@gmail.com wrote:

 A few years ago I wrote a roll-your-own foreign-data-wrapper system for 
 Postgres because Postgres didn't have one at the time (some details here 
 (http://unobtainabol.blogspot.com/2013/04/dave-foreign-data-introuction.html) 
 if anyone is interested). Now I'm being tasked to move it to Postgres 9.2.x 
 and I'd like to use FDW if possible.

 One of the problems I'm having is that in my application, the foreign tables 
 typically have hundreds of columns while typical queries only access a dozen 
 or so (the foreign server is a columnar SQL database). Furthermore, there is 
 no size optimization for NULL values passed back from the foreign server, so 
 if I return all of the columns from the table --even as NULLs-- the returned 
 data size will be several times the size that it needs to be. My application 
 cannot tolerate this level of inefficiency, so I need to return minimal 
 columns from the foreign table.

 The documentation doesn't say how to do this, but looking at the code I think 
 it is possible. In GetForeignPlan() you have to pass on the tlist argument, 
 which I presume means that the query plan will use the tlist that I pass in, 
 right? If so, then it should be possible for me to write a function that 
 takes tlist and baserel-reltargetlist and return a version of tlist that 
 knows which foreign-table columns are actually used, and replaces the rest 
 with a NULL constant.

 For example, suppose the original tlist is this: [VAR(attrno=1), 
 VAR(attrno=2), VAR(attrno=3)] and reltarget list says that I only need args 1 
 and 3. Then the new tlist would look like this: [VAR(attrno=1), 
 CONST(val=NULL), VAR(attrno=2)] where the attrno of the last VAR has been 
 reduced by one because the 2 column is no longer there.

 I did something very much like this in my roll-your-own version of FDW so I 
 know basically how to do it, but I did it at the pre-planning stage and I'm 
 not sure how much is already packed into the other plan nodes at this point. 
 Maybe it's too late to change the target list?

 Can anyone give me some advice or warnings on this? I'd hate to go to the 
 trouble of implementing and testing it only to find that I'm making some 
 bogus assumptions.

 Thanks,
 David Gudeman



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] minimizing the target list for foreign data wrappers

2013-04-21 Thread David Gudeman
A few years ago I wrote a roll-your-own foreign-data-wrapper system for
Postgres because Postgres didn't have one at the time (some details
herehttp://unubtainabol.blogspot.com/2013/04/daves-foreign-data-introuction.htmlif
anyone is interested). Now I'm being tasked to move it to Postgres
9.2.x
and I'd like to use FDW if possible.

One of the problems I'm having is that in my application, the foreign
tables typically have hundreds of columns while typical queries only access
a dozen or so (the foreign server is a columnar SQL database). Furthermore,
there is no size optimization for NULL values passed back from the foreign
server, so if I return all of the columns from the table --even as NULLs--
the returned data size will be several times the size that it needs to be.
My application cannot tolerate this level of inefficiency, so I need to
return minimal columns from the foreign table.

The documentation doesn't say how to do this, but looking at the code I
think it is possible. In GetForeignPlan() you have to pass on the tlist
argument, which I presume means that the query plan will use the tlist that
I pass in, right? If so, then it should be possible for me to write a
function that takes tlist and baserel-reltargetlist and return a version
of tlist that knows which foreign-table columns are actually used, and
replaces the rest with a NULL constant.

For example, suppose the original tlist is this: [VAR(attrno=1),
VAR(attrno=2), VAR(attrno=3)] and reltarget list says that I only need args
1 and 3. Then the new tlist would look like this: [VAR(attrno=1),
CONST(val=NULL), VAR(attrno=2)] where the attrno of the last VAR has been
reduced by one because the 2 column is no longer there.

I did something very much like this in my roll-your-own version of FDW so I
know basically how to do it, but I did it at the pre-planning stage and I'm
not sure how much is already packed into the other plan nodes at this
point. Maybe it's too late to change the target list?

Can anyone give me some advice or warnings on this? I'd hate to go to the
trouble of implementing and testing it only to find that I'm making some
bogus assumptions.

Thanks,
David Gudeman