Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-22 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 This contains some edits to comments that referred to the obsolete and
 bogus TupleDesc scanning.  No mechanical alterations.

Applied with some substantial revisions.  I didn't like where you'd put
the apply/restore calls, for one thing --- we need to wait to do the
applies until we have the PGresult in hand, else we might be applying
stale values of the remote's GUCs.  Also, adding a call that could throw
errors right before materializeResult() won't do, because that would
result in leaking the PGresult on error.  The struct for state seemed a
bit of a mess too, given that you couldn't always initialize it in one
place.  (In hindsight I could have left that alone given where I ended
up putting the calls, but it didn't seem to be providing any useful
isolation.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-22 Thread Daniel Farina
On Fri, Mar 22, 2013 at 12:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 This contains some edits to comments that referred to the obsolete and
 bogus TupleDesc scanning.  No mechanical alterations.

 Applied with some substantial revisions.  I didn't like where you'd put
 the apply/restore calls, for one thing --- we need to wait to do the
 applies until we have the PGresult in hand, else we might be applying
 stale values of the remote's GUCs.  Also, adding a call that could throw
 errors right before materializeResult() won't do, because that would
 result in leaking the PGresult on error.

Good catches.

 The struct for state seemed a
 bit of a mess too, given that you couldn't always initialize it in one
 place.

Yeah, I had to give that up when pushing things around, unless I
wanted to push more state down.  It used to be neater.

 (In hindsight I could have left that alone given where I ended
 up putting the calls, but it didn't seem to be providing any useful
 isolation.)

I studied your commit.

Yeah, the idea I had was to try to avoid pushing down a loaded a value
as a PGconn into the lower level helper functions, but perhaps that
economy was false one after the modifications.  Earlier versions used
to push down the RemoteGucs struct instead of a full-blown conn to
hint to the restricted purpose of that reference.  By conceding to this
pushdown I think the struct could have remained, as you said, but the
difference to clarity is likely marginal.  I thought I found a way to
not have to widen the parameter list at all, so I preferred that one,
but clearly it is wrong, w.r.t. leaks and the not up-to-date protocol
state.

Sorry you had to root around so much in there to get something you
liked, but thanks for going through it.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-21 Thread Daniel Farina
This contains some edits to comments that referred to the obsolete and
bogus TupleDesc scanning.  No mechanical alterations.

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2961,9 +2961,8 @@ initRemoteGucs(remoteGucs *rgs, PGconn *conn)
 }

 /*
- * Scan a TupleDesc and, should it contain types that are sensitive to
- * GUCs, acquire remote GUCs and set them in a new GUC nesting level.
- * This is undone with restoreLocalGucs.
+ * Acquire remote GUCs that may affect type parsing and set them in a
+ * new GUC nesting level.
  */
 static void
 applyRemoteGucs(remoteGucs *rgs)
@@ -2974,11 +2973,8 @@ applyRemoteGucs(remoteGucs *rgs)
int addedGucNesting = false;

/*
-* Affected types require local GUC manipulations.  Create a new
-* GUC NestLevel to overlay the remote settings.
-*
-* Also, this nesting is done exactly once per remoteGucInfo
-* structure, so expect it to come with an invalid NestLevel.
+* This nesting is done exactly once per remoteGucInfo structure,
+* so expect it to come with an invalid NestLevel.
 */
Assert(rgs-localGUCNestLevel == -1);

diff --git a/contrib/dblink/expected/dblink.out
b/contrib/dblink/expected/dblink.out
index 3946485..579664e 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -930,9 +930,8 @@ SELECT dblink_exec('myconn', 'SET datestyle =
GERMAN, DMY;');
  SET
 (1 row)

--- The following attempt test various paths at which TupleDescs are
--- formed and inspected for containment of types requiring local GUC
--- setting.
+-- The following attempt test various paths at which tuples are formed
+-- and inspected for containment of types requiring local GUC setting.
 -- single row synchronous case
 SELECT *
 FROM dblink('myconn',
diff --git a/contrib/dblink/sql/dblink.sql b/contrib/dblink/sql/dblink.sql
index de925eb..7ff43fd 100644
--- a/contrib/dblink/sql/dblink.sql
+++ b/contrib/dblink/sql/dblink.sql
@@ -435,9 +435,8 @@ SET timezone = UTC;
 SELECT dblink_connect('myconn','dbname=contrib_regression');
 SELECT dblink_exec('myconn', 'SET datestyle = GERMAN, DMY;');

--- The following attempt test various paths at which TupleDescs are
--- formed and inspected for containment of types requiring local GUC
--- setting.
+-- The following attempt test various paths at which tuples are formed
+-- and inspected for containment of types requiring local GUC setting.

 -- single row synchronous case
 SELECT *

--
fdr


dblink-guc-sensitive-types-v7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-20 Thread Daniel Farina
On Tue, Mar 19, 2013 at 10:37 PM, Daniel Farina dan...@heroku.com wrote:
 I added programming around various NULL returns reading GUCs in this
 revision, v4.

Okay, one more of those fridge-logic bugs.  Sorry for the noise. v5.

A missing PG_RETHROW to get the properly finally-esque semantics:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
  }
  PG_CATCH();
  {
+ /* Pop any set GUCs, if necessary */
  restoreLocalGucs(rgs);
+
+ PG_RE_THROW();
  }
  PG_END_TRY();

This was easy to add a regression test to exercise, and so I did (not
displayed here).

--
fdr


dblink-guc-sensitive-types-v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-20 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Okay, one more of those fridge-logic bugs.  Sorry for the noise. v5.

 A missing PG_RETHROW to get the properly finally-esque semantics:

 --- a/contrib/dblink/dblink.c
 +++ b/contrib/dblink/dblink.c
 @@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
   }
   PG_CATCH();
   {
 + /* Pop any set GUCs, if necessary */
   restoreLocalGucs(rgs);
 +
 + PG_RE_THROW();
   }
   PG_END_TRY();

Um ... you shouldn't need a PG_TRY for that at all.  guc.c will take
care of popping the values on transaction abort --- that's really rather
the whole point of having that mechanism.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-20 Thread Daniel Farina
On Wed, Mar 20, 2013 at 7:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Okay, one more of those fridge-logic bugs.  Sorry for the noise. v5.

 A missing PG_RETHROW to get the properly finally-esque semantics:

 --- a/contrib/dblink/dblink.c
 +++ b/contrib/dblink/dblink.c
 @@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
   }
   PG_CATCH();
   {
 + /* Pop any set GUCs, if necessary */
   restoreLocalGucs(rgs);
 +
 + PG_RE_THROW();
   }
   PG_END_TRY();

 Um ... you shouldn't need a PG_TRY for that at all.  guc.c will take
 care of popping the values on transaction abort --- that's really rather
 the whole point of having that mechanism.

Hmm, well, merely raising the error doesn't reset the GUCs, so I was
rather thinking that this was a good idea to compose more neatly in
the case of nested exception processing, e.g.:

PG_TRY();
{
PG_TRY();
{
elog(NOTICE, pre: %s,
 GetConfigOption(DateStyle, false, true));
materializeResult(fcinfo, res);
}
PG_CATCH();
{
elog(NOTICE, inner catch: %s,
 GetConfigOption(DateStyle, false, true));
PG_RE_THROW();
}
PG_END_TRY();
}
PG_CATCH();
{
elog(NOTICE, outer catch: %s,
 GetConfigOption(DateStyle, false, true));
restoreLocalGucs(rgs);
elog(NOTICE, restored: %s,
 GetConfigOption(DateStyle, false, true));
PG_RE_THROW();
}
PG_END_TRY();

I don't think this paranoia is made in other call sites for AtEOXact,
so included is a version that takes the same stance. This one shows up
 best with the whitespace-insensitive option for git:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -634,21 +634,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
 * affect parsing and then un-set them afterwards.
 */
initRemoteGucs(rgs, conn);
-
-   PG_TRY();
-   {
applyRemoteGucs(rgs);
materializeResult(fcinfo, res);
-   }
-   PG_CATCH();
-   {
-   /* Pop any set GUCs, if necessary */
-   restoreLocalGucs(rgs);
-
-   PG_RE_THROW();
-   }
-   PG_END_TRY();
-
restoreLocalGucs(rgs);

return (Datum) 0;
@@ -823,9 +810,6 @@ dblink_record_internal(FunctionCallInfo fcinfo,
bool is_async)
if (freeconn)
PQfinish(conn);

-   /* Pop any set GUCs, if necessary */
-   restoreLocalGucs(rgs);
-
PG_RE_THROW();
}
PG_END_TRY();

--
fdr


dblink-guc-sensitive-types-v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-20 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Wed, Mar 20, 2013 at 7:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Um ... you shouldn't need a PG_TRY for that at all.  guc.c will take
 care of popping the values on transaction abort --- that's really rather
 the whole point of having that mechanism.

 Hmm, well, merely raising the error doesn't reset the GUCs, so I was
 rather thinking that this was a good idea to compose more neatly in
 the case of nested exception processing, e.g.:

In general, we don't allow processing to resume after an error until
transaction or subtransaction abort cleanup has been done.  It's true
that if you look at the GUC state in a PG_CATCH block, you'll see it
hasn't been reset yet, but that's not very relevant.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Daniel Farina
On Thu, Mar 14, 2013 at 8:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, watching the remote side's datestyle and intervalstyle and
 matching them (for both input and output) would probably work.

 Alright, so I've been whacking at this and there's one interesting
 thing to ask about: saving and restoring the local GUCs.  There are a
 bunch of things about GUCs besides their value that are maintained,
 such as their 'source', so writing a little ad-hoc save/restore is not
 going to do the right thing.

 Right, you should use NewGUCNestLevel/AtEOXact_GUC.  Look at the fixes
 I committed in postgres_fdw a day or two ago for an example.

Thanks.  Here's a patch.  Here is the comments on top of the patch file inlined:

Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
taking into account that a dblink caller may choose to cause arbitrary
changes to DateStyle and IntervalStyle.  To handle this, it is
necessary to use PQparameterStatus before parsing any input, every
time.  This is avoided in cases that do not include the two
problematic types treated -- interval and timestamptz -- by scanning
the TupleDesc's types first.

Although it has been suggested that extra_float_digits would need
similar treatment as IntervalStyle and DateStyle (as it's seen in
pg_dump), extra_float_digits is not an GUC_REPORT-ed GUC and is not
able to be checked in PQparameterStatus, and furthermore, the float4
and float8 parsers are not sensitive to the GUC anyway: both accept as
much precision as is provided in an unambiguous way.

 So, I can add one more such use of AtEOXact_GUC for the dblink fix,
 but would it also be appropriate to find some new names for the
 concepts (instead of AtEOXact_GUC and isCommit) here to more
 accurately express what's going on?

 Meh.  I guess we could invent an EndGUCNestLevel that's a wrapper
 around AtEOXact_GUC, but I'm not that excited about it ...

Per your sentiment, I won't pursue that then.

Overall, the patch I think has reasonably thorough regression testing
(I tried to hit the several code paths whereby one would have to scan
TupleDescs, as well as a few other edge cases) and has some of the
obvious optimizations in place: it doesn't call PQparameterStatus more
than once per vulnerable type per TupleDesc scan, and avoids using the
 parameter status procedure at all if there are no affected types.

The mechanisms may be overwrought though, since it was originally
intended to generalize to three types before I realized that
extra_float_digits is another kind of problem entirely, leaving just
IntervalStyle and DateStyle remaining, which perhaps could have been
treated even more specially to make the patch more terse.

I'll add it to the commitfest.

--
fdr


dblink-guc-sensitive-types-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
 taking into account that a dblink caller may choose to cause arbitrary
 changes to DateStyle and IntervalStyle.  To handle this, it is
 necessary to use PQparameterStatus before parsing any input, every
 time.  This is avoided in cases that do not include the two
 problematic types treated -- interval and timestamptz -- by scanning
 the TupleDesc's types first.

Hm.  Is that really a win?  Examining the tupdesc wouldn't be free
either, and what's more, I think you're making false assumptions about
which types have behavior dependent on the GUCs.  You definitely missed
out on date and plain timestamp, and what of domains over these types?

I'd be inclined to eat the cost of calling PQparameterStatus every time
(which won't be that much) and instead try to optimize by avoiding the
GUC-setting overhead if the current value matches the local setting.
But even that might be premature optimization.  Did you do any
performance testing to see if there was a problem worth avoiding?

 Although it has been suggested that extra_float_digits would need
 similar treatment as IntervalStyle and DateStyle (as it's seen in
 pg_dump), extra_float_digits is not an GUC_REPORT-ed GUC and is not
 able to be checked in PQparameterStatus, and furthermore, the float4
 and float8 parsers are not sensitive to the GUC anyway: both accept as
 much precision as is provided in an unambiguous way.

Agreed, we don't need to worry so much about that one; or at least,
if we do need to worry, it's on the other end of the connection from
what we're fixing here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Daniel Farina
On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
 taking into account that a dblink caller may choose to cause arbitrary
 changes to DateStyle and IntervalStyle.  To handle this, it is
 necessary to use PQparameterStatus before parsing any input, every
 time.  This is avoided in cases that do not include the two
 problematic types treated -- interval and timestamptz -- by scanning
 the TupleDesc's types first.

 Hm.  Is that really a win?  Examining the tupdesc wouldn't be free
 either, and what's more, I think you're making false assumptions about
 which types have behavior dependent on the GUCs.  You definitely missed
 out on date and plain timestamp, and what of domains over these types?

Yes, I also forgot about arrays, and nested composites in arrays or
nested composites.  As soon as recursive types are introduced the scan
is not sufficient for sure: it's necessary to copy every GUC unless
one wants to recurse through the catalogs which feels like a heavy
loss.

I presumed at the time that skimming over the tupdecs to compare a few
Oids would be significantly cheaper than the guts of
PQparameterStatus, which I believe to be a linked list traversal.  I
mostly wrote that optimization because it was easy coincidentally from
how I chose to structure the code rather than belief in its absolute
necessity.

And, as you said, I forgot a few types even for the simple case, which
is a bit of a relief because some of the machinery I wrote for the n =
3 case will come in useful for that.

 I'd be inclined to eat the cost of calling PQparameterStatus every time
 (which won't be that much) and instead try to optimize by avoiding the
 GUC-setting overhead if the current value matches the local setting.
 But even that might be premature optimization.  Did you do any
 performance testing to see if there was a problem worth avoiding?

Nope; should I invent a new way to do that, or would it be up to
commit standard based on inspection alone?  I'm okay either way, let
me know.

I'll take into account these problems and post a new version.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'd be inclined to eat the cost of calling PQparameterStatus every time
 (which won't be that much) and instead try to optimize by avoiding the
 GUC-setting overhead if the current value matches the local setting.
 But even that might be premature optimization.  Did you do any
 performance testing to see if there was a problem worth avoiding?

 Nope; should I invent a new way to do that, or would it be up to
 commit standard based on inspection alone?  I'm okay either way, let
 me know.

Doesn't seem that hard to test: run a dblink query that pulls back a
bunch of data under best-case conditions (ie, local not remote server),
and time it with and without the proposed fix.  If the difference
is marginal then it's not worth working hard to optimize.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Daniel Farina
On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'd be inclined to eat the cost of calling PQparameterStatus every time
 (which won't be that much) and instead try to optimize by avoiding the
 GUC-setting overhead if the current value matches the local setting.
 But even that might be premature optimization.  Did you do any
 performance testing to see if there was a problem worth avoiding?

 Nope; should I invent a new way to do that, or would it be up to
 commit standard based on inspection alone?  I'm okay either way, let
 me know.

 Doesn't seem that hard to test: run a dblink query that pulls back a
 bunch of data under best-case conditions (ie, local not remote server),
 and time it with and without the proposed fix.  If the difference
 is marginal then it's not worth working hard to optimize.

Okay, will do, and here's the shorter and less mechanically intensive
naive version that I think is the baseline: it doesn't try to optimize
out any GUC settings and sets up the GUCs before the two
materialization paths in dblink.

Something I forgot to ask about is about if an strangely-minded type
input function could whack around the GUC as records are being
remitted, which would necessitate per-tuple polling of
pqParameterStatus (e.g. in the inner loop of a materialization) .
That seemed pretty out there, but I'm broaching it for completeness.

I'll see how much of a penalty it is vs. not applying any patch at all next.

--
fdr


dblink-guc-sensitive-types-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Daniel Farina
On Tue, Mar 19, 2013 at 3:06 PM, Daniel Farina dan...@heroku.com wrote:
 On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'd be inclined to eat the cost of calling PQparameterStatus every time
 (which won't be that much) and instead try to optimize by avoiding the
 GUC-setting overhead if the current value matches the local setting.
 But even that might be premature optimization.  Did you do any
 performance testing to see if there was a problem worth avoiding?

 Nope; should I invent a new way to do that, or would it be up to
 commit standard based on inspection alone?  I'm okay either way, let
 me know.

 Doesn't seem that hard to test: run a dblink query that pulls back a
 bunch of data under best-case conditions (ie, local not remote server),
 and time it with and without the proposed fix.  If the difference
 is marginal then it's not worth working hard to optimize.

 Okay, will do, and here's the shorter and less mechanically intensive
 naive version that I think is the baseline: it doesn't try to optimize
 out any GUC settings and sets up the GUCs before the two
 materialization paths in dblink.

The results.  Summary: seems like grabbing the GUC and strcmp-ing is
worthwhile, but the amount of ping-ponging between processes adds some
noise to the timing results: utilization is far short of 100% on
either processor involved.  Attached is a cumulative diff of the new
version, and also reproduced below are the changes to v2 that make up
v3.

## Benchmark

SELECT dblink_connect('benchconn','dbname=contrib_regression');

CREATE FUNCTION bench() RETURNS integer AS $$
DECLARE
iterations integer;
BEGIN
iterations := 0;

WHILE iterations  30 LOOP
  PERFORM * FROM dblink('benchconn', 'SELECT 1') AS t(a int);
  iterations := iterations + 1;
END LOOP;

RETURN iterations;
END;
$$ LANGUAGE plpgsql;

SELECT clock_timestamp() INTO TEMP beginning;
SELECT bench();
SELECT clock_timestamp() INTO TEMP ending;

SELECT 'dblink-benchmark-lines';
SELECT ending.clock_timestamp - beginning.clock_timestamp
FROM beginning, ending;

## Data Setup

CREATE TEMP TABLE bench_results (version text, span interval);

COPY bench_results FROM STDIN WITH CSV;
no-patch,@ 41.308122 secs
no-patch,@ 36.63597 secs
no-patch,@ 34.264119 secs
no-patch,@ 34.760179 secs
no-patch,@ 32.991257 secs
no-patch,@ 34.538258 secs
no-patch,@ 42.576354 secs
no-patch,@ 39.335557 secs
no-patch,@ 37.493206 secs
no-patch,@ 37.812205 secs
v2-applied,@ 36.550553 secs
v2-applied,@ 38.608723 secs
v2-applied,@ 39.415744 secs
v2-applied,@ 46.091052 secs
v2-applied,@ 45.425438 secs
v2-applied,@ 48.219506 secs
v2-applied,@ 43.514878 secs
v2-applied,@ 45.892302 secs
v2-applied,@ 48.479335 secs
v2-applied,@ 47.632041 secs
v3-strcmp,@ 32.524385 secs
v3-strcmp,@ 34.982098 secs
v3-strcmp,@ 34.487222 secs
v3-strcmp,@ 44.394681 secs
v3-strcmp,@ 44.638309 secs
v3-strcmp,@ 44.113088 secs
v3-strcmp,@ 45.497769 secs
v3-strcmp,@ 33.530176 secs
v3-strcmp,@ 32.9704 secs
v3-strcmp,@ 40.84764 secs
\.

= SELECT version, avg(extract(epoch from span)), stddev(extract(epoch
from span))
   FROM bench_results
   GROUP BY version;
  version   |avg |  stddev
++--
 no-patch   | 37.1715227 | 3.17076487912923
 v2-applied | 43.9829572 | 4.30572672565711
 v3-strcmp  | 38.7985768 | 5.54760393720725

## Changes to v2:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2981,9 +2981,11 @@ initRemoteGucs(remoteGucs *rgs, PGconn *conn)
 static void
 applyRemoteGucs(remoteGucs *rgs)
 {
- int i;
  const int numGucs = sizeof parseAffectingGucs / sizeof *parseAffectingGucs;

+ int i;
+ int addedGucNesting = false;
+
  /*
  * Affected types require local GUC manipulations.  Create a new
  * GUC NestLevel to overlay the remote settings.
@@ -2992,14 +2994,27 @@ applyRemoteGucs(remoteGucs *rgs)
  * structure, so expect it to come with an invalid NestLevel.
  */
  Assert(rgs-localGUCNestLevel == -1);
- rgs-localGUCNestLevel = NewGUCNestLevel();

  for (i = 0; i  numGucs; i += 1)
  {
+ int gucApplyStatus;
+
  const char *gucName   = parseAffectingGucs[i];
  const char *remoteVal = PQparameterStatus(rgs-conn, gucName);
+ const char *localVal  = GetConfigOption(gucName, true, true);

- int gucApplyStatus;
+ /*
+ * Attempt to avoid GUC setting if the remote and local GUCs
+ * already have the same value.
+ */
+ if (strcmp(remoteVal, localVal) == 0)
+ continue;
+
+ if (!addedGucNesting)
+ {
+ rgs-localGUCNestLevel = NewGUCNestLevel();
+ addedGucNesting = true;
+ }

  gucApplyStatus = set_config_option(gucName, remoteVal,
PGC_USERSET, PGC_S_SESSION,

--
fdr


dblink-guc-sensitive-types-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-19 Thread Daniel Farina
On Tue, Mar 19, 2013 at 6:10 PM, Daniel Farina dan...@heroku.com wrote:
 On Tue, Mar 19, 2013 at 3:06 PM, Daniel Farina dan...@heroku.com wrote:
 On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'd be inclined to eat the cost of calling PQparameterStatus every time
 (which won't be that much) and instead try to optimize by avoiding the
 GUC-setting overhead if the current value matches the local setting.
 But even that might be premature optimization.  Did you do any
 performance testing to see if there was a problem worth avoiding?

 Nope; should I invent a new way to do that, or would it be up to
 commit standard based on inspection alone?  I'm okay either way, let
 me know.

 Doesn't seem that hard to test: run a dblink query that pulls back a
 bunch of data under best-case conditions (ie, local not remote server),
 and time it with and without the proposed fix.  If the difference
 is marginal then it's not worth working hard to optimize.

 Okay, will do, and here's the shorter and less mechanically intensive
 naive version that I think is the baseline: it doesn't try to optimize
 out any GUC settings and sets up the GUCs before the two
 materialization paths in dblink.

 The results.  Summary: seems like grabbing the GUC and strcmp-ing is
 worthwhile, but the amount of ping-ponging between processes adds some
 noise to the timing results: utilization is far short of 100% on
 either processor involved.  Attached is a cumulative diff of the new
 version, and also reproduced below are the changes to v2 that make up
 v3.

I added programming around various NULL returns reading GUCs in this
revision, v4.

The non-cumulative changes:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -3005,8 +3005,22 @@ applyRemoteGucs(remoteGucs *rgs)
  /*
  * Attempt to avoid GUC setting if the remote and local GUCs
  * already have the same value.
+ *
+ * NB: Must error if the GUC is not found.
  */
- localVal = GetConfigOption(gucName, true, true);
+ localVal = GetConfigOption(gucName, false, true);
+
+ if (remoteVal == NULL)
+ ereport(ERROR,
+ (errmsg(could not load parameter status of %s,
+ gucName)));
+
+ /*
+ * An error must have been raised by now if GUC values could
+ * not be loaded for any reason.
+ */
+ Assert(localVal != NULL);
+ Assert(remoteVal != NULL);

  if (strcmp(remoteVal, localVal) == 0)
  continue;

--
fdr


dblink-guc-sensitive-types-v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 11, 2013 at 3:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 It feels a bit like unpredictable magic to have DEFAULT mean one
 thing and omitted columns mean something else.

 Agreed.  The current code behaves that way, but I think that's
 indisputably a bug not behavior we want to keep.

 I'm not entirely convinced that's a bug.  Both behaviors seem useful,
 and there has to be some way to specify each one.

I would love it if we had a way to provide remote-default
functionality.  But per SQL spec these should produce the same results:
INSERT INTO t(f1, f2) VALUES(1, DEFAULT);
INSERT INTO t(f1) VALUES(1);
If PG fails to work like that, it's not a feature, it's a bug.
Where the default is coming from is not a justification for failing
the POLA like that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-14 Thread Daniel Farina
On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 Okay, I see.  So inverting the thinking I wrote earlier: how about
 hearkening carefully to any ParameterStatus messages on the local side
 before entering the inner loop of dblink.c:materializeResult as to set
 the local GUC (and carefully dropping it back off after
 materializeResult) so that the the _in functions can evaluate the
 input in the same relevant GUC context as the remote side?

 Yeah, watching the remote side's datestyle and intervalstyle and
 matching them (for both input and output) would probably work.

Alright, so I've been whacking at this and there's one interesting
thing to ask about: saving and restoring the local GUCs.  There are a
bunch of things about GUCs besides their value that are maintained,
such as their 'source', so writing a little ad-hoc save/restore is not
going to do the right thing.  Luckily, something much closer to the
right thing is done for SET LOCAL with transactions and
subtransactions, to push and pop GUC contexts:

  guc.h:

  extern int NewGUCNestLevel(void);
  extern void AtEOXact_GUC(bool isCommit, int nestLevel);

By and large looking at the mechanics of these two functions, the
latter in particular has very little to do with the transaction
machinery in general.  It's already being called from a bunch of
places that don't, to me, seem to really indicate a intrinsic
connection to transactions, e.g. do_analyze_rel, ri_triggers, and
postgres_fdw.  I think in those cases the justification for settings
of 'isCommit' is informed by the mechanism more than the symbol name
or its comments.  I count about ten call sites that are like this.

So, I can add one more such use of AtEOXact_GUC for the dblink fix,
but would it also be appropriate to find some new names for the
concepts (instead of AtEOXact_GUC and isCommit) here to more
accurately express what's going on?

I'll perhaps do that after finishing up the dblink fix if I receive
some positive response on the matter.  However, if for some reason I
*shouldn't* use that machinery in dblink, I'd appreciate the guidance.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-14 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, watching the remote side's datestyle and intervalstyle and
 matching them (for both input and output) would probably work.

 Alright, so I've been whacking at this and there's one interesting
 thing to ask about: saving and restoring the local GUCs.  There are a
 bunch of things about GUCs besides their value that are maintained,
 such as their 'source', so writing a little ad-hoc save/restore is not
 going to do the right thing.

Right, you should use NewGUCNestLevel/AtEOXact_GUC.  Look at the fixes
I committed in postgres_fdw a day or two ago for an example.

 So, I can add one more such use of AtEOXact_GUC for the dblink fix,
 but would it also be appropriate to find some new names for the
 concepts (instead of AtEOXact_GUC and isCommit) here to more
 accurately express what's going on?

Meh.  I guess we could invent an EndGUCNestLevel that's a wrapper
around AtEOXact_GUC, but I'm not that excited about it ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-14 Thread Robert Haas
On Mon, Mar 11, 2013 at 3:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 It feels a bit like unpredictable magic to have DEFAULT mean one
 thing and omitted columns mean something else.

 Agreed.  The current code behaves that way, but I think that's
 indisputably a bug not behavior we want to keep.

I'm not entirely convinced that's a bug.  Both behaviors seem useful,
and there has to be some way to specify each one.

But I just work here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-13 Thread Albe Laurenz
Tom Lane wrote:
 Yeah, I'm drifting towards the position that we should just define the
 defaults as being whatever they are locally, rather than trying to be
 cute about supporting remotely-executed defaults.  It looks to me like
 if we try to do the latter, we're going to have pitfalls and weird
 corner cases that will never be quite transparent.  There's also the
 argument that this'd be a lot of work that benefits only some FDWs,
 since the whole concept of remote column defaults doesn't apply when
 the FDW's data source isn't a traditional RDBMS.

That was my first thought on the topic, to have a solution that
is simple (if not perfect).
Your argument that it would be unpleasant to lose the ability
to use sequence-generated remote default values made me reconsider.

But there is a workaround, namely to use a trigger before insert
to generate an automatic primary key (e.g. if the inserted value is
NULL).
Maybe it would be good to add a few hints at workarounds like that
to the documentation if it's going to be local defaults.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-13 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Yeah, I'm drifting towards the position that we should just define the
 defaults as being whatever they are locally, rather than trying to be
 cute about supporting remotely-executed defaults.

 That was my first thought on the topic, to have a solution that
 is simple (if not perfect).
 Your argument that it would be unpleasant to lose the ability
 to use sequence-generated remote default values made me reconsider.

 But there is a workaround, namely to use a trigger before insert
 to generate an automatic primary key (e.g. if the inserted value is
 NULL).

Another attack is to set up a different foreign table pointing to the
same remote table, but lacking the sequence column.  When you insert via
that table, you'll get the remote's default for the hidden column.
This doesn't require any weird triggers on the remote side, but it could
be hard to persuade existing apps to use the second foreign table.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-13 Thread Thom Brown
On 13 March 2013 19:04, Tom Lane t...@sss.pgh.pa.us wrote:
 Albe Laurenz laurenz.a...@wien.gv.at writes:
 Yeah, I'm drifting towards the position that we should just define the
 defaults as being whatever they are locally, rather than trying to be
 cute about supporting remotely-executed defaults.

 That was my first thought on the topic, to have a solution that
 is simple (if not perfect).
 Your argument that it would be unpleasant to lose the ability
 to use sequence-generated remote default values made me reconsider.

 But there is a workaround, namely to use a trigger before insert
 to generate an automatic primary key (e.g. if the inserted value is
 NULL).

 Another attack is to set up a different foreign table pointing to the
 same remote table, but lacking the sequence column.  When you insert via
 that table, you'll get the remote's default for the hidden column.
 This doesn't require any weird triggers on the remote side, but it could
 be hard to persuade existing apps to use the second foreign table.

How about:

CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server;

That way it will pass DEFAULT through to the remote table as it's
defined on the table.  Users can then explicitly insert values, or
select the default, which will configured to ensure the default on the
remote server is used... although I suspect I'm overlooking something
here.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-13 Thread Tom Lane
Thom Brown t...@linux.com writes:
 How about:

 CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server;

 That way it will pass DEFAULT through to the remote table as it's
 defined on the table.  Users can then explicitly insert values, or
 select the default, which will configured to ensure the default on the
 remote server is used... although I suspect I'm overlooking something
 here.

Yeah ... how to implement it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-12 Thread Albe Laurenz
Tom Lane wrote:
 Thom Brown t...@linux.com writes:
 Out of curiosity, is there any way to explicitly force a foreign
 DEFAULT with column-omission?

 I've concluded that the ideal behavior probably is that if you have
 declared a DEFAULT expression for a foreign table's column, then that's
 what the default is for the purpose of inserts or updates through the
 foreign table; but if you haven't, then (at least for postgres_fdw)
 the effective default is whatever the remote table has.

I agree.

 I thought at first that we could fix this, and the related case
 
   update foreigntable set somecolumn = default
 
 with some relatively localized hacking in the rewriter.  However, that
 idea fell down when I looked at multi-row inserts:
 
   insert into foreigntable
 values (x, y, z), (a, default, b), (c, d, default), ...
 
 The current implementation of this requires substituting the appropriate
 column default expressions into the VALUES lists at rewrite time.
 That's okay for a default expression that is known locally and should be
 evaluated locally; but I see absolutely no practical way to make it work
 if we'd like to have the defaults inserted remotely.  We'd need to have
 some out-of-band indication that a row being returned by the ValuesScan
 node had had default placeholders in particular columns --- and I just
 can't see us doing the amount of violence that would need to be done to
 the executor to make that happen.  Especially not in the 9.3 timeframe.
 
 So one possible answer is to adopt the ignore-remote-defaults semantics
 I suggested above, but I don't much like that from a usability standpoint.
 
 Another idea is to throw a not implemented error on the specific case
 of a multi-row VALUES with DEFAULT placeholders when the target is a
 foreign table.  That's pretty grotty, not least because it would have to
 reject the case for all foreign tables not just postgres_fdw ones.  But
 it would give us some wiggle room to implement more desirable semantics
 in the cases that we can handle reasonably.
 
 Thoughts?

Do you think that it is possible to insert remote defaults
by omitting columns like this:

INSERT INTO foreigntable (col1, col3) VALUES (a, c);

If that can be made to work, then my opinion is that throwing an
error on

INSERT INTO foreigntable (col1, col2, col3) VALUES (a, DEFAULT, c);

would be acceptable, because there is at least a workaround.

If the first variant also cannot be made to work with remote
defaults, then I'd say that the best is to use local
defaults throughout and accept the loss of usability.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-12 Thread Daniel Farina
On Mon, Mar 11, 2013 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Daniel Farina dan...@heroku.com writes:
 I will try to make time for this, although it seems like the general
 approach should match pgsql_fdw if possible.  Is the current thinking
 to forward the settings and then use the GUC hooks to track updates?

 That's not what I had in mind for postgres_fdw --- rather the idea is to
 avoid needing on-the-fly changes in remote-side settings, because those
 are so expensive to make.  However, postgres_fdw is fortunate in that
 the SQL it expects to execute on the remote side is very constrained.
 dblink might need a different solution that would leave room for
 user-driven changes of remote-side settings.

Okay, I see.  So inverting the thinking I wrote earlier: how about
hearkening carefully to any ParameterStatus messages on the local side
before entering the inner loop of dblink.c:materializeResult as to set
the local GUC (and carefully dropping it back off after
materializeResult) so that the the _in functions can evaluate the
input in the same relevant GUC context as the remote side?

That should handle SET actions executed remotely.

Otherwise it seems like a solution would have to be ambitious enough
to encompass reifying the GUCs from the afflicted parsers, which I
surmise is not something that we want to treat right now.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-12 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Okay, I see.  So inverting the thinking I wrote earlier: how about
 hearkening carefully to any ParameterStatus messages on the local side
 before entering the inner loop of dblink.c:materializeResult as to set
 the local GUC (and carefully dropping it back off after
 materializeResult) so that the the _in functions can evaluate the
 input in the same relevant GUC context as the remote side?

Yeah, watching the remote side's datestyle and intervalstyle and
matching them (for both input and output) would probably work.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-12 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Do you think that it is possible to insert remote defaults
 by omitting columns like this:
 INSERT INTO foreigntable (col1, col3) VALUES (a, c);

Well, that's how it works right now, but it's not good that it's
inconsistent with the explicit-DEFAULT case.

 If that can be made to work, then my opinion is that throwing an
 error on
 INSERT INTO foreigntable (col1, col2, col3) VALUES (a, DEFAULT, c);
 would be acceptable, because there is at least a workaround.

Aside from the oddness of not supporting that when it's equivalent
to the other case, what about this:

UPDATE foreigntable SET col2 = DEFAULT WHERE ...

There is no simple workaround if we don't provide support for that.

 If the first variant also cannot be made to work with remote
 defaults, then I'd say that the best is to use local
 defaults throughout and accept the loss of usability.

Yeah, I'm drifting towards the position that we should just define the
defaults as being whatever they are locally, rather than trying to be
cute about supporting remotely-executed defaults.  It looks to me like
if we try to do the latter, we're going to have pitfalls and weird
corner cases that will never be quite transparent.  There's also the
argument that this'd be a lot of work that benefits only some FDWs,
since the whole concept of remote column defaults doesn't apply when
the FDW's data source isn't a traditional RDBMS.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Sun, Mar 10, 2013 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
 which is that postgres_fdw is vulnerable to problems if the remote
 server is using different GUC settings than it is for things like
 timezone and datestyle.

 Forgive my naivety: why would a timestamptz value not be passed
 through the _in/_out function locally at least once (hence, respecting
 local GUCs) when using the FDW?  Is the problem overhead in not
 wanting to parse and re-format the value on the local server?

No, the problem is that ambiguous dates may be transferred incorrectly
to or from the remote server.  Once a timestamp value is inside our
server, we are responsible for getting it to the remote end accurately,
IMO.

Here's an example using the loopback server that's set up by the
postgres_fdw regression test:

contrib_regression=# show datestyle; -- this is the style the remote session 
will be using
 DateStyle 
---
 ISO, MDY
(1 row)

contrib_regression=# create table remote (f1 timestamptz);
CREATE TABLE
contrib_regression=# create foreign table ft (f1 timestamptz) server loopback 
options (table_name 'remote');
CREATE FOREIGN TABLE
contrib_regression=# set datestyle = german, dmy;
SET
contrib_regression=# select now();
  now   

 11.03.2013 09:40:17.401173 EDT
(1 row)

contrib_regression=# insert into ft values(now());
INSERT 0 1
contrib_regression=# select *, now() from ft;
   f1   |  now  
+---
 03.11.2013 08:40:58.682679 EST | 11.03.2013 09:41:30.96724 EDT
(1 row)

The remote end has entirely misinterpreted the day vs month fields.
Now, to hit this you need to be using a datestyle which will print
in an ambiguous format, so the ISO and Postgres styles are
not vulnerable; but German and SQL are.

 I suppose that means any non-immutable in/out function pair may have
 to be carefully considered, and the list is despairingly long...but
 finite:

A look at pg_dump says that it only worries about setting datestyle,
intervalstyle, and extra_float_digits.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Josh Berkus

 The remote end has entirely misinterpreted the day vs month fields.
 Now, to hit this you need to be using a datestyle which will print
 in an ambiguous format, so the ISO and Postgres styles are
 not vulnerable; but German and SQL are.

Is the timezone GUC a problem, also, for this?  Seems like that could
be much worse ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 The remote end has entirely misinterpreted the day vs month fields.
 Now, to hit this you need to be using a datestyle which will print
 in an ambiguous format, so the ISO and Postgres styles are
 not vulnerable; but German and SQL are.

 Is the timezone GUC a problem, also, for this?  Seems like that could
 be much worse ...

I'm not sure why you think being off by an hour is much worse than
being off by nine months ;-).  But anyway, timezone seems to be mostly a
cosmetic issue, since timestamptz values will always get printed with an
explicit zone value.  I think you could possibly burn yourself if a
foreign table were declared as being type timestamp when the underlying
column is really timestamptz ... but that kind of misconfiguration would
probably create a lot of misbehaviors above and beyond this one.

Having said that, I'd still be inclined to try to set the remote's
timezone GUC just so that error messages coming back from the remote
don't reflect a randomly different timezone, which was the basic issue
in the buildfarm failures we saw yesterday.  OTOH, there is no guarantee
at all that the remote has the same timezone database we do, so it may
not know the zone or may think it has different DST rules than we think;
so it's not clear how far we can get with that.  Maybe we should just
set the remote session's timezone to GMT always.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Josh Berkus

 Having said that, I'd still be inclined to try to set the remote's
 timezone GUC just so that error messages coming back from the remote
 don't reflect a randomly different timezone, which was the basic issue
 in the buildfarm failures we saw yesterday.  OTOH, there is no guarantee
 at all that the remote has the same timezone database we do, so it may
 not know the zone or may think it has different DST rules than we think;
 so it's not clear how far we can get with that.  Maybe we should just
 set the remote session's timezone to GMT always.

Yeah, that seems the safest choice.  What are the potential drawbacks,
if any?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Column defaults for foreign tables (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Tom Lane
I wrote:
 Thom Brown t...@linux.com writes:
 Out of curiosity, is there any way to explicitly force a foreign
 DEFAULT with column-omission?

 That's one of the things that would have to be worked out before
 we could implement anything here.  The easy answer would be that DEFAULT
 specifies the local default, and only if you omit the column entirely
 from the local command (including not having a local default) does the
 remote default take effect.  But whether that would be convenient to
 use is hard to tell.

 Another thing that would be easy to implement is to say that the new row
 value is fully determined locally (including defaults if any) and remote
 defaults have nothing to do with it.  But I think that's almost
 certainly a usability fail --- imagine that the remote has a
 sequence-generated primary key, for instance.  I think it's probably
 necessary to permit remote insertion of defaults for that sort of table
 definition to work conveniently.

I looked into this a bit, and realize that the code-as-committed is
already not self-consistent, because these give different results:

insert into foreigntable default values;

insert into foreigntable values(default, default, ...);

The former case inserts whatever the remote-side default values are.
The latter case inserts NULLs, regardless of the remote defaults,
because that's what the query is expanded to by the rewriter.  So it
seems like this is something we must fix for 9.3, because otherwise
we're going to have backwards-compatibility issues if we try to change
the behavior later.

I've concluded that the ideal behavior probably is that if you have
declared a DEFAULT expression for a foreign table's column, then that's
what the default is for the purpose of inserts or updates through the
foreign table; but if you haven't, then (at least for postgres_fdw)
the effective default is whatever the remote table has.

I thought at first that we could fix this, and the related case

update foreigntable set somecolumn = default

with some relatively localized hacking in the rewriter.  However, that
idea fell down when I looked at multi-row inserts:

insert into foreigntable
  values (x, y, z), (a, default, b), (c, d, default), ...

The current implementation of this requires substituting the appropriate
column default expressions into the VALUES lists at rewrite time.
That's okay for a default expression that is known locally and should be
evaluated locally; but I see absolutely no practical way to make it work
if we'd like to have the defaults inserted remotely.  We'd need to have
some out-of-band indication that a row being returned by the ValuesScan
node had had default placeholders in particular columns --- and I just
can't see us doing the amount of violence that would need to be done to
the executor to make that happen.  Especially not in the 9.3 timeframe.

So one possible answer is to adopt the ignore-remote-defaults semantics
I suggested above, but I don't much like that from a usability standpoint.

Another idea is to throw a not implemented error on the specific case
of a multi-row VALUES with DEFAULT placeholders when the target is a
foreign table.  That's pretty grotty, not least because it would have to
reject the case for all foreign tables not just postgres_fdw ones.  But
it would give us some wiggle room to implement more desirable semantics
in the cases that we can handle reasonably.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-11 Thread Greg Stark
On Sun, Mar 10, 2013 at 10:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another thing that would be easy to implement is to say that the new row
 value is fully determined locally (including defaults if any) and remote
 defaults have nothing to do with it.  But I think that's almost
 certainly a usability fail --- imagine that the remote has a
 sequence-generated primary key, for instance.  I think it's probably
 necessary to permit remote insertion of defaults for that sort of table
 definition to work conveniently.

It feels a bit like unpredictable magic to have DEFAULT mean one
thing and omitted columns mean something else. Perhaps we should have
an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
omitted columns both mean the same thing.

This starts getting a bit weird if you start to ask what happens when
the remote table is itself an FDW though


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-11 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 It feels a bit like unpredictable magic to have DEFAULT mean one
 thing and omitted columns mean something else.

Agreed.  The current code behaves that way, but I think that's
indisputably a bug not behavior we want to keep.

 Perhaps we should have
 an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
 omitted columns both mean the same thing.

I don't think we really want to introduce new syntax for this, do you?
Especially not when many FDWs won't have a notion of a remote default
at all.

My thought was that the ideal behavior is that there's only one default
for a column, with any local definition of it taking precedence over any
remote definition.  But see later message about how that may be hard to
implement correctly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-11 Thread Thom Brown
On 11 March 2013 19:00, Greg Stark st...@mit.edu wrote:
 On Sun, Mar 10, 2013 at 10:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another thing that would be easy to implement is to say that the new row
 value is fully determined locally (including defaults if any) and remote
 defaults have nothing to do with it.  But I think that's almost
 certainly a usability fail --- imagine that the remote has a
 sequence-generated primary key, for instance.  I think it's probably
 necessary to permit remote insertion of defaults for that sort of table
 definition to work conveniently.

 It feels a bit like unpredictable magic to have DEFAULT mean one
 thing and omitted columns mean something else. Perhaps we should have
 an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
 omitted columns both mean the same thing.

 This starts getting a bit weird if you start to ask what happens when
 the remote table is itself an FDW though

We could have something like:

CREATE FOREIGN TABLE ...
... OPTION (default locality);

where locality is 'local' or 'remote'.  But then this means it still
can't be specified in individual queries, or have a different locality
between columns on the same foreign table.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Having said that, I'd still be inclined to try to set the remote's
 timezone GUC just so that error messages coming back from the remote
 don't reflect a randomly different timezone, which was the basic issue
 in the buildfarm failures we saw yesterday.  OTOH, there is no guarantee
 at all that the remote has the same timezone database we do, so it may
 not know the zone or may think it has different DST rules than we think;
 so it's not clear how far we can get with that.  Maybe we should just
 set the remote session's timezone to GMT always.

 Yeah, that seems the safest choice.  What are the potential drawbacks,
 if any?

Hard to tell if there are any without testing it.

I remembered that there's a relatively inexpensive way to set GUC values
transiently within an operation, which is GUC_ACTION_SAVE; both
extension.c and ri_triggers.c are relying on that.  So here's my
proposal for a fix:

* To make the remote end transmit values unambiguously, send SET
commands for the GUCs listed below during remote session setup.
(postgres_fdw is already assuming that such SETs will persist for the
whole session.)

* To make our end transmit values unambiguously, use GUC_ACTION_SAVE to
transiently change the GUCs listed below whenever we are converting
values to text form to send to the remote end.  (This would include
deparsing of Const nodes as well as transmission of query parameters.)

* Judging from the precedent of pg_dump, these are the things we ought
to set this way:

DATESTYLE = ISO

INTERVALSTYLE = POSTGRES (skip on remote side, if version  8.4)

EXTRA_FLOAT_DIGITS = 3 (or 2 on remote side, if version  9.0)

* In addition I propose we set TIMEZONE = UTC on the remote side only.
This is, I believe, just a cosmetic hack so that timestamptz values
coming back in error messages will be printed consistently; it would let
us revert the kluge solution I put in place for this type of regression
failure:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_fireflydt=2013-03-10%2018%3A30%3A00

BTW, it strikes me that dblink is probably subject to at least some of
these same failure modes.  I'm not personally volunteering to fix any
of this in dblink, but maybe someone ought to look into that.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Daniel Farina
On Mon, Mar 11, 2013 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, it strikes me that dblink is probably subject to at least some of
 these same failure modes.  I'm not personally volunteering to fix any
 of this in dblink, but maybe someone ought to look into that.

I will try to make time for this, although it seems like the general
approach should match pgsql_fdw if possible.  Is the current thinking
to forward the settings and then use the GUC hooks to track updates?

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-11 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 I will try to make time for this, although it seems like the general
 approach should match pgsql_fdw if possible.  Is the current thinking
 to forward the settings and then use the GUC hooks to track updates?

That's not what I had in mind for postgres_fdw --- rather the idea is to
avoid needing on-the-fly changes in remote-side settings, because those
are so expensive to make.  However, postgres_fdw is fortunate in that
the SQL it expects to execute on the remote side is very constrained.
dblink might need a different solution that would leave room for
user-driven changes of remote-side settings.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]

Applied after rather extensive editorialization.  DELETE RETURNING in
particular was a mess, and I also tried to make SELECT FOR UPDATE behave
in what seemed like a sane fashion.

There's a lot left to do here of course.  One thing I was wondering
about was why we don't allow DEFAULTs to be attached to foreign-table
columns.  There was no use in it before, but it seems sensible enough
now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-10 Thread Tom Lane
I wrote:
 There's a lot left to do here of course.  One thing I was wondering
 about was why we don't allow DEFAULTs to be attached to foreign-table
 columns.  There was no use in it before, but it seems sensible enough
 now.

Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
which is that postgres_fdw is vulnerable to problems if the remote
server is using different GUC settings than it is for things like
timezone and datestyle.  The failure seen here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_fireflydt=2013-03-10%2018%3A30%3A00
is basically just cosmetic, but it's not hard to imagine non-cosmetic
problems coming up.  For instance, suppose our instance is running in
DMY datestyle and transmits an ambiguous date to a remote running in
MDY datestyle.

We could consider sending our settings to the remote at connection
establishment, but that doesn't seem terribly bulletproof --- what if
someone does a local SET later?  What seems safer is to set the remote
to ISO style always, but then we have to figure out how to get the local
timestamptz_out to emit that style without touching our local GUC.
Ugh.

(One more reason why GUCs that affect application-visible semantics are
dangerous.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Andrew Dunstan


On 03/10/2013 02:32 PM, Tom Lane wrote:

Kohei KaiGai kai...@kaigai.gr.jp writes:

[ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]

Applied after rather extensive editorialization.  DELETE RETURNING in
particular was a mess, and I also tried to make SELECT FOR UPDATE behave
in what seemed like a sane fashion.

There's a lot left to do here of course.  One thing I was wondering
about was why we don't allow DEFAULTs to be attached to foreign-table
columns.  There was no use in it before, but it seems sensible enough
now.


Excellent news. But I noticed as I went to update my non-writeable FDW 
that this has happened in the regression tests. Is this correct?


*** 
/home/pgl/npgl/fdw/file_text_array_fdw/expected/file_textarray_fdw.out 
2013-03-10 16:28:00.120340629 -0400
--- 
/home/pgl/npgl/fdw/file_text_array_fdw/results/file_textarray_fdw.out 
2013-03-10 16:28:00.595340910 -0400

***
*** 188,196 
  LINE 1: DELETE FROM agg_csv_array WHERE a = 100;
  ^
  SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
! ERROR:  SELECT FOR UPDATE/SHARE cannot be used with foreign table 
agg_csv_array

! LINE 1: SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
!   ^
  -- but this should be ignored
  SELECT * FROM agg_csv_array FOR UPDATE;
t
--- 188,200 
  LINE 1: DELETE FROM agg_csv_array WHERE a = 100;
  ^
  SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
!   t
! --
!  {100,99.097}
!  {0,0.09561}
!  {42,324.78}
! (3 rows)
!
  -- but this should be ignored
  SELECT * FROM agg_csv_array FOR UPDATE;
t


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Thom Brown
On 10 March 2013 18:32, Tom Lane t...@sss.pgh.pa.us wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]

 Applied after rather extensive editorialization.  DELETE RETURNING in
 particular was a mess, and I also tried to make SELECT FOR UPDATE behave
 in what seemed like a sane fashion.

 There's a lot left to do here of course.  One thing I was wondering
 about was why we don't allow DEFAULTs to be attached to foreign-table
 columns.  There was no use in it before, but it seems sensible enough
 now.

Yes...

postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
'okapi', NULL);
ERROR:  null value in column id violates not-null constraint
DETAIL:  Failing row contains (null, okapi, null).
CONTEXT:  Remote SQL command: INSERT INTO public.animals(id, animal,
age) VALUES ($1, $2, $3)

Out of curiosity, is there any way to explicitly force a foreign
DEFAULT with column-omission?

--
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Thom Brown
On 10 March 2013 20:38, Thom Brown t...@linux.com wrote:
 On 10 March 2013 18:32, Tom Lane t...@sss.pgh.pa.us wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]

 Applied after rather extensive editorialization.  DELETE RETURNING in
 particular was a mess, and I also tried to make SELECT FOR UPDATE behave
 in what seemed like a sane fashion.

 There's a lot left to do here of course.  One thing I was wondering
 about was why we don't allow DEFAULTs to be attached to foreign-table
 columns.  There was no use in it before, but it seems sensible enough
 now.

 Yes...

 postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
 'okapi', NULL);
 ERROR:  null value in column id violates not-null constraint
 DETAIL:  Failing row contains (null, okapi, null).
 CONTEXT:  Remote SQL command: INSERT INTO public.animals(id, animal,
 age) VALUES ($1, $2, $3)

 Out of curiosity, is there any way to explicitly force a foreign
 DEFAULT with column-omission?

Looks like we'll also need tab-completion for UPDATE, INSERT and
DELETE statements on foreign tables.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Excellent news. But I noticed as I went to update my non-writeable FDW 
 that this has happened in the regression tests. Is this correct?

Yeah, see the adjustment I made in the file_fdw test (which that looks
to be borrowed from).

The new theory is that SELECT FOR UPDATE is allowed on foreign tables,
and if the FDW doesn't do anything to implement it, it's just a no-op.

I looked into having the core code throw an error, but it was a pain
in the rear and of dubious merit anyway (since you can't really tell
for sure from outside the FDW whether the FDW did anything or whether
there's even any need to do anything for the particular data source).
Besides, the old behavior was less than consistent, since it only
complained when the FOR UPDATE directly mentioned the foreign table,
though that's not what the semantics are supposed to be.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-10 Thread Tom Lane
Thom Brown t...@linux.com writes:
 On 10 March 2013 18:32, Tom Lane t...@sss.pgh.pa.us wrote:
 There's a lot left to do here of course.  One thing I was wondering
 about was why we don't allow DEFAULTs to be attached to foreign-table
 columns.  There was no use in it before, but it seems sensible enough
 now.

 postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
 'okapi', NULL);
 ERROR:  null value in column id violates not-null constraint
 DETAIL:  Failing row contains (null, okapi, null).

 Out of curiosity, is there any way to explicitly force a foreign
 DEFAULT with column-omission?

That's one of the things that would have to be worked out before
we could implement anything here.  The easy answer would be that DEFAULT
specifies the local default, and only if you omit the column entirely
from the local command (including not having a local default) does the
remote default take effect.  But whether that would be convenient to
use is hard to tell.

Another thing that would be easy to implement is to say that the new row
value is fully determined locally (including defaults if any) and remote
defaults have nothing to do with it.  But I think that's almost
certainly a usability fail --- imagine that the remote has a
sequence-generated primary key, for instance.  I think it's probably
necessary to permit remote insertion of defaults for that sort of table
definition to work conveniently.

Not real sure what the ideal behavior would be or how hard it would be
to implement.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: postgres_fdw vs data formatting GUCs (was Re: [HACKERS] [v9.3] writable foreign tables)

2013-03-10 Thread Daniel Farina
On Sun, Mar 10, 2013 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 There's a lot left to do here of course.  One thing I was wondering
 about was why we don't allow DEFAULTs to be attached to foreign-table
 columns.  There was no use in it before, but it seems sensible enough
 now.

 Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
 which is that postgres_fdw is vulnerable to problems if the remote
 server is using different GUC settings than it is for things like
 timezone and datestyle.  The failure seen here:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_fireflydt=2013-03-10%2018%3A30%3A00
 is basically just cosmetic, but it's not hard to imagine non-cosmetic
 problems coming up.  For instance, suppose our instance is running in
 DMY datestyle and transmits an ambiguous date to a remote running in
 MDY datestyle.

 We could consider sending our settings to the remote at connection
 establishment, but that doesn't seem terribly bulletproof --- what if
 someone does a local SET later?  What seems safer is to set the remote
 to ISO style always, but then we have to figure out how to get the local
 timestamptz_out to emit that style without touching our local GUC.
 Ugh.

Forgive my naivety: why would a timestamptz value not be passed
through the _in/_out function locally at least once (hence, respecting
local GUCs) when using the FDW?  Is the problem overhead in not
wanting to parse and re-format the value on the local server?

Although it seems that if GUCs affected *parsing* then the problem
comes back again, since one may choose to canonicalize output from a
remote server (e.g. via setting ISO time in all cases) but should the
user have set up GUCs to interpret input in a particular way for a
data type that is not enough.

As-is this situation is already technically true for timestamptz in
the case of time stamps without any explicit time offset or time zone,
but since timestamptz_out doesn't ever render without the offset
(right?)  it's not a problem.  Close shave, though, and one that an
extension author could easily find themselves on the wrong side of.

I suppose that means any non-immutable in/out function pair may have
to be carefully considered, and the list is despairingly long...but
finite:

SELECT proname
FROM pg_proc WHERE EXISTS
  (SELECT 1
   FROM pg_type
   WHERE pg_proc.oid = pg_type.typinput OR
 pg_proc.oid = pg_type.typoutput OR
 pg_proc.oid = pg_type.typsend OR
 pg_proc.oid = pg_type.typreceive)
  AND provolatile != 'i';

 (One more reason why GUCs that affect application-visible semantics are
dangerous.)

:(

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-04 Thread Kohei KaiGai
2013/3/3 Tom Lane t...@sss.pgh.pa.us:
 Craig Ringer cr...@2ndquadrant.com writes:
 On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
 The attached patch adds Daniel's reworks on make_modifytable
 invocation, and add a short comment on add_base_rels_to_query(). Rest
 of portion has not been changed from the previous version.

 How's this looking for 9.3? On-list discussion seems to have been
 positive but inconclusive and time's running out. Do you think this can
 be turned into a production-worthy feature in the next week or two?

 I think it needs major changes.  The portion against
 contrib/postgres_fdw fails to apply at all, of course, but that's my
 fault for having hacked so much on postgres_fdw before committing it.
 More generally, I don't much like the approach to ctid-substitute
 columns --- I think hacking on the rel's tupledesc like that is
 guaranteed to break things all over the place.  The assorted ugly
 kluges that are already in the patch because of it are just scratching
 the surface, and there may well be consequences that are flat out
 unfixable.  Probably the resjunk-columns mechanism would offer a better
 solution.

Probably, the latest patch takes an approach that utilizes resjunk-columns
that moves remote row-identifier on scan stage to modify stage, but no
hacks on tupledesc.
The new GetForeignRelWidth API allows FDW drivers to append slots
to return a few pseudo-columns to upper level scan. It can contain
a remote row-identifier of the row to be modified.
Also, rewriteTargetListUD() injects a junk target-entry to reference this
pseudo-column on update or delete from foreign tables as regular
table is doing on ctid.

Regarding to the portion towards postgres_fdw, I'm under reworking
on the part-2 of this patch to apply cleanly.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-03 Thread Craig Ringer
On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
 The attached patch adds Daniel's reworks on make_modifytable
 invocation, and add a short comment on add_base_rels_to_query(). Rest
 of portion has not been changed from the previous version.
How's this looking for 9.3? On-list discussion seems to have been
positive but inconclusive and time's running out. Do you think this can
be turned into a production-worthy feature in the next week or two?

A quick look at the patch shows that it includes reasonable-looking
documentation changes. I didn't see any regression test suite changes,
though; either I missed them or that's something that probably needs
addressing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-03 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
 The attached patch adds Daniel's reworks on make_modifytable
 invocation, and add a short comment on add_base_rels_to_query(). Rest
 of portion has not been changed from the previous version.

 How's this looking for 9.3? On-list discussion seems to have been
 positive but inconclusive and time's running out. Do you think this can
 be turned into a production-worthy feature in the next week or two?

I think it needs major changes.  The portion against
contrib/postgres_fdw fails to apply at all, of course, but that's my
fault for having hacked so much on postgres_fdw before committing it.
More generally, I don't much like the approach to ctid-substitute
columns --- I think hacking on the rel's tupledesc like that is
guaranteed to break things all over the place.  The assorted ugly
kluges that are already in the patch because of it are just scratching
the surface, and there may well be consequences that are flat out
unfixable.  Probably the resjunk-columns mechanism would offer a better
solution.

I had hoped to spend several days on this and perhaps get it into
committable shape, because I think this is a pretty significant feature
that will take FDWs over the line from curiosity to useful tool.
However, I've been hoping that for nigh two weeks now and not actually
had any cycles to spend on it ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-03-03 Thread Craig Ringer
On 03/03/2013 11:17 PM, Tom Lane wrote:
 Craig Ringer cr...@2ndquadrant.com writes:
 On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
 The attached patch adds Daniel's reworks on make_modifytable
 invocation, and add a short comment on add_base_rels_to_query(). Rest
 of portion has not been changed from the previous version.
 How's this looking for 9.3? On-list discussion seems to have been
 positive but inconclusive and time's running out. Do you think this can
 be turned into a production-worthy feature in the next week or two?
 I think it needs major changes.  The portion against
 contrib/postgres_fdw fails to apply at all, of course, but that's my
 fault for having hacked so much on postgres_fdw before committing it.
 More generally, I don't much like the approach to ctid-substitute
 columns --- I think hacking on the rel's tupledesc like that is
 guaranteed to break things all over the place.  The assorted ugly
 kluges that are already in the patch because of it are just scratching
 the surface, and there may well be consequences that are flat out
 unfixable.  Probably the resjunk-columns mechanism would offer a better
 solution.

 I had hoped to spend several days on this and perhaps get it into
 committable shape, because I think this is a pretty significant feature
 that will take FDWs over the line from curiosity to useful tool.
 However, I've been hoping that for nigh two weeks now and not actually
 had any cycles to spend on it ...
Do you have any further brief suggestions for things that KaiGai Kohei
or others could do to make your side of this process easier and reduce
the amount of your time it'll demand?

For now it seems this stays in hopefully-can-be-made-ready limbo. I'll
keep looking through the list.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-02-05 Thread Daniel Farina
On Fri, Feb 1, 2013 at 2:22 AM, Daniel Farina dan...@heroku.com wrote:
 On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I noticed the v10 patch cannot be applied on the latest master branch
 cleanly. The attached ones were rebased.

 Anyway, I'm looking at the first patch, which applies neatly. Thanks.

Hello,

My review is incomplete, but to the benefit of pipelining I wanted to
ask a couple of things and submit some changes for your consideration
while continuing to look at this.

So far, I've been trying to understand in very close detail how your
changes affect non-FDW related paths first, before delving into the
actual writable FDW functionality.  There's not that much in this
category, but there's one thing that gave me pause: the fact that the
target list (by convention, tlist) has to be pushed down from
planmain.c/query_planner all the way to a
fdwroutine-GetForeignRelWidth callsite in plancat.c/get_relation_info
(so even in the end, it is just pushed down -- never inspected or
modified).  In relative terms, it's a significant widening of
currently fairly short parameter lists in these procedures:

add_base_rels_to_query(PlannerInfo *root, List *tlist, Node *jtnode)
build_simple_rel(PlannerInfo *root, int relid, List *tlist, RelOptKind
reloptkind)
get_relation_info(PlannerInfo *root, Oid relationObjectId, bool
inhparent, List *tlist, RelOptInfo *rel)

In the case of all the other parameters except tlist, each is more
intimately related with the inner workings and mechanisms of the
procedure.  I wonder if there's a nice way that can train the reader's
eye that the tlist is simply pushed down rather than actively managed
at each level.  However, I can't immediately produce a way to both
achieve that that doesn't seem overwrought.  Perhaps a comment will
suffice.

Related to this, I found that make_modifytable grew a dependency on
PlannerInfo *root, and it seemed like a bunch of the arguments are
functionally related to that, so the attached patch that should be
able to be applied to yours tries to utilize that as often as
possible.  The weirdest thing in there is how make_modifytable has
been taught to call SS_assign_special_param, which has a side effect
on the PlannerInfo, but there exist exactly zero cases where one
*doesn't* want to do that prior to the (exactly two) call sites to
make_modifytable, so I pushed it in.  The second weirdest thing is
pushing in the handling of rowMarks (e.g. SELECT FOR UPDATE et al)

Let me know as to your thoughts, otherwise I'll keep moving on...

--
fdr


wfdw-rely-on-plannerinfo.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-02-01 Thread Daniel Farina
On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I noticed the v10 patch cannot be applied on the latest master branch
 cleanly. The attached ones were rebased.

Hello,

I'm just getting started looking at this, but notice that the second
patch relies on contrib/postgres_fdw to apply, but it's not clear to
me where to get the correct version of that.  One way to solve this
would be to tell me where to get a version of that, and another that
may work well would be to relay a Git repository with postgres_fdw and
the writable fdw changes accumulated.

I poked around on git.postgresql.org and github and wasn't able to
find a recent pushed copy of this.

Anyway, I'm looking at the first patch, which applies neatly. Thanks.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2013-02-01 Thread Kohei KaiGai
2013/2/1 Daniel Farina dan...@heroku.com:
 On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I noticed the v10 patch cannot be applied on the latest master branch
 cleanly. The attached ones were rebased.

 Hello,

 I'm just getting started looking at this, but notice that the second
 patch relies on contrib/postgres_fdw to apply, but it's not clear to
 me where to get the correct version of that.  One way to solve this
 would be to tell me where to get a version of that, and another that
 may work well would be to relay a Git repository with postgres_fdw and
 the writable fdw changes accumulated.

 I poked around on git.postgresql.org and github and wasn't able to
 find a recent pushed copy of this.

 Anyway, I'm looking at the first patch, which applies neatly. Thanks.

Thanks for your reviewing.

The second part assumes the latest (v5) postgres_fdw patch being
applied. It has been in status of ready-for-committer since last CF,
and nothing was changed.
  https://commitfest.postgresql.org/action/patch_view?id=940

If I oversight nothing, it is the latest version I reviewed before.
Is there somebody who can volunteer to review his patch and commit it?

Hanada-san, if you have some updates, please share it with us.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-12-18 Thread Ronan Dunklau
Hello.

I've tried to implement this API for our Multicorn FDW, and I have a few
questions about the API.

First, I don't understand what are the requirements on the rowid
pseudo-column values.

In particular, this sentence from a previous mail makes it ambiguous to me:

 At the beginning, I constructed the rowid pseudo-column using
 INTERNALOID, but it made troubles when fetched tuples are
 materialized prior to table modification.
 So, the rowid argument of them are re-defined as const char *.

Does that mean that one should only store a cstring in the rowid
pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
building a text value using cstring_to_text_with_len. Could there be a
problem with that ?

Secondly, how does one use a returning clause ?
I've tried to look at the postgres_fdw code, but it does not seem to handle
that.

For what its worth, knowing that the postgres_fdw is still in WIP status,
here is a couple result of my experimentation with it:

- Insert into a foreign table mapped to a table with a before trigger,
using a returning clause, will return the values as they were before being
modified by the trigger.
- Same thing,  but if the trigger prevents insertion (returning NULL), the
would-have-been inserted row is still returned, although the insert
statement reports zero rows.
- Inserting into a table with default values does not work as intended,
since missing values are replaced by a null value in the remote statement.

What can be done to make the behaviour more consistent ?

I'm very excited about this feature, thank you for making this possible.

Regards,
--
Ronan Dunklau

2012/12/14 Albe Laurenz laurenz.a...@wien.gv.at

 Kohei KaiGai wrote:
  I came up with one more query that causes a problem:
 [...]
  This causes a deadlock, but one that is not detected;
  the query just keeps hanging.
 
  The UPDATE in the CTE has the rows locked, so the
  SELECT ... FOR UPDATE issued via the FDW connection will hang
  indefinitely.
 
  I wonder if that's just a pathological corner case that we shouldn't
  worry about.  Loopback connections for FDWs themselves might not
  be so rare, for example as a substitute for autonomous subtransactions.
 
  I guess it is not easily possible to detect such a situation or
  to do something reasonable about it.
 
  It is not avoidable problem due to the nature of distributed database
 system,
  not only loopback scenario.
 
  In my personal opinion, I'd like to wait for someone implements
 distributed
  lock/transaction manager on top of the background worker framework being
  recently committed, to intermediate lock request.
  However, it will take massive amount of efforts to existing
 lock/transaction
  management layer, not only enhancement of FDW APIs. It is obviously out
  of scope in this patch.
 
  So, I'd like to suggest authors of FDW that support writable features to
 put
  mention about possible deadlock scenario in their documentation.
  At least, above writable CTE example is a situation that two different
 sessions
  concurrently update the test relation, thus, one shall be blocked.

 Fair enough.

  I tried to overhaul the documentation, see the attached patch.
 
  There was one thing that I was not certain of:
  You say that for writable foreign tables, BeginForeignModify
  and EndForeignModify *must* be implemented.
  I thought that these were optional, and if you can do your work
  with just, say, ExecForeignDelete, you could do that.
 
  Yes, that's right. What I wrote was incorrect.
  If FDW driver does not require any state during modification of
  foreign tables, indeed, these are not mandatory handler.

 I have updated the documentation, that was all I changed in the
 attached patches.

  OK. I split the patch into two portion, part-1 is the APIs relevant
  patch, part-2 is relevant to postgres_fdw patch.

 Great.

 I'll mark the patch as ready for committer.

 Yours,
 Laurenz Albe


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] [v9.3] writable foreign tables

2012-12-18 Thread Kohei KaiGai
Hi,

2012/12/18 Ronan Dunklau rdunk...@gmail.com:
 Hello.

 I've tried to implement this API for our Multicorn FDW, and I have a few
 questions about the API.

 First, I don't understand what are the requirements on the rowid
 pseudo-column values.

 In particular, this sentence from a previous mail makes it ambiguous to me:


 At the beginning, I constructed the rowid pseudo-column using
 INTERNALOID, but it made troubles when fetched tuples are
 materialized prior to table modification.
 So, the rowid argument of them are re-defined as const char *.

 Does that mean that one should only store a cstring in the rowid
 pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
 building a text value using cstring_to_text_with_len. Could there be a
 problem with that ?

All what we require on the rowid pseudo-column is it has capability to
identify a particular row on the remote side. In case of postgres_fdw,
ctid of the relevant table is sufficient for the purpose.
I don't recommend to set the rowid field an arbitrary pointer, because
scanned tuple may be materialized between scanning and modifying
foreign table, thus, the arbitrary pointer shall be dealt under the
assumption of cstring data type.

 Secondly, how does one use a returning clause ?
 I've tried to look at the postgres_fdw code, but it does not seem to handle
 that.

 For what its worth, knowing that the postgres_fdw is still in WIP status,
 here is a couple result of my experimentation with it:

 - Insert into a foreign table mapped to a table with a before trigger,
 using a returning clause, will return the values as they were before being
 modified by the trigger.
 - Same thing,  but if the trigger prevents insertion (returning NULL), the
 would-have-been inserted row is still returned, although the insert
 statement reports zero rows.

Hmm. Do you want to see the real final contents of the rows being inserted,
don't you. Yep, the proposed interface does not have capability to modify
the supplied tuple on ExecForeignInsert or other methods.

Probably, it needs to adjust interface to allow FDW drivers to return
a modified HeapTuple or TupleTableSlot for RETURNING calculation.
(As trigger doing, it can return the given one as-is if no change)

Let me investigate the code around this topic.

 - Inserting into a table with default values does not work as intended,
 since missing values are replaced by a null value in the remote statement.

It might be a bug of my proof-of-concept portion at postgres_fdw.
The prepared INSERT statement should list up columns being actually
used only, not all ones unconditionally.

Thanks,

 What can be done to make the behaviour more consistent ?

 I'm very excited about this feature, thank you for making this possible.

 Regards,
 --
 Ronan Dunklau

 2012/12/14 Albe Laurenz laurenz.a...@wien.gv.at

 Kohei KaiGai wrote:
  I came up with one more query that causes a problem:
 [...]
  This causes a deadlock, but one that is not detected;
  the query just keeps hanging.
 
  The UPDATE in the CTE has the rows locked, so the
  SELECT ... FOR UPDATE issued via the FDW connection will hang
  indefinitely.
 
  I wonder if that's just a pathological corner case that we shouldn't
  worry about.  Loopback connections for FDWs themselves might not
  be so rare, for example as a substitute for autonomous subtransactions.
 
  I guess it is not easily possible to detect such a situation or
  to do something reasonable about it.
 
  It is not avoidable problem due to the nature of distributed database
  system,
  not only loopback scenario.
 
  In my personal opinion, I'd like to wait for someone implements
  distributed
  lock/transaction manager on top of the background worker framework being
  recently committed, to intermediate lock request.
  However, it will take massive amount of efforts to existing
  lock/transaction
  management layer, not only enhancement of FDW APIs. It is obviously out
  of scope in this patch.
 
  So, I'd like to suggest authors of FDW that support writable features to
  put
  mention about possible deadlock scenario in their documentation.
  At least, above writable CTE example is a situation that two different
  sessions
  concurrently update the test relation, thus, one shall be blocked.

 Fair enough.

  I tried to overhaul the documentation, see the attached patch.
 
  There was one thing that I was not certain of:
  You say that for writable foreign tables, BeginForeignModify
  and EndForeignModify *must* be implemented.
  I thought that these were optional, and if you can do your work
  with just, say, ExecForeignDelete, you could do that.
 
  Yes, that's right. What I wrote was incorrect.
  If FDW driver does not require any state during modification of
  foreign tables, indeed, these are not mandatory handler.

 I have updated the documentation, that was all I changed in the
 attached patches.

  OK. I split the patch into two portion, part-1 is the APIs relevant
  patch, part-2 

Re: [HACKERS] [v9.3] writable foreign tables

2012-12-12 Thread Erik Rijkers
On Wed, December 12, 2012 14:45, Kohei KaiGai wrote:

 pgsql-v9.3-writable-fdw-poc.v8.part-2.patch  151 k
 pgsql-v9.3-writable-fdw-poc.v8.part-1.patch   70 k

I wanted to have a look at this, and tried to apply part 1, en then part 2 on 
top of that (that's
the idea, right?)

part 1 applies and then part 2 does not.



Erik Rijkers










-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-12-12 Thread Albe Laurenz
Erik Rijkers wrote:
 pgsql-v9.3-writable-fdw-poc.v8.part-2.patch  151 k
 pgsql-v9.3-writable-fdw-poc.v8.part-1.patch   70 k
 
 I wanted to have a look at this, and tried to apply part 1, en then part 2 on 
 top of that (that's
 the idea, right?)
 
 part 1 applies and then part 2 does not.

Part 2 needs this prerequisite:
http://archives.postgresql.org/message-id/CAEZqfEcQtxn1JSjhC5usqhL4_n+Zck3mqo=rzedfpz+dawf...@mail.gmail.com

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-12-07 Thread Albe Laurenz
Kohei KaiGai wrote:
 The attached patch is revised version.
 
 One most difference from the previous version is, it constructed
 PoC features on Hanada-san's latest postgres-fdw.v5.patch.
 Yesh, it looks to me working fine on RDBMS backend also.

Cool.

 Even though the filename of this patch contains poc phrase,
 I think it may be time to consider adoption of the core regarding
 to the interface portion.
 (Of course, postgres_fdw is still works in progress.)

Ok, I'll try to review it with regard to that.

 Here is a few operation examples.
[...]
 postgres=# INSERT INTO tbl VALUES (7,'ggg'),(8,'hhh');
 INSERT 0 2

Weird, that fails for me.

CREATE TABLE test(
   id integer PRIMARY KEY,
   val text NOT NULL
);

CREATE FOREIGN TABLE rtest(
   id integer not null,
   val text not null
) SERVER loopback OPTIONS (nspname 'laurenz', relname 'test');

test= INSERT INTO test(id, val) VALUES (1, 'one');
INSERT 0 1
test= INSERT INTO rtest(id, val) VALUES (2, 'two');
The connection to the server was lost. Attempting reset: Failed.
!

Here is the stack trace:
317 RangeTblEntry  *rte = root-simple_rte_array[rtindex];
#0  0x00231cb9 in deparseInsertSql (buf=0xbfffafb0, root=0x9be3614, rtindex=1) 
at deparse.c:317
#1  0x0023018c in postgresPlanForeignModify (root=0x9be3614, plan=0x9be3278, 
resultRelaion=1, subplan=0x9be3bec)
at postgres_fdw.c:1538
#2  0x082a3ac2 in make_modifytable (root=0x9be3614, operation=CMD_INSERT, 
canSetTag=1 '\001', resultRelations=0x9be3c7c,
subplans=0x9be3c4c, returningLists=0x0, rowMarks=0x0, epqParam=0) at 
createplan.c:4787
#3  0x082a7ada in subquery_planner (glob=0x9be357c, parse=0x9be3304, 
parent_root=0x0, hasRecursion=0 '\0', tuple_fraction=0,
subroot=0xbfffb0ec) at planner.c:574
#4  0x082a71dc in standard_planner (parse=0x9be3304, cursorOptions=0, 
boundParams=0x0) at planner.c:200
#5  0x082a707b in planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) 
at planner.c:129
#6  0x0832c14c in pg_plan_query (querytree=0x9be3304, cursorOptions=0, 
boundParams=0x0) at postgres.c:753
#7  0x0832c1ec in pg_plan_queries (querytrees=0x9be3a98, cursorOptions=0, 
boundParams=0x0) at postgres.c:812
#8  0x0832c46e in exec_simple_query (query_string=0x9be267c INSERT INTO 
rtest(id, val) VALUES (2, 'two');) at postgres.c:977

(gdb) print root-simple_rte_array
$1 = (RangeTblEntry **) 0x0

The bug I reported in my last review does not seem to be fixed, either.
The symptoms are different now (with the definitions from above):

test= UPDATE rtest SET val='new' FROM rtest t2 WHERE rtest.id=t2.id AND t2.val 
LIKE '%e';
TRAP: FailedAssertion(!(baserel-relid == var-varno), File: foreign.c, 
Line: 601)
The connection to the server was lost. Attempting reset: Failed.
!

The same happens for DELETE ... USING.

A different failure happens if I join with a local table:

test= UPDATE rtest SET val='new' FROM test t2 WHERE rtest.id=t2.id AND t2.val 
= 'nonexist';
TRAP: FailedAssertion(!(const Node*)(fscan))-type) == T_ForeignScan)), 
File: postgres_fdw.c, Line: 1526)
The connection to the server was lost. Attempting reset: Failed.
!

I gave up testing the functionality after that.

 So, let's back to the main topic of this patch.
 According to the upthread discussion, I renamed the interface to inform
 expected width of result set as follows:
 
 +typedef AttrNumber (*GetForeignRelWidth_function) (PlannerInfo *root,
 +  RelOptInfo *baserel,
 +  Relation foreignrel,
 +  bool inhparent,
 +  List *targetList);
 
 It informs the core how many slots for regular and pseudo columns shall
 be acquired. If it is identical with number of attributed in foreign table
 definition, it also means this scan does not use any pseudo columns.
 A typical use case of pseudo column is rowid to move an identifier of
 remote row from scan stage to modify stage. It is responsibility of FDW
 driver to ensure rowid has uniqueness on the remote side; my
 enhancement on postgres_fdw uses ctid.
 
 get_pseudo_rowid_column() is a utility function that picks up an attribute
 number of pseudo rowid column if query rewriter injected on previous
 stage. If FDW does not support any other pseudo column features, the
 value to be returned is just return-value of this function.

Thanks, I think this will make the FDW writer's work easier.

 Other relevant APIs are as follows:
 
 +typedef List *(*PlanForeignModify_function) (PlannerInfo *root,
 +ModifyTable *plan,
 +Index resultRelation,
 +Plan *subplan);
 +
 I newly added this handler on construction of ModifyTable structure.
 Because INSERT command does not have scan stage directly connected
 with table modification, FDW driver has no chance to construct 

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-20 Thread Albe Laurenz
Kohei KaiGai wrote:
 This design tries to kill two-birds with one-stone.
 It enables to add multiple number of pseudo-columns,
 not only rowid, and makes possible to push-down
 complex calculation of target list into external computing
 resource.

 For example, when user gives the following query:

   SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

 it contains a complex calculation in the target-list,
 thus, it also takes CPU cycles of local process.

 If we can replace the ((c1 - c2) * (c2 - c3))^2 by
 a reference to a pseudo-column that also references
 the calculation result on external node, it effectively
 off-load CPU cycles.

 In this case, all we need to do is (1) acquire a slot
 for pseudo-column at GetForeignRelInfo (2) replace
 TargetEntry::expr by Var node that reference this
 pseudo-column.

 It makes sense for performance optimization, so I don't
 want to restrict this handler for rowid only.

 I understand.

 But I think that you still can do that with the change that
 I suggest.  I suggest that GetForeignRelInfo (or whatever the
 name ends up being) gets the AttrNumber of the proposed rowid
 column in addition to the parameters you need for what
 you want to do.

 Then nothing would keep you from defining those
 pseudo-columns.  But all the setup necessary for the rowid
 column could be moved out of the FDW.  So for the 99% of all
 FDW which are only interested in the rowid, things would
 get much simpler and they don't all have to implement the
 same code.

 All we have to do at get_relation_info() to deal with pseudo-
 columns (including alternatives of complex calculation, not
 only rowid) is just expansion of rel-max_attr.
 So, if FDW is not interested in something except for rowid,
 it can just inform the caller Yeah, we need just one slot for
 a pseudo-column of rowid. Otherwise, it can return another
 value to acquire the slot for arbitrary pseudo-column.
 I don't think it is a problematic design.
 
 However, I'm skeptical 99% of FDWs don't support target-list
 push-down. At least, it was very desired feature when I had
 a talk at PGconf.EU last month. :-)

I agree that PostgreSQL should make this technique possible.

My idea should not make this any more difficult.

 So, if we rename it to GetForeignRelWidth, is it defined as
 follows?
 
 extern AttrNumber
 GetForeignRelWidth(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
   bool inhparent,
   List *targetList);
 
 Right now, inhparent makes no sense because foreign table
 does not support table inheritance, but it seems to me we
 shall have this functionality near future.

I am thinking of this declaration:

extern bool
GetForeignRelWidth(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
   bool inhparent,
   List *targetList,
   AttrNumber rowidAttr);

Let me illustrate my idea with some code.

Here's your fileGetForeignRelInfo:


static void
fileGetForeignRelInfo(PlannerInfo *root,
  RelOptInfo *baserel,
  Oid foreigntableid,
  bool inhparent, List *targetList)
{
FileFdwPlanState *fdw_private;
ListCell   *cell;

fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));

foreach(cell, targetList)
{
TargetEntry *tle = lfirst(cell);

if (tle-resjunk 
tle-resname  strcmp(tle-resname, rowid)==0)
{
Bitmapset  *temp = NULL;
AttrNumber  anum_rowid;
DefElem*defel;

pull_varattnos((Node *)tle, baserel-relid, temp);
anum_rowid = bms_singleton_member(temp)
+ FirstLowInvalidHeapAttributeNumber;
/* adjust attr_needed of baserel */
if (anum_rowid  baserel-max_attr)
baserel-max_attr = anum_rowid;
defel = makeDefElem(anum_rowid,
(Node *)makeInteger(anum_rowid));
fdw_private-anum_rowid = defel;
}
}
baserel-fdw_private = fdw_private;
}


I hope that this can be reduced to:


static bool
fileGetForeignRelRowid(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
   bool inhparent,
   List *targetList,
   AttrNumber rowidAttr)
{
FileFdwPlanState *fdw_private;
fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));

defel = makeDefElem(anum_rowid,
(Node *)makeInteger(rowidAttr));
fdw_private-anum_rowid = defel;

baserel-fdw_private = fdw_private;

return true;  /* we'll use rowid, so please extend baserel-max_attr
*/
}


That wouldn't mean that the FDW cannot define any other
pseudo-columns in this function, just the case of rowid
would be 

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-20 Thread Kohei KaiGai
2012/11/20 Albe Laurenz laurenz.a...@wien.gv.at:
 Kohei KaiGai wrote:
 This design tries to kill two-birds with one-stone.
 It enables to add multiple number of pseudo-columns,
 not only rowid, and makes possible to push-down
 complex calculation of target list into external computing
 resource.

 For example, when user gives the following query:

   SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

 it contains a complex calculation in the target-list,
 thus, it also takes CPU cycles of local process.

 If we can replace the ((c1 - c2) * (c2 - c3))^2 by
 a reference to a pseudo-column that also references
 the calculation result on external node, it effectively
 off-load CPU cycles.

 In this case, all we need to do is (1) acquire a slot
 for pseudo-column at GetForeignRelInfo (2) replace
 TargetEntry::expr by Var node that reference this
 pseudo-column.

 It makes sense for performance optimization, so I don't
 want to restrict this handler for rowid only.

 I understand.

 But I think that you still can do that with the change that
 I suggest.  I suggest that GetForeignRelInfo (or whatever the
 name ends up being) gets the AttrNumber of the proposed rowid
 column in addition to the parameters you need for what
 you want to do.

 Then nothing would keep you from defining those
 pseudo-columns.  But all the setup necessary for the rowid
 column could be moved out of the FDW.  So for the 99% of all
 FDW which are only interested in the rowid, things would
 get much simpler and they don't all have to implement the
 same code.

 All we have to do at get_relation_info() to deal with pseudo-
 columns (including alternatives of complex calculation, not
 only rowid) is just expansion of rel-max_attr.
 So, if FDW is not interested in something except for rowid,
 it can just inform the caller Yeah, we need just one slot for
 a pseudo-column of rowid. Otherwise, it can return another
 value to acquire the slot for arbitrary pseudo-column.
 I don't think it is a problematic design.

 However, I'm skeptical 99% of FDWs don't support target-list
 push-down. At least, it was very desired feature when I had
 a talk at PGconf.EU last month. :-)

 I agree that PostgreSQL should make this technique possible.

 My idea should not make this any more difficult.

 So, if we rename it to GetForeignRelWidth, is it defined as
 follows?

 extern AttrNumber
 GetForeignRelWidth(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
   bool inhparent,
   List *targetList);

 Right now, inhparent makes no sense because foreign table
 does not support table inheritance, but it seems to me we
 shall have this functionality near future.

 I am thinking of this declaration:

 extern bool
 GetForeignRelWidth(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr);

 Let me illustrate my idea with some code.

 Here's your fileGetForeignRelInfo:


 static void
 fileGetForeignRelInfo(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
   bool inhparent, List *targetList)
 {
 FileFdwPlanState *fdw_private;
 ListCell   *cell;

 fdw_private = (FileFdwPlanState *)
 palloc0(sizeof(FileFdwPlanState));

 foreach(cell, targetList)
 {
 TargetEntry *tle = lfirst(cell);

 if (tle-resjunk 
 tle-resname  strcmp(tle-resname, rowid)==0)
 {
 Bitmapset  *temp = NULL;
 AttrNumber  anum_rowid;
 DefElem*defel;

 pull_varattnos((Node *)tle, baserel-relid, temp);
 anum_rowid = bms_singleton_member(temp)
 + FirstLowInvalidHeapAttributeNumber;
 /* adjust attr_needed of baserel */
 if (anum_rowid  baserel-max_attr)
 baserel-max_attr = anum_rowid;
 defel = makeDefElem(anum_rowid,
 (Node *)makeInteger(anum_rowid));
 fdw_private-anum_rowid = defel;
 }
 }
 baserel-fdw_private = fdw_private;
 }


 I hope that this can be reduced to:


 static bool
 fileGetForeignRelRowid(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr)
 {
 FileFdwPlanState *fdw_private;
 fdw_private = (FileFdwPlanState *)
 palloc0(sizeof(FileFdwPlanState));

 defel = makeDefElem(anum_rowid,
 (Node *)makeInteger(rowidAttr));
 fdw_private-anum_rowid = defel;

 baserel-fdw_private = fdw_private;

 return true;  /* we'll use rowid, so please extend baserel-max_attr
 */
 }


 That 

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-20 Thread Albe Laurenz
Kohei KaiGai wrote:
 Probably, it is helpful to provide a helper function that fetches an
attribute-
 number of pseudo rowid column from the supplied targetlist.
 If we have GetPseudoRowidColumn() at foreign/foreign.c, the avove
 routine can be rewritten as:
 
 static AttrNumber
 fileGetForeignRelWidth(PlannerInfo *root,
   RelOptInfo *baserel,
   Relation foreignrel,
   bool inhparent, List
*targetList)
 {
 FileFdwPlanState *fdw_private;
 AttrNumbernattrs = RelationGetNumberOfAttributes(foreignrel);
 AttrNumberanum_rowid;
 
 fdw_private = palloc0(sizeof(FileFdwPlanState));
 anum_rowid = GetPseudoRowidColumn(..., targetList);
 if (anum_rowid  0)
 {
 Assert(anum_rowid  nattrs);
 fdw_private-anum_rowid
= makeDefElem(anum_rowid, (Node
*)makeInteger(anum_rowid));
 nattrs = anum_rowid;
 }
 baserel-fdw_private = fdw_private;
 
 return nattrs;
 }
 
 In case when FDW drive wants to push-down other target entry into
foreign-
 side, thus, it needs multiple pseudo-columns, it is decision of the
extension.
 In addition, it does not take API change in the future, if some more
additional
 pseudo-column is required by some other new features.
 
 How about your opinion?

I think that this is better than what I suggested.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-19 Thread Albe Laurenz
Kohei KaiGai wrote:
 I am not so happy with GetForeignRelInfo:
 - The name seems ill-chosen from the FDW API side.
   I guess that you chose the name because the function
   is called from get_relation_info, but I think the name
   should be more descriptive for the FDW API.
   Something like PlanForeignRelRowid.

 Indeed, GetForeignRelInfo might give misleading impression
 as if this routine collects widespread information about
 target relation. So, how about GetForeignRelWidth() instead?

That would be better for the function as it is now.

 - I guess that every FDW that only needs rowid will
   do exactly the same as your fileGetForeignRelInfo.
   Why can't that be done in core?
   The function could pass an AttrNumber for the rowid
   to the FDW, and will receive a boolean return code
   depending on whether the FDW plans to use rowid or not.
   That would be more convenient for FDW authors.

 This design tries to kill two-birds with one-stone.
 It enables to add multiple number of pseudo-columns,
 not only rowid, and makes possible to push-down
 complex calculation of target list into external computing
 resource.
 
 For example, when user gives the following query:
 
   SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
 
 it contains a complex calculation in the target-list,
 thus, it also takes CPU cycles of local process.
 
 If we can replace the ((c1 - c2) * (c2 - c3))^2 by
 a reference to a pseudo-column that also references
 the calculation result on external node, it effectively
 off-load CPU cycles.
 
 In this case, all we need to do is (1) acquire a slot
 for pseudo-column at GetForeignRelInfo (2) replace
 TargetEntry::expr by Var node that reference this
 pseudo-column.
 
 It makes sense for performance optimization, so I don't
 want to restrict this handler for rowid only.

I understand.

But I think that you still can do that with the change that
I suggest.  I suggest that GetForeignRelInfo (or whatever the
name ends up being) gets the AttrNumber of the proposed rowid
column in addition to the parameters you need for what
you want to do.

Then nothing would keep you from defining those
pseudo-columns.  But all the setup necessary for the rowid
column could be moved out of the FDW.  So for the 99% of all
FDW which are only interested in the rowid, things would
get much simpler and they don't all have to implement the
same code.

Did I make clear what I mean?
Would that be difficult?

 - I guess the order is dictated by planner steps, but
   it would be nice to have if GetForeignRelInfo were
   not the first function to be called during planning.
   That would make it easier for a FDW to support both
   9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
   will probably have to be created in the first API
   function.

 The baserel-fdw_private should be initialized to NULL,
 so it can perform as a mark whether private data is already
 constructed, or not.

Right, if that pointer is pre-initialized to NULL, that
should work.  Forget my quibble.

 In addition, I noticed my patch didn't update documentation stuff.
 I also add mention about new handlers.

I didn't get into documentation, comment and spelling issues since
the patch was still called POC, but yes, eventually that would
be necessary.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-19 Thread Kohei KaiGai
2012/11/19 Albe Laurenz laurenz.a...@wien.gv.at:
 Kohei KaiGai wrote:
 I am not so happy with GetForeignRelInfo:
 - The name seems ill-chosen from the FDW API side.
   I guess that you chose the name because the function
   is called from get_relation_info, but I think the name
   should be more descriptive for the FDW API.
   Something like PlanForeignRelRowid.

 Indeed, GetForeignRelInfo might give misleading impression
 as if this routine collects widespread information about
 target relation. So, how about GetForeignRelWidth() instead?

 That would be better for the function as it is now.

 - I guess that every FDW that only needs rowid will
   do exactly the same as your fileGetForeignRelInfo.
   Why can't that be done in core?
   The function could pass an AttrNumber for the rowid
   to the FDW, and will receive a boolean return code
   depending on whether the FDW plans to use rowid or not.
   That would be more convenient for FDW authors.

 This design tries to kill two-birds with one-stone.
 It enables to add multiple number of pseudo-columns,
 not only rowid, and makes possible to push-down
 complex calculation of target list into external computing
 resource.

 For example, when user gives the following query:

   SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

 it contains a complex calculation in the target-list,
 thus, it also takes CPU cycles of local process.

 If we can replace the ((c1 - c2) * (c2 - c3))^2 by
 a reference to a pseudo-column that also references
 the calculation result on external node, it effectively
 off-load CPU cycles.

 In this case, all we need to do is (1) acquire a slot
 for pseudo-column at GetForeignRelInfo (2) replace
 TargetEntry::expr by Var node that reference this
 pseudo-column.

 It makes sense for performance optimization, so I don't
 want to restrict this handler for rowid only.

 I understand.

 But I think that you still can do that with the change that
 I suggest.  I suggest that GetForeignRelInfo (or whatever the
 name ends up being) gets the AttrNumber of the proposed rowid
 column in addition to the parameters you need for what
 you want to do.

 Then nothing would keep you from defining those
 pseudo-columns.  But all the setup necessary for the rowid
 column could be moved out of the FDW.  So for the 99% of all
 FDW which are only interested in the rowid, things would
 get much simpler and they don't all have to implement the
 same code.

 Did I make clear what I mean?
 Would that be difficult?

All we have to do at get_relation_info() to deal with pseudo-
columns (including alternatives of complex calculation, not
only rowid) is just expansion of rel-max_attr.
So, if FDW is not interested in something except for rowid,
it can just inform the caller Yeah, we need just one slot for
a pseudo-column of rowid. Otherwise, it can return another
value to acquire the slot for arbitrary pseudo-column.
I don't think it is a problematic design.

However, I'm skeptical 99% of FDWs don't support target-list
push-down. At least, it was very desired feature when I had
a talk at PGconf.EU last month. :-)

So, if we rename it to GetForeignRelWidth, is it defined as
follows?

extern AttrNumber
GetForeignRelWidth(PlannerInfo *root,
  RelOptInfo *baserel,
  Oid foreigntableid,
  bool inhparent,
  List *targetList);

Right now, inhparent makes no sense because foreign table
does not support table inheritance, but it seems to me we
shall have this functionality near future.

 - I guess the order is dictated by planner steps, but
   it would be nice to have if GetForeignRelInfo were
   not the first function to be called during planning.
   That would make it easier for a FDW to support both
   9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
   will probably have to be created in the first API
   function.

 The baserel-fdw_private should be initialized to NULL,
 so it can perform as a mark whether private data is already
 constructed, or not.

 Right, if that pointer is pre-initialized to NULL, that
 should work.  Forget my quibble.

 In addition, I noticed my patch didn't update documentation stuff.
 I also add mention about new handlers.

 I didn't get into documentation, comment and spelling issues since
 the patch was still called POC, but yes, eventually that would
 be necessary.

 Yours,
 Laurenz Albe

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-16 Thread Albe Laurenz
Kohei KaiGai wrote:
 The attached patch is just a refreshed version for clean applying to
 the latest tree.
 
 As previous version doing, it makes pseudo enhancement on file_fdw
 to print something about the supplied tuple on INSERT, UPDATE and
 DELETE statement.

Basics:
---

The patch applies cleanly, compiles without warnings and passes
regression tests.

I think that the functionality is highly desirable; judging from
the number of talks at pgConf EU about SQL/MED this is a hot
topic, and further development is welcome.

Testing the functionality:
--

I ran a few queries with the file_fdw and found this:

$ cat flatfile
1,Laurenz,1968-10-20
2,Renée,1975-09-03
3,Caroline,2009-01-26
4,Ray,2011-03-09
5,Stephan,2011-03-09

CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE flat(
   id integer not null,
   name varchar(20) not null,
   birthday date not null
) SERVER file
  OPTIONS (filename 'flatfile', format 'csv');

UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
ERROR:  bitmapset is empty

About the code:
---

I am not so happy with GetForeignRelInfo:
- The name seems ill-chosen from the FDW API side.
  I guess that you chose the name because the function
  is called from get_relation_info, but I think the name
  should be more descriptive for the FDW API.
  Something like PlanForeignRelRowid.

- I guess that every FDW that only needs rowid will
  do exactly the same as your fileGetForeignRelInfo.
  Why can't that be done in core?
  The function could pass an AttrNumber for the rowid
  to the FDW, and will receive a boolean return code
  depending on whether the FDW plans to use rowid or not.
  That would be more convenient for FDW authors.

- I guess the order is dictated by planner steps, but
  it would be nice to have if GetForeignRelInfo were
  not the first function to be called during planning.
  That would make it easier for a FDW to support both
  9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
  will probably have to be created in the first API
  function.

I also wonder why you changed the signature of functions in
trigger.c.  It changes an exposed API unnecessarily, unless
you plan to have trigger support for foreign tables.
That is currently not supported, and I think that such
changes should be part of the present patch.

Other than that, I like the patch.
I cannot give an educated opinion if the changes to planner
and executor are well done or not.

Other comments:
---

I hope I find time to implement the API in oracle_fdw to
be able to test it more thoroughly.

There is an interdependence with the FDW for PostgreSQL patch
in the commitfest.  If both these patches get committed, the
FDW should be extended to support the new API.  If nothing else,
this would greatly improve the ability to test the new API
and find out if it is well designed.

 Here is one other idea. My GPU acceleration module (PG-Strom)
 implements column-oriented data store underlying foreign table.
 It might make sense to cut out this portion for proof-of-concept of
 writable foreign tables.
 
 Any ideas?

The best would be to have a patch on top of the PostgreSQL FDW
to be able to test.  It need not be perfect.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-16 Thread Atri Sharma
Awesome.

I would love to implement this API in JDBC_FDW.

Atri

Sent from my iPad

On 16-Nov-2012, at 20:20, Albe Laurenz laurenz.a...@wien.gv.at wrote:

 Kohei KaiGai wrote:
 The attached patch is just a refreshed version for clean applying to
 the latest tree.
 
 As previous version doing, it makes pseudo enhancement on file_fdw
 to print something about the supplied tuple on INSERT, UPDATE and
 DELETE statement.
 
 Basics:
 ---
 
 The patch applies cleanly, compiles without warnings and passes
 regression tests.
 
 I think that the functionality is highly desirable; judging from
 the number of talks at pgConf EU about SQL/MED this is a hot
 topic, and further development is welcome.
 
 Testing the functionality:
 --
 
 I ran a few queries with the file_fdw and found this:
 
 $ cat flatfile
 1,Laurenz,1968-10-20
 2,Renée,1975-09-03
 3,Caroline,2009-01-26
 4,Ray,2011-03-09
 5,Stephan,2011-03-09
 
 CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;
 
 CREATE FOREIGN TABLE flat(
   id integer not null,
   name varchar(20) not null,
   birthday date not null
 ) SERVER file
  OPTIONS (filename 'flatfile', format 'csv');
 
 UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
 ERROR:  bitmapset is empty
 
 About the code:
 ---
 
 I am not so happy with GetForeignRelInfo:
 - The name seems ill-chosen from the FDW API side.
  I guess that you chose the name because the function
  is called from get_relation_info, but I think the name
  should be more descriptive for the FDW API.
  Something like PlanForeignRelRowid.
 
 - I guess that every FDW that only needs rowid will
  do exactly the same as your fileGetForeignRelInfo.
  Why can't that be done in core?
  The function could pass an AttrNumber for the rowid
  to the FDW, and will receive a boolean return code
  depending on whether the FDW plans to use rowid or not.
  That would be more convenient for FDW authors.
 
 - I guess the order is dictated by planner steps, but
  it would be nice to have if GetForeignRelInfo were
  not the first function to be called during planning.
  That would make it easier for a FDW to support both
  9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
  will probably have to be created in the first API
  function.
 
 I also wonder why you changed the signature of functions in
 trigger.c.  It changes an exposed API unnecessarily, unless
 you plan to have trigger support for foreign tables.
 That is currently not supported, and I think that such
 changes should be part of the present patch.
 
 Other than that, I like the patch.
 I cannot give an educated opinion if the changes to planner
 and executor are well done or not.
 
 Other comments:
 ---
 
 I hope I find time to implement the API in oracle_fdw to
 be able to test it more thoroughly.
 
 There is an interdependence with the FDW for PostgreSQL patch
 in the commitfest.  If both these patches get committed, the
 FDW should be extended to support the new API.  If nothing else,
 this would greatly improve the ability to test the new API
 and find out if it is well designed.
 
 Here is one other idea. My GPU acceleration module (PG-Strom)
 implements column-oriented data store underlying foreign table.
 It might make sense to cut out this portion for proof-of-concept of
 writable foreign tables.
 
 Any ideas?
 
 The best would be to have a patch on top of the PostgreSQL FDW
 to be able to test.  It need not be perfect.
 
 Yours,
 Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-16 Thread Kohei KaiGai
2012/11/16 Albe Laurenz laurenz.a...@wien.gv.at:
 Kohei KaiGai wrote:
 The attached patch is just a refreshed version for clean applying to
 the latest tree.

 As previous version doing, it makes pseudo enhancement on file_fdw
 to print something about the supplied tuple on INSERT, UPDATE and
 DELETE statement.

 Basics:
 ---

 The patch applies cleanly, compiles without warnings and passes
 regression tests.

 I think that the functionality is highly desirable; judging from
 the number of talks at pgConf EU about SQL/MED this is a hot
 topic, and further development is welcome.

 Testing the functionality:
 --

 I ran a few queries with the file_fdw and found this:

 $ cat flatfile
 1,Laurenz,1968-10-20
 2,Renée,1975-09-03
 3,Caroline,2009-01-26
 4,Ray,2011-03-09
 5,Stephan,2011-03-09

 CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;

 CREATE FOREIGN TABLE flat(
id integer not null,
name varchar(20) not null,
birthday date not null
 ) SERVER file
   OPTIONS (filename 'flatfile', format 'csv');

 UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
 ERROR:  bitmapset is empty

Hmm... I'll try to investigate the behavior.

 About the code:
 ---

 I am not so happy with GetForeignRelInfo:
 - The name seems ill-chosen from the FDW API side.
   I guess that you chose the name because the function
   is called from get_relation_info, but I think the name
   should be more descriptive for the FDW API.
   Something like PlanForeignRelRowid.

Indeed, GetForeignRelInfo might give misleading impression
as if this routine collects widespread information about
target relation. So, how about GetForeignRelWidth() instead?

 - I guess that every FDW that only needs rowid will
   do exactly the same as your fileGetForeignRelInfo.
   Why can't that be done in core?
   The function could pass an AttrNumber for the rowid
   to the FDW, and will receive a boolean return code
   depending on whether the FDW plans to use rowid or not.
   That would be more convenient for FDW authors.

This design tries to kill two-birds with one-stone.
It enables to add multiple number of pseudo-columns,
not only rowid, and makes possible to push-down
complex calculation of target list into external computing
resource.

For example, when user gives the following query:

  SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

it contains a complex calculation in the target-list,
thus, it also takes CPU cycles of local process.

If we can replace the ((c1 - c2) * (c2 - c3))^2 by
a reference to a pseudo-column that also references
the calculation result on external node, it effectively
off-load CPU cycles.

In this case, all we need to do is (1) acquire a slot
for pseudo-column at GetForeignRelInfo (2) replace
TargetEntry::expr by Var node that reference this
pseudo-column.

It makes sense for performance optimization, so I don't
want to restrict this handler for rowid only.

 - I guess the order is dictated by planner steps, but
   it would be nice to have if GetForeignRelInfo were
   not the first function to be called during planning.
   That would make it easier for a FDW to support both
   9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
   will probably have to be created in the first API
   function.

The baserel-fdw_private should be initialized to NULL,
so it can perform as a mark whether private data is already
constructed, or not.

 I also wonder why you changed the signature of functions in
 trigger.c.  It changes an exposed API unnecessarily, unless
 you plan to have trigger support for foreign tables.
 That is currently not supported, and I think that such
 changes should be part of the present patch.

You are right. It might be unneeded change right now.
I'll revert it.

 Other than that, I like the patch.
 I cannot give an educated opinion if the changes to planner
 and executor are well done or not.

 Other comments:
 ---

 I hope I find time to implement the API in oracle_fdw to
 be able to test it more thoroughly.

 There is an interdependence with the FDW for PostgreSQL patch
 in the commitfest.  If both these patches get committed, the
 FDW should be extended to support the new API.  If nothing else,
 this would greatly improve the ability to test the new API
 and find out if it is well designed.

It is the reason why I'd like to volunteer to review Hanada-san's patch.
I'll spent my time for reviewing his patch on this weekend.

 Here is one other idea. My GPU acceleration module (PG-Strom)
 implements column-oriented data store underlying foreign table.
 It might make sense to cut out this portion for proof-of-concept of
 writable foreign tables.

 Any ideas?

 The best would be to have a patch on top of the PostgreSQL FDW
 to be able to test.  It need not be perfect.

OK,

In addition, I noticed my patch didn't update documentation stuff.
I also add mention about new handlers.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent 

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-08 Thread Albe Laurenz
Alexander Korotkov wrote:
 2) You wrote that FDW can support or don't support write depending on
having corresponding functions.
 However it's likely some tables of same FDW could be writable while
another are not. I think we should
 have some mechanism for FDW telling whether particular table is
writable.

I think that this would best be handled by a table option,
if necessary.
That allows maximum flexibility for the design of the FDW.
In many cases it might be enough if the foreign data source
raises an error on a write request.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-08 Thread Atri Sharma


On 08-Nov-2012, at 13:35, Albe Laurenz laurenz.a...@wien.gv.at wrote:

 Alexander Korotkov wrote:
 2) You wrote that FDW can support or don't support write depending on
 having corresponding functions.
 However it's likely some tables of same FDW could be writable while
 another are not. I think we should
 have some mechanism for FDW telling whether particular table is
 writable.
 
 I think that this would best be handled by a table option,
 if necessary.
 That allows maximum flexibility for the design of the FDW.
 In many cases it might be enough if the foreign data source
 raises an error on a write request.
 
 Yours,
 Laurenz Albe
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

+1

I agree, we should have a system where if the foreign data source raises an 
error on write, FDW can raise corresponding error on PostgreSQL side.exposing 
this as a table option is IMHO a bit risky, and the user may not know whether 
the foreign data source will accept writes or not.

Atri

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-11-04 Thread Kohei KaiGai
2012/11/2 Alexander Korotkov aekorot...@gmail.com:
 On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 2012/9/23 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/29 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/28 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/28 Tom Lane t...@sss.pgh.pa.us:
  Kohei KaiGai kai...@kaigai.gr.jp writes:
  Would it be too invasive to introduce a new pointer in
  TupleTableSlot
  that is NULL for anything but virtual tuples from foreign tables?
 
  I'm not certain whether the duration of TupleTableSlot is enough to
  carry a private datum between scan and modify stage.
 
  It's not.
 
  Is it possible to utilize ctid field to move a private pointer?
 
  UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
  the
  TID from scan to modify --- in fact, most of the time what the modify
  step is going to get is a virtual TupleTableSlot that hasn't even
  *got* a physical CTID field.
 
  Instead, the planner arranges for the TID to be carried up as an
  explicit resjunk column named ctid.  (Currently this is done in
  rewriteTargetListUD(), but see also preptlist.c which does some
  related
  things for SELECT FOR UPDATE.)
 
  I'm inclined to think that what we need here is for FDWs to be able
  to
  modify the details of that behavior, at least to the extent of being
  able to specify a different data type than TID for the row
  identification column.
 
  Hmm. It seems to me a straight-forward solution rather than ab-use
  of ctid system column. Probably, cstring data type is more suitable
  to carry a private datum between scan and modify stage.
 
  One problem I noticed is how FDW driver returns an extra field that
  is in neither system nor regular column.
  Number of columns and its data type are defined with TupleDesc of
  the target foreign-table, so we also need a feature to extend it on
  run-time. For example, FDW driver may have to be able to extend
  a virtual column with cstring data type, even though the target
  foreign table does not have such a column.
 
  I tried to investigate the related routines.
 
  TupleDesc of TupleTableSlot associated with ForeignScanState
  is initialized at ExecInitForeignScan as literal.
  ExecAssignScanType assigns TupleDesc of the target foreign-
  table on tts_tupleDescriptor, as-is.
  It is the reason why IterateForeignScan cannot return a private
  datum except for the columns being declared as regular ones.
 
  The attached patch improved its design according to the upthread
  discussion. It now got away from ab-use of ctid field, and adopts
  a concept of pseudo-column to hold row-id with opaque data type
  instead.
 
  Pseudo-column is Var reference towards attribute-number larger
  than number of attributes on the target relation; thus, it is not
  a substantial object. It is normally unavailable to reference such
  a larger attribute number because TupleDesc of each ScanState
  associated with a particular relation is initialized at ExecInitNode.
 
  The patched ExecInitForeignScan was extended to generate its
  own TupleDesc including pseudo-column definitions on the fly,
  instead of relation's one, when scan-plan of foreign-table requires
  to have pseudo-columns.
 
  Right now, the only possible pseudo-column is rowid being
  injected at rewriteTargetListUD(). It has no data format
  restriction like ctid because of VOID data type.
  FDW extension can set an appropriate value on the rowid
  field in addition to contents of regular columns at
  IterateForeignScan method, to track which remote row should
  be updated or deleted.
 
  Another possible usage of this pseudo-column is push-down
  of target-list including complex calculation. It may enable to
  move complex mathematical formula into remote devices
  (such as GPU device?) instead of just a reference of Var node.
 
  This patch adds a new interface: GetForeignRelInfo being invoked
  from get_relation_info() to adjust width of RelOptInfo-attr_needed
  according to the target-list which may contain rowid pseudo-column.
  Some FDW extension may use this interface to push-down a part of
  target list into remote side, even though I didn't implement this
  feature on file_fdw.
 
  RelOptInfo-max_attr is a good marker whether the plan shall have
  pseudo-column reference. Then, ExecInitForeignScan determines
  whether it should generate a TupleDesc, or not.
 
  The rowid is fetched using ExecGetJunkAttribute as we are currently
  doing on regular tables using ctid, then it shall be delivered to
  ExecUpdate or ExecDelete. We can never expect the fist argument of
  them now, so ItemPointer tupleid redefined to Datum rowid, and
  argument of BR-trigger routines redefined also.
 
  [kaigai@iwashi sepgsql]$ cat ~/testfile.csv
  10  aaa
  11  bbb
  12  ccc
  13  ddd
  14  eee
  15  fff
  [kaigai@iwashi sepgsql]$ psql postgres
  psql (9.3devel)
  Type help for help.
 
  postgres=# UPDATE ftbl SET b = md5(b) WHERE a  12 

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-02 Thread Alexander Korotkov
On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:

 2012/9/23 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/29 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/28 Kohei KaiGai kai...@kaigai.gr.jp:
  2012/8/28 Tom Lane t...@sss.pgh.pa.us:
  Kohei KaiGai kai...@kaigai.gr.jp writes:
  Would it be too invasive to introduce a new pointer in
 TupleTableSlot
  that is NULL for anything but virtual tuples from foreign tables?
 
  I'm not certain whether the duration of TupleTableSlot is enough to
  carry a private datum between scan and modify stage.
 
  It's not.
 
  Is it possible to utilize ctid field to move a private pointer?
 
  UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
 the
  TID from scan to modify --- in fact, most of the time what the modify
  step is going to get is a virtual TupleTableSlot that hasn't even
  *got* a physical CTID field.
 
  Instead, the planner arranges for the TID to be carried up as an
  explicit resjunk column named ctid.  (Currently this is done in
  rewriteTargetListUD(), but see also preptlist.c which does some
 related
  things for SELECT FOR UPDATE.)
 
  I'm inclined to think that what we need here is for FDWs to be able to
  modify the details of that behavior, at least to the extent of being
  able to specify a different data type than TID for the row
  identification column.
 
  Hmm. It seems to me a straight-forward solution rather than ab-use
  of ctid system column. Probably, cstring data type is more suitable
  to carry a private datum between scan and modify stage.
 
  One problem I noticed is how FDW driver returns an extra field that
  is in neither system nor regular column.
  Number of columns and its data type are defined with TupleDesc of
  the target foreign-table, so we also need a feature to extend it on
  run-time. For example, FDW driver may have to be able to extend
  a virtual column with cstring data type, even though the target
  foreign table does not have such a column.
 
  I tried to investigate the related routines.
 
  TupleDesc of TupleTableSlot associated with ForeignScanState
  is initialized at ExecInitForeignScan as literal.
  ExecAssignScanType assigns TupleDesc of the target foreign-
  table on tts_tupleDescriptor, as-is.
  It is the reason why IterateForeignScan cannot return a private
  datum except for the columns being declared as regular ones.
 
  The attached patch improved its design according to the upthread
  discussion. It now got away from ab-use of ctid field, and adopts
  a concept of pseudo-column to hold row-id with opaque data type
  instead.
 
  Pseudo-column is Var reference towards attribute-number larger
  than number of attributes on the target relation; thus, it is not
  a substantial object. It is normally unavailable to reference such
  a larger attribute number because TupleDesc of each ScanState
  associated with a particular relation is initialized at ExecInitNode.
 
  The patched ExecInitForeignScan was extended to generate its
  own TupleDesc including pseudo-column definitions on the fly,
  instead of relation's one, when scan-plan of foreign-table requires
  to have pseudo-columns.
 
  Right now, the only possible pseudo-column is rowid being
  injected at rewriteTargetListUD(). It has no data format
  restriction like ctid because of VOID data type.
  FDW extension can set an appropriate value on the rowid
  field in addition to contents of regular columns at
  IterateForeignScan method, to track which remote row should
  be updated or deleted.
 
  Another possible usage of this pseudo-column is push-down
  of target-list including complex calculation. It may enable to
  move complex mathematical formula into remote devices
  (such as GPU device?) instead of just a reference of Var node.
 
  This patch adds a new interface: GetForeignRelInfo being invoked
  from get_relation_info() to adjust width of RelOptInfo-attr_needed
  according to the target-list which may contain rowid pseudo-column.
  Some FDW extension may use this interface to push-down a part of
  target list into remote side, even though I didn't implement this
  feature on file_fdw.
 
  RelOptInfo-max_attr is a good marker whether the plan shall have
  pseudo-column reference. Then, ExecInitForeignScan determines
  whether it should generate a TupleDesc, or not.
 
  The rowid is fetched using ExecGetJunkAttribute as we are currently
  doing on regular tables using ctid, then it shall be delivered to
  ExecUpdate or ExecDelete. We can never expect the fist argument of
  them now, so ItemPointer tupleid redefined to Datum rowid, and
  argument of BR-trigger routines redefined also.
 
  [kaigai@iwashi sepgsql]$ cat ~/testfile.csv
  10  aaa
  11  bbb
  12  ccc
  13  ddd
  14  eee
  15  fff
  [kaigai@iwashi sepgsql]$ psql postgres
  psql (9.3devel)
  Type help for help.
 
  postgres=# UPDATE ftbl SET b = md5(b) WHERE a  12 RETURNING *;
  INFO:  ftbl is the target relation of UPDATE
  

Re: [HACKERS] [v9.3] writable foreign tables

2012-09-13 Thread Albe Laurenz
Tom Lane wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Laurenz Albe wrote:
 Would it be too invasive to introduce a new pointer in
TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 It's not.

 Is it possible to utilize ctid field to move a private pointer?

 UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
the
 TID from scan to modify --- in fact, most of the time what the modify
 step is going to get is a virtual TupleTableSlot that hasn't even
 *got* a physical CTID field.
 
 Instead, the planner arranges for the TID to be carried up as an
 explicit resjunk column named ctid.  (Currently this is done in
 rewriteTargetListUD(), but see also preptlist.c which does some
related
 things for SELECT FOR UPDATE.)
 
 I'm inclined to think that what we need here is for FDWs to be able to
 modify the details of that behavior, at least to the extent of being
 able to specify a different data type than TID for the row
 identification column.

Would that imply inventing a new system attribute for
foreign tid?

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-09-13 Thread Kohei KaiGai
2012/9/13 Albe Laurenz laurenz.a...@wien.gv.at:
 Tom Lane wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Laurenz Albe wrote:
 Would it be too invasive to introduce a new pointer in
 TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 It's not.

 Is it possible to utilize ctid field to move a private pointer?

 UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
 the
 TID from scan to modify --- in fact, most of the time what the modify
 step is going to get is a virtual TupleTableSlot that hasn't even
 *got* a physical CTID field.

 Instead, the planner arranges for the TID to be carried up as an
 explicit resjunk column named ctid.  (Currently this is done in
 rewriteTargetListUD(), but see also preptlist.c which does some
 related
 things for SELECT FOR UPDATE.)

 I'm inclined to think that what we need here is for FDWs to be able to
 modify the details of that behavior, at least to the extent of being
 able to specify a different data type than TID for the row
 identification column.

 Would that imply inventing a new system attribute for
 foreign tid?

It is an idea to implement this feature with minimum code side.

However, my preference is to support pseudo-column approach
rather than system columns, because it also can be utilized for
another interesting feature that enables to push-down target entry
onto remote side.
So, I'd like to try to support a feature that allows foreign-table to
return pseudo-column in addition to its table definition to move
row-id of remote tuples, as primary purpose of this.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-09-13 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 Tom Lane wrote:
 Instead, the planner arranges for the TID to be carried up as an
 explicit resjunk column named ctid.  (Currently this is done in
 rewriteTargetListUD(), but see also preptlist.c which does some
 related things for SELECT FOR UPDATE.)
 
 I'm inclined to think that what we need here is for FDWs to be able to
 modify the details of that behavior, at least to the extent of being
 able to specify a different data type than TID for the row
 identification column.

 Would that imply inventing a new system attribute for
 foreign tid?

No, I think you missed the point of what I wrote completely.  The target
row ID is not treated as a system attribute during UPDATE/DELETE.  It's
an ordinary data column that's silently added to what the user wrote.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-29 Thread Kohei KaiGai
2012/8/28 Kohei KaiGai kai...@kaigai.gr.jp:
 2012/8/28 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Would it be too invasive to introduce a new pointer in TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 It's not.

 Is it possible to utilize ctid field to move a private pointer?

 UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
 TID from scan to modify --- in fact, most of the time what the modify
 step is going to get is a virtual TupleTableSlot that hasn't even
 *got* a physical CTID field.

 Instead, the planner arranges for the TID to be carried up as an
 explicit resjunk column named ctid.  (Currently this is done in
 rewriteTargetListUD(), but see also preptlist.c which does some related
 things for SELECT FOR UPDATE.)

 I'm inclined to think that what we need here is for FDWs to be able to
 modify the details of that behavior, at least to the extent of being
 able to specify a different data type than TID for the row
 identification column.

 Hmm. It seems to me a straight-forward solution rather than ab-use
 of ctid system column. Probably, cstring data type is more suitable
 to carry a private datum between scan and modify stage.

 One problem I noticed is how FDW driver returns an extra field that
 is in neither system nor regular column.
 Number of columns and its data type are defined with TupleDesc of
 the target foreign-table, so we also need a feature to extend it on
 run-time. For example, FDW driver may have to be able to extend
 a virtual column with cstring data type, even though the target
 foreign table does not have such a column.

I tried to investigate the related routines.

TupleDesc of TupleTableSlot associated with ForeignScanState
is initialized at ExecInitForeignScan as literal.
ExecAssignScanType assigns TupleDesc of the target foreign-
table on tts_tupleDescriptor, as-is.
It is the reason why IterateForeignScan cannot return a private
datum except for the columns being declared as regular ones.

Confrontation between ForeignScan and SubqueryScan tell us
the point to be extended. It assigns TupleDesc of the subplan
generated at run-time.
It seems to me ForeignScan will be able to adopt similar idea;
that allows to append pseudo-column onto TupleDesc to
carry identifier of remote-rows to be updated / deleted, if
FDW driver can control TupleDesc being set, instead of the
one come from relation's definition as-is.

Any comment please. Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-29 Thread Kohei KaiGai
2012/8/28 David Fetter da...@fetter.org:
 On Tue, Aug 28, 2012 at 06:08:59PM +0200, Kohei KaiGai wrote:
 2012/8/28 David Fetter da...@fetter.org:
  On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
  2012/8/28 David Fetter da...@fetter.org:
   On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
   Kohei KaiGai kai...@kaigai.gr.jp writes:
It seems to me TargetEntry of the parse tree can inform us
which column should be modified on UPDATE or INSERT. If it has
just a Var element that reference original table as-is, it
means here is no change.
  
   Only if you're not going to support BEFORE triggers modifying the
   row...
  
   +1 for supporting these.
 
  Generated identifiers and whole-row matching are two ways to approach
  this.  There are likely others, especially in cases where people have
  special knowledge of the remote source.
 
 One major problem is how to carry the generated identifiers on run-time,
 even though we have no slot except for system and regular columns
 defined in TupleDesc of the target foreign tables.
 It may need a feature to expand TupleDesc on demand.

 Could be.  You know a lot more about the implementation details than I do.

 Of course, I don't deny the benefit of trigger support on foreign-tables.
 Both writable-feature and trigger-support can be supported simultaneously.

 Do you see these as independent features, or is there some essential
 overlap?

If we stand on the viewpoint that foreign-tables should perform as if regular
tables, I don't think its writer feature should depend on trigger stuff.
They can work independently.

On the other hand, trigger feature gives users flexibility to control the data
to be written, as if regular tables. We shouldn't miss the point.
At least, I don't think we have some technical differences to support row-level
triggers on foreign tables.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/27 Albe Laurenz laurenz.a...@wien.gv.at:
 Kohei KaiGai wrote:
 2012/8/25 Robert Haas robertmh...@gmail.com:
 On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai kai...@kaigai.gr.jp
 wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

 It strikes me as incredibly short-sighted to decide that the row
 identifier has to have the same format as what our existing heap AM
 happens to have.  I think we need to allow the row identifier to be
 of
 any data type, and even compound.  For example, the foreign side
 might
 have no equivalent of CTID, and thus use primary key.  And the
 primary
 key might consist of an integer and a string, or some such.

 I assume it is a task of FDW extension to translate between the pseudo
 ctid and the primary key in remote side.

 For example, if primary key of the remote table is Text data type, an
 idea
 is to use a hash table to track the text-formed primary being
 associated
 with a particular 48-bits integer. The pseudo ctid shall be utilized
 to track
 the tuple to be modified on the scan-stage, then FDW can reference the
 hash table to pull-out the primary key to be provided on the prepared
 statement.

 And what if there is a hash collision?  Then you would not be able to
 determine which row is meant.

Even if we had a hash collision, each hash entry can have the original
key itself to be compared. But anyway, I love the idea to support
an opaque pointer to track particular remote-row rather.

 I agree with Robert that this should be flexible enough to cater for
 all kinds of row identifiers.  Oracle, for example, uses ten byte
 identifiers which would give me a headache with your suggested design.

 Do we have some other reasonable ideas?

 Would it be too invasive to introduce a new pointer in TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

I'm not certain whether the duration of TupleTableSlot is enough to
carry a private datum between scan and modify stage.
For example, the TupleTableSlot shall be cleared at ExecNestLoop
prior to the slot being delivered to ExecModifyTuple.

postgres=# EXPLAIN UPDATE t1 SET b = 'abcd' WHERE a IN (SELECT x FROM
t2 WHERE x % 2 = 0);
  QUERY PLAN
---
 Update on t1  (cost=0.00..54.13 rows=6 width=16)
   -  Nested Loop  (cost=0.00..54.13 rows=6 width=16)
 -  Seq Scan on t2  (cost=0.00..28.45 rows=6 width=10)
   Filter: ((x % 2) = 0)
 -  Index Scan using t1_pkey on t1  (cost=0.00..4.27 rows=1 width=10)
   Index Cond: (a = t2.x)
(6 rows)

Is it possible to utilize ctid field to move a private pointer?
TID data type is internally represented as a pointer to ItemPointerData,
so it has enough width to track an opaque formed remote-row identifier;
including string, int64 or others.

One disadvantage is ctid system column shows a nonsense value
when user explicitly references this system column. But it does not
seems to me a fundamental problem, because we didn't give any
special meaning on the ctid field of foreign table.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/27 Shigeru HANADA shigeru.han...@gmail.com:
 Kaigai-san,

 On Thu, Aug 23, 2012 at 2:10 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The patched portion at contrib/file_fdw.c does not make sense
 actually. It just prints messages for each invocation.
 It is just a proof-of-concept to show possibility of implementation
 based on real RDBMS.

 Attached is a tar ball of pgsql_fdw.  It's WIP and contains no
 document, but it would be enough for your PoC purpose.  Usage and
 features are same as the last version posted for 9.2 cycle.
 # I'll post finished patch in the CF-Sep.

Thanks, it is helpful to work on.

 Here are random comments for your PoC patch:

 + As Robert says, using CTID as virtual tuple identifier doesn't seem
 nice when considering various FDWs for NoSQL or RDBMS.  Having abstract
 layer between FDWs and tuple sounds better, but implementing it by each
 FDW seems useless effort.  Do yo have any idea of generic mechanism for
 tuple mapping?

As I wrote in the previous message, isn't it a reasonable idea to move
a private datum (instead of alternate key) on the ctid field which has
been internally represented as a pointer to indicate ItemPointerData?

 + Do you have any plan about deparsing local qualifiers into remote
 query to avoid repeated query submission?  This would improve
 performance of big UPDATE, but its use case might be limited to
 statements which consist of one foreign table.  For this case, we can
 consider pass-through mode as second way.

I think, FDW should run UPDATE or DELETE statement at the scan
stage on remote-side, then return a pseudo result to scanner, in case
of the statement is enough simple, like no qualifier, no returning, etc...
The callback on ExecUpdate/ExecDelete will perform just a stub; that
does not actually work except for increment of affected rows.

 + I have not read your patch closely yet, but I wonder how we can know
 which column is actually updated.  If we have only updated image of
 tuple, we have to update all remote columns by new values?

It seems to me TargetEntry of the parse tree can inform us which column
should be modified on UPDATE or INSERT. If it has just a Var element
that reference original table as-is, it means here is no change.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Albe Laurenz
Kohei KaiGai wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

 For example, if primary key of the remote table is Text data type,
 an idea is to use a hash table to track the text-formed primary
 being associated with a particular 48-bits integer.

 Even if we had a hash collision, each hash entry can have the original
 key itself to be compared. But anyway, I love the idea to support
 an opaque pointer to track particular remote-row rather.

Me too.

 Do we have some other reasonable ideas?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 Is it possible to utilize ctid field to move a private pointer?
 TID data type is internally represented as a pointer to
ItemPointerData,
 so it has enough width to track an opaque formed remote-row
identifier;
 including string, int64 or others.
 
 One disadvantage is ctid system column shows a nonsense value
 when user explicitly references this system column. But it does not
 seems to me a fundamental problem, because we didn't give any
 special meaning on the ctid field of foreign table.

I can't say if (ab)using the field that way would cause other
problems, but I don't think that nonsense values are a problem.
The pointer would stay the same for the duration of the foreign
scan, which I think is as good a ctid for a foreign table as
anybody should reasonably ask.

BTW, I see the following comment in htup.h:

 * t_self and t_tableOid should be valid if the HeapTupleData points to
 * a disk buffer, or if it represents a copy of a tuple on disk.  They
 * should be explicitly set invalid in manufactured tuples.

I don't know if invalid means zero in that case.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/28 Albe Laurenz laurenz.a...@wien.gv.at:
 Kohei KaiGai wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

 For example, if primary key of the remote table is Text data type,
 an idea is to use a hash table to track the text-formed primary
 being associated with a particular 48-bits integer.

 Even if we had a hash collision, each hash entry can have the original
 key itself to be compared. But anyway, I love the idea to support
 an opaque pointer to track particular remote-row rather.

 Me too.

 Do we have some other reasonable ideas?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 Is it possible to utilize ctid field to move a private pointer?
 TID data type is internally represented as a pointer to
 ItemPointerData,
 so it has enough width to track an opaque formed remote-row
 identifier;
 including string, int64 or others.

 One disadvantage is ctid system column shows a nonsense value
 when user explicitly references this system column. But it does not
 seems to me a fundamental problem, because we didn't give any
 special meaning on the ctid field of foreign table.

 I can't say if (ab)using the field that way would cause other
 problems, but I don't think that nonsense values are a problem.
 The pointer would stay the same for the duration of the foreign
 scan, which I think is as good a ctid for a foreign table as
 anybody should reasonably ask.

 BTW, I see the following comment in htup.h:

  * t_self and t_tableOid should be valid if the HeapTupleData points to
  * a disk buffer, or if it represents a copy of a tuple on disk.  They
  * should be explicitly set invalid in manufactured tuples.

 I don't know if invalid means zero in that case.

ItemPointerSetInvalid is declared as follows:

/*
 * ItemPointerSetInvalid
 *  Sets a disk item pointer to be invalid.
 */
#define ItemPointerSetInvalid(pointer) \
( \
AssertMacro(PointerIsValid(pointer)), \
BlockIdSet(((pointer)-ip_blkid), InvalidBlockNumber), \
(pointer)-ip_posid = InvalidOffsetNumber \
)

Since ItemPointerGetBlockNumber() and ItemPointerGetOffsetNumber()
checks whether the given ItemPointer is valid, FDWs may have to put
a dummy ItemPointerData on head of their private datum to avoid
the first 6-bytes having zero.

For example, the following data structure is safe to carry an opaque
datum without false-positive of invalid ctid.

typedef struct {
ItemPointerData   dumm
char *pk_of_remote_table;
} my_pseudo_rowid;

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 Would it be too invasive to introduce a new pointer in TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

It's not.

 Is it possible to utilize ctid field to move a private pointer?

UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
TID from scan to modify --- in fact, most of the time what the modify
step is going to get is a virtual TupleTableSlot that hasn't even
*got* a physical CTID field.

Instead, the planner arranges for the TID to be carried up as an
explicit resjunk column named ctid.  (Currently this is done in
rewriteTargetListUD(), but see also preptlist.c which does some related
things for SELECT FOR UPDATE.)

I'm inclined to think that what we need here is for FDWs to be able to
modify the details of that behavior, at least to the extent of being
able to specify a different data type than TID for the row
identification column.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 It seems to me TargetEntry of the parse tree can inform us which column
 should be modified on UPDATE or INSERT. If it has just a Var element
 that reference original table as-is, it means here is no change.

Only if you're not going to support BEFORE triggers modifying the row...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread David Fetter
On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
  It seems to me TargetEntry of the parse tree can inform us which column
  should be modified on UPDATE or INSERT. If it has just a Var element
  that reference original table as-is, it means here is no change.
 
 Only if you're not going to support BEFORE triggers modifying the row...

+1 for supporting these.

Speaking of triggers on foreign tables, what's needed to support them
independent of support at the FDW level for writing on foreign tables,
or does that even make sense?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/28 David Fetter da...@fetter.org:
 On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
  It seems to me TargetEntry of the parse tree can inform us which column
  should be modified on UPDATE or INSERT. If it has just a Var element
  that reference original table as-is, it means here is no change.

 Only if you're not going to support BEFORE triggers modifying the row...

 +1 for supporting these.

 Speaking of triggers on foreign tables, what's needed to support them
 independent of support at the FDW level for writing on foreign tables,
 or does that even make sense?

I agree with trigger support on foreign tables is definitely useful feature,
even though it does not have capability to replace the writable foreign
table functionality.

In case when foreign-table definition does not contain a column mapped
with primary-key column in remote-side, the trigger function cannot
determine which row should be updated / deleted.
It is a situation that FDW driver should track a particular remote-row using
its identifier.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread David Fetter
On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
 2012/8/28 David Fetter da...@fetter.org:
  On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
  Kohei KaiGai kai...@kaigai.gr.jp writes:
   It seems to me TargetEntry of the parse tree can inform us
   which column should be modified on UPDATE or INSERT. If it has
   just a Var element that reference original table as-is, it
   means here is no change.
 
  Only if you're not going to support BEFORE triggers modifying the
  row...
 
  +1 for supporting these.
 
  Speaking of triggers on foreign tables, what's needed to support
  them independent of support at the FDW level for writing on
  foreign tables, or does that even make sense?
 
 I agree with trigger support on foreign tables is definitely useful
 feature, even though it does not have capability to replace the
 writable foreign table functionality.

With utmost respect, trigger support does make it possible to write to
foreign tables using a whole-row comparison with the effect that all
whole-row matches would be affected.  This is how DBI-Link does it
currently.

 In case when foreign-table definition does not contain a column
 mapped with primary-key column in remote-side, the trigger function
 cannot determine which row should be updated / deleted.  It is a
 situation that FDW driver should track a particular remote-row using
 its identifier.

Generated identifiers and whole-row matching are two ways to approach
this.  There are likely others, especially in cases where people have
special knowledge of the remote source.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/28 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Would it be too invasive to introduce a new pointer in TupleTableSlot
 that is NULL for anything but virtual tuples from foreign tables?

 I'm not certain whether the duration of TupleTableSlot is enough to
 carry a private datum between scan and modify stage.

 It's not.

 Is it possible to utilize ctid field to move a private pointer?

 UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
 TID from scan to modify --- in fact, most of the time what the modify
 step is going to get is a virtual TupleTableSlot that hasn't even
 *got* a physical CTID field.

 Instead, the planner arranges for the TID to be carried up as an
 explicit resjunk column named ctid.  (Currently this is done in
 rewriteTargetListUD(), but see also preptlist.c which does some related
 things for SELECT FOR UPDATE.)

 I'm inclined to think that what we need here is for FDWs to be able to
 modify the details of that behavior, at least to the extent of being
 able to specify a different data type than TID for the row
 identification column.

Hmm. It seems to me a straight-forward solution rather than ab-use
of ctid system column. Probably, cstring data type is more suitable
to carry a private datum between scan and modify stage.

One problem I noticed is how FDW driver returns an extra field that
is in neither system nor regular column.
Number of columns and its data type are defined with TupleDesc of
the target foreign-table, so we also need a feature to extend it on
run-time. For example, FDW driver may have to be able to extend
a virtual column with cstring data type, even though the target
foreign table does not have such a column.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread Kohei KaiGai
2012/8/28 David Fetter da...@fetter.org:
 On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
 2012/8/28 David Fetter da...@fetter.org:
  On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
  Kohei KaiGai kai...@kaigai.gr.jp writes:
   It seems to me TargetEntry of the parse tree can inform us
   which column should be modified on UPDATE or INSERT. If it has
   just a Var element that reference original table as-is, it
   means here is no change.
 
  Only if you're not going to support BEFORE triggers modifying the
  row...
 
  +1 for supporting these.
 
  Speaking of triggers on foreign tables, what's needed to support
  them independent of support at the FDW level for writing on
  foreign tables, or does that even make sense?
 
 I agree with trigger support on foreign tables is definitely useful
 feature, even though it does not have capability to replace the
 writable foreign table functionality.

 With utmost respect, trigger support does make it possible to write to
 foreign tables using a whole-row comparison with the effect that all
 whole-row matches would be affected.  This is how DBI-Link does it
 currently.

 In case when foreign-table definition does not contain a column
 mapped with primary-key column in remote-side, the trigger function
 cannot determine which row should be updated / deleted.  It is a
 situation that FDW driver should track a particular remote-row using
 its identifier.

 Generated identifiers and whole-row matching are two ways to approach
 this.  There are likely others, especially in cases where people have
 special knowledge of the remote source.

One major problem is how to carry the generated identifiers on run-time,
even though we have no slot except for system and regular columns
defined in TupleDesc of the target foreign tables.
It may need a feature to expand TupleDesc on demand.

Of course, I don't deny the benefit of trigger support on foreign-tables.
Both writable-feature and trigger-support can be supported simultaneously.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-28 Thread David Fetter
On Tue, Aug 28, 2012 at 06:08:59PM +0200, Kohei KaiGai wrote:
 2012/8/28 David Fetter da...@fetter.org:
  On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
  2012/8/28 David Fetter da...@fetter.org:
   On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
   Kohei KaiGai kai...@kaigai.gr.jp writes:
It seems to me TargetEntry of the parse tree can inform us
which column should be modified on UPDATE or INSERT. If it has
just a Var element that reference original table as-is, it
means here is no change.
  
   Only if you're not going to support BEFORE triggers modifying the
   row...
  
   +1 for supporting these.
 
  Generated identifiers and whole-row matching are two ways to approach
  this.  There are likely others, especially in cases where people have
  special knowledge of the remote source.
 
 One major problem is how to carry the generated identifiers on run-time,
 even though we have no slot except for system and regular columns
 defined in TupleDesc of the target foreign tables.
 It may need a feature to expand TupleDesc on demand.

Could be.  You know a lot more about the implementation details than I do.

 Of course, I don't deny the benefit of trigger support on foreign-tables.
 Both writable-feature and trigger-support can be supported simultaneously.

Do you see these as independent features, or is there some essential
overlap?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-27 Thread Albe Laurenz
Kohei KaiGai wrote:
 2012/8/25 Robert Haas robertmh...@gmail.com:
 On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai kai...@kaigai.gr.jp
wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

 It strikes me as incredibly short-sighted to decide that the row
 identifier has to have the same format as what our existing heap AM
 happens to have.  I think we need to allow the row identifier to be
of
 any data type, and even compound.  For example, the foreign side
might
 have no equivalent of CTID, and thus use primary key.  And the
primary
 key might consist of an integer and a string, or some such.

 I assume it is a task of FDW extension to translate between the pseudo
 ctid and the primary key in remote side.
 
 For example, if primary key of the remote table is Text data type, an
idea
 is to use a hash table to track the text-formed primary being
associated
 with a particular 48-bits integer. The pseudo ctid shall be utilized
to track
 the tuple to be modified on the scan-stage, then FDW can reference the
 hash table to pull-out the primary key to be provided on the prepared
 statement.

And what if there is a hash collision?  Then you would not be able to
determine which row is meant.

I agree with Robert that this should be flexible enough to cater for
all kinds of row identifiers.  Oracle, for example, uses ten byte
identifiers which would give me a headache with your suggested design.

 Do we have some other reasonable ideas?

Would it be too invasive to introduce a new pointer in TupleTableSlot
that is NULL for anything but virtual tuples from foreign tables?

Yours,
Laurenz Albe



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-27 Thread Shigeru HANADA
Kaigai-san,

On Thu, Aug 23, 2012 at 2:10 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The patched portion at contrib/file_fdw.c does not make sense
 actually. It just prints messages for each invocation.
 It is just a proof-of-concept to show possibility of implementation
 based on real RDBMS.

Attached is a tar ball of pgsql_fdw.  It's WIP and contains no
document, but it would be enough for your PoC purpose.  Usage and
features are same as the last version posted for 9.2 cycle.
# I'll post finished patch in the CF-Sep.

Here are random comments for your PoC patch:

+ As Robert says, using CTID as virtual tuple identifier doesn't seem
nice when considering various FDWs for NoSQL or RDBMS.  Having abstract
layer between FDWs and tuple sounds better, but implementing it by each
FDW seems useless effort.  Do yo have any idea of generic mechanism for
tuple mapping?

+ Do you have any plan about deparsing local qualifiers into remote
query to avoid repeated query submission?  This would improve
performance of big UPDATE, but its use case might be limited to
statements which consist of one foreign table.  For this case, we can
consider pass-through mode as second way.

+ I have not read your patch closely yet, but I wonder how we can know
which column is actually updated.  If we have only updated image of
tuple, we have to update all remote columns by new values?

-- 
Shigeru Hanada


pgsql_fdw_93.tar.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-25 Thread Robert Haas
On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

It strikes me as incredibly short-sighted to decide that the row
identifier has to have the same format as what our existing heap AM
happens to have.  I think we need to allow the row identifier to be of
any data type, and even compound.  For example, the foreign side might
have no equivalent of CTID, and thus use primary key.  And the primary
key might consist of an integer and a string, or some such.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-08-25 Thread Kohei KaiGai
2012/8/25 Robert Haas robertmh...@gmail.com:
 On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 It is a responsibility of FDW extension (and DBA) to ensure each
 foreign-row has a unique identifier that has 48-bits width integer
 data type in maximum.

 It strikes me as incredibly short-sighted to decide that the row
 identifier has to have the same format as what our existing heap AM
 happens to have.  I think we need to allow the row identifier to be of
 any data type, and even compound.  For example, the foreign side might
 have no equivalent of CTID, and thus use primary key.  And the primary
 key might consist of an integer and a string, or some such.

I assume it is a task of FDW extension to translate between the pseudo
ctid and the primary key in remote side.

For example, if primary key of the remote table is Text data type, an idea
is to use a hash table to track the text-formed primary being associated
with a particular 48-bits integer. The pseudo ctid shall be utilized to track
the tuple to be modified on the scan-stage, then FDW can reference the
hash table to pull-out the primary key to be provided on the prepared
statement.

Do we have some other reasonable ideas?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [v9.3] writable foreign tables

2012-08-22 Thread Kohei KaiGai
Hello,

The attached patch is just a proof-of-concept of writable foreign table
feature; thus, I don't expect it getting merged at the upcoming commit
fest. The purpose of this patch is to find out the best way to support
write stuff in FDW.

Basic idea of this patch is to utilize ctid field to track an identifier of
a particular foreign-row; to be updated or deleted. It shall be moved
to the modify-stage from the scan-stage as regular table doing.
Then, newly added methods being invoked at ExecUpdate or
ExecDelete with the pseudo ctid, so FDW shall be able to modify
the target foreign-row.
It is a responsibility of FDW extension (and DBA) to ensure each
foreign-row has a unique identifier that has 48-bits width integer
data type in maximum. In case of pgsql_fdw, ctid of remote table
can perform as pseudo ctid, as is. For other RDBMS, DBA will
need to indicate which column should perform.
INSERT is simple enough; all we need to do it to carry every field
of new tuple into the remote side.

This patch adds five new methods of FdwRoutine structure.
The BeginForeignModify and EndForeignModify give a chance
to initialize / destruct a private state that can be allocated on
ResultRelInfo. As literal, ExecForeignInsert, ExecForeignDelete
and ExecForeignUpdate are invoked for each new tuple, instead
of heap_*() for regular tables. If NULL was set on them, it means
this FDW does not support these operations.

I intend FDW to set up a prepared statement that modifies
a particular remote-row being identified with pseudo-ctid,
at the BeginForeignModify(). Then, ExecForeign*() kicks
the prepared statement with given pseudo-ctid.

The patched portion at contrib/file_fdw.c does not make sense
actually. It just prints messages for each invocation.
It is just a proof-of-concept to show possibility of implementation
based on real RDBMS.

In case when file_fdw performs behalf on ftbl.

postgres=# SELECT ctid, * FROM ftbl;
  ctid  |  a  |  b
+-+-
 (0,1)  | 101 | aaa
 (0,2)  | 102 | bbb
 (0,3)  | 103 | ccc
 (0,4)  | 104 | ddd
 (0,5)  | 105 | eee
 (0,6)  | 106 | fff
 (0,7)  | 107 | ggg
 (0,8)  | 108 | hhh
 (0,9)  | 109 | iii
 (0,10) | 110 | jjj
(10 rows)
 == ctid is used to carray identifier of row; line number in this example.

postgres=# UPDATE ftbl SET a = a + 1 WHERE a  107;
INFO:  ftbl is the target relation of UPDATE
INFO:  fdw_file: BeginForeignModify method
INFO:  fdw_file: UPDATE (lineno = 8)
INFO:  fdw_file: UPDATE (lineno = 9)
INFO:  fdw_file: UPDATE (lineno = 10)
INFO:  fdw_file: EndForeignModify method
UPDATE 3
postgres=# DELETE FROM ftbl WHERE a BETWEEN 103 AND 106;
INFO:  ftbl is the target relation of DELETE
INFO:  fdw_file: BeginForeignModify method
INFO:  fdw_file: DELETE (lineno = 3)
INFO:  fdw_file: DELETE (lineno = 4)
INFO:  fdw_file: DELETE (lineno = 5)
INFO:  fdw_file: DELETE (lineno = 6)
INFO:  fdw_file: EndForeignModify method
DELETE 4


This patch does not care about transaction control anyway.
According to the discussion in developer meeting at Ottawa,
I didn't include such a complex stuff in the first step.
(Probably, we can implement using XactCallback...)

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


pgsql-v9.3-writable-fdw-poc.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers