2012/8/27 Albe Laurenz <laurenz.a...@wien.gv.at>:
> Kohei KaiGai wrote:
>> 2012/8/25 Robert Haas <robertmh...@gmail.com>:
>>> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:
>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>> data type in maximum.
>
>>> It strikes me as incredibly short-sighted to decide that the row
>>> identifier has to have the same format as what our existing heap AM
>>> happens to have.  I think we need to allow the row identifier to be of
>>> any data type, and even compound.  For example, the foreign side might
>>> have no equivalent of CTID, and thus use primary key.  And the primary
>>> key might consist of an integer and a string, or some such.
>
>> I assume it is a task of FDW extension to translate between the pseudo
>> ctid and the primary key in remote side.
>>
>> For example, if primary key of the remote table is Text data type, an idea
>> is to use a hash table to track the text-formed primary being associated
>> with a particular 48-bits integer.  The pseudo ctid shall be utilized to track
>> the tuple to be modified on the scan-stage, then FDW can reference the
>> hash table to pull-out the primary key to be provided on the prepared
>> statement.
>
> And what if there is a hash collision?  Then you would not be able to
> determine which row is meant.
>
Even if we had a hash collision, each hash entry can hold the original
key itself, so the keys can be compared. But anyway, I would rather
support an opaque pointer to track a particular remote row.
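
The collision handling described above can be sketched in plain C. This is
only an illustration under stated assumptions, not FDW code: the names
KeyMap, KeyEntry, keymap_intern, keymap_lookup and NBUCKETS are hypothetical,
the pseudo identifier is assumed to be handed out from a counter (so two keys
never share one), and malloc stands in for whatever allocator the extension
would really use. A real extension would more likely build on the backend's
own dynahash facility.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8                                  /* hypothetical; small so chains form */
#define MAX_PSEUDO_ID ((UINT64_C(1) << 48) - 1)     /* largest 48-bit value */

/* One remote row: its text primary key and the pseudo id assigned to it. */
typedef struct KeyEntry {
    char            *key;        /* private copy of the remote primary key */
    uint64_t         pseudo_id;  /* always fits in 48 bits; 0 is never assigned */
    struct KeyEntry *next;       /* chain of entries sharing a bucket */
} KeyEntry;

typedef struct KeyMap {
    KeyEntry *buckets[NBUCKETS];
    uint64_t  next_id;           /* last id handed out */
} KeyMap;

/* FNV-1a string hash; any reasonable hash function would do here. */
static uint32_t hash_key(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

/*
 * Scan stage: return the pseudo id for a key, assigning a new one the first
 * time the key is seen. Keys that land in the same bucket are told apart by
 * comparing the stored original key. Returns 0 on failure.
 */
static uint64_t keymap_intern(KeyMap *map, const char *key)
{
    uint32_t  b = hash_key(key) % NBUCKETS;
    KeyEntry *e;

    for (e = map->buckets[b]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->pseudo_id;

    if (map->next_id >= MAX_PSEUDO_ID)
        return 0;                          /* 48-bit id space exhausted */
    e = malloc(sizeof(*e));
    if (e == NULL)
        return 0;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->key, key);
    e->pseudo_id = ++map->next_id;
    e->next = map->buckets[b];
    map->buckets[b] = e;
    return e->pseudo_id;
}

/*
 * Modify stage: recover the primary key from a pseudo id. Linear scan for
 * brevity; a real implementation would keep a second index keyed by id.
 */
static const char *keymap_lookup(const KeyMap *map, uint64_t id)
{
    for (int b = 0; b < NBUCKETS; b++)
        for (const KeyEntry *e = map->buckets[b]; e != NULL; e = e->next)
            if (e->pseudo_id == id)
                return e->key;
    return NULL;
}
```

With a zero-initialized KeyMap, the scan stage would call keymap_intern()
for each fetched row and emit the returned value as the pseudo ctid, and the
modify stage would call keymap_lookup() to get the key for the prepared
statement. Note that this still limits the identifier to 48 bits, which is
the restriction objected to above.
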
> I agree with Robert that this should be flexible enough to cater for
> all kinds of row identifiers.  Oracle, for example, uses ten byte
> identifiers which would give me a headache with your suggested design.
>
>> Do we have some other reasonable ideas?
>
> Would it be too invasive to introduce a new pointer in TupleTableSlot
> that is NULL for anything but virtual tuples from foreign tables?
>
I'm not certain whether the lifetime of a TupleTableSlot is long enough
to carry a private datum between the scan and modify stages.
For example, the TupleTableSlot is cleared in ExecNestLoop before the
slot is delivered to ExecModifyTable.

postgres=# EXPLAIN UPDATE t1 SET b = 'abcd' WHERE a IN (SELECT x FROM
t2 WHERE x % 2 = 0);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Update on t1  (cost=0.00..54.13 rows=6 width=16)
   ->  Nested Loop  (cost=0.00..54.13 rows=6 width=16)
         ->  Seq Scan on t2  (cost=0.00..28.45 rows=6 width=10)
               Filter: ((x % 2) = 0)
         ->  Index Scan using t1_pkey on t1  (cost=0.00..4.27 rows=1 width=10)
               Index Cond: (a = t2.x)
(6 rows)

Is it possible to use the ctid field to carry a private pointer?
The TID data type is internally represented as a pointer to
ItemPointerData, so it is wide enough to reference an opaque remote-row
identifier of any form, including a string, an int64 or others.

One disadvantage is that the "ctid" system column shows a nonsense value
when the user references it explicitly. But that does not seem to me a
fundamental problem, because we have not given any special meaning to
the "ctid" field of a foreign table.

Thanks,
-- 
KaiGai Kohei <kai...@kaigai.gr.jp>