Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > On 2015/05/13 6:22, Tom Lane wrote: >> Of course, if we don't do that then we still have your original gripe >> about ctid not being correct in EvalPlanQual results. I poked at this >> a bit and it seems like we could arrange to pass ctid through even in >> the ROW_MARK_COPY case, if we define the t_ctid field of a composite >> Datum as being the thing to use. > I did the same thing when creating the first version of the patch [1]. > In addition, I made another change to ForeignNext so that the FDWs to > get ctid coming out as the same value (0, 0) instead of (4294967295,0) > before and after an EvalPlanQual recheck. But IIRC, I think both of > them were rejected by you ... Ah, right. AFAIR, people objected to the other things you'd done in that patch, and I'd still say that the change in nodeForeignscan.c is unnecessary. But the idea of using t_ctid to solve the problem for the ROW_MARK_COPY code path seems reasonable enough. >> We could fix that by >> adding an ItemPointerSetInvalid(&(td->t_ctid)) call to heap_form_tuple, >> but I dunno if we want to add even a small number of cycles for this >> purpose to such a core function. > I thought so too when creating the first version. On the other hand, that would only be three additional instructions on typical machines, which is surely in the noise compared to the rest of heap_form_tuple, and you could argue that it's a bug fix: leaving the t_ctid field with an apparently valid value is not very appropriate. So after thinking about it I think we ought to do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/13 6:22, Tom Lane wrote: I wrote: I did a very basic update of your postgres_fdw patch to test this with, and attach that so that you don't have to repeat the effort. I'm not sure whether we want to try to convert that into something committable. I'm afraid that the extra round trips involved in doing row locking this way will be so expensive that no one really wants it for postgres_fdw. It's more credible that FDWs operating against local storage would have use for it. Of course, if we don't do that then we still have your original gripe about ctid not being correct in EvalPlanQual results. I poked at this a bit and it seems like we could arrange to pass ctid through even in the ROW_MARK_COPY case, if we define the t_ctid field of a composite Datum as being the thing to use. The attached WIP, mostly-comment-free patch seems to fix the original test case. It would likely result in ctid coming out as (0,0) not (4294967295,0) for FDWs that don't take any special steps to return a valid ctid, as a result of the fact that heap_form_tuple just leaves that field zeroed. +1 I did the same thing when creating the first version of the patch [1]. In addition, I made another change to ForeignNext so that the FDWs to get ctid coming out as the same value (0, 0) instead of (4294967295,0) before and after an EvalPlanQual recheck. But IIRC, I think both of them were rejected by you ... We could fix that by adding an ItemPointerSetInvalid(&(td->t_ctid)) call to heap_form_tuple, but I dunno if we want to add even a small number of cycles for this purpose to such a core function. I thought so too when creating the first version. Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/54b7691b.5080...@lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/13 3:24, Tom Lane wrote: Etsuro Fujita writes: On 2015/05/12 7:42, Tom Lane wrote: I don't much like the division of labor between LockForeignRow and FetchForeignRow. In principle that would lead to not one but two extra remote accesses per locked row in SELECT FOR UPDATE, at least in the case that an EvalPlanQual recheck is required. (I see that in your prototype patch for postgres_fdw you attempt to avoid that by saving a retrieved tuple in LockForeignRow and then returning it in FetchForeignRow, but that seems extremely ugly and bug-prone, since there is nothing in the API spec insisting that those calls match up one-to-one.) The fact that locking and fetching a tuple are separate operations in the heapam API is a historical artifact that probably doesn't make sense to duplicate in the FDW API. I understand your concern about the postgres_fdw patch. However, I think we should divide this into the two routines as the core patch does, because I think that would allow an FDW author to implement these routines so as to improve the efficiency for scenarios that seldom need fetching, by not retrieving and saving locked tuples in LockForeignRow. I find it hard to envision a situation where an FDW could lock a row without being able to fetch its contents more or less for free. We have IIRC discussed changing the way that works even for heapam, since the current design requires multiple buffer lock/unlock cycles which aren't free either. In any case, I think that the temptation to do probably- buggy stuff like what you did in your prototype would be too strong for most people, and that we're much better off with a simpler one-step API. OK Another problem is that we fail to detect whether an EvalPlanQual recheck is required during a SELECT FOR UPDATE on a remote table, which we certainly ought to do if the objective is to make postgres_fdw semantics match the local ones. I think that is interesting in theory, but I'm not sure that is worth much in practice. Hm, well, AFAICS the entire point of this patch is to make it possible for FDWs to duplicate the semantics used for local tables, so I'm not sure why you'd want to ignore that aspect of it. Sorry, my explanation was not correct. For me, the objective was not necessarily to make it possible for FDWs to duplicate the semantics, but to make it possible for FDWs to improve the efficiency of SELECT FOR UPDATE on foreign tables (and UPDATE/DELETE involving foreign tables as source relations) by making it possible for FDWs to duplicate the semantics. And I didn't think that the idea of detecting such a thing would be in itself directly useful for the improved efficiency. Maybe I'm missing something, though. Anyway, I redid the patch along those lines, improved the documentation, and have committed it. Thanks for improving and committing the patch. I'm so happy with the committed version. I did a very basic update of your postgres_fdw patch to test this with, and attach that so that you don't have to repeat the effort. I'm not sure whether we want to try to convert that into something committable. I'm afraid that the extra round trips involved in doing row locking this way will be so expensive that no one really wants it for postgres_fdw. It's more credible that FDWs operating against local storage would have use for it. I think so too. And thanks for updating the patch. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
I wrote: > I did a very basic update of your postgres_fdw patch to test this with, > and attach that so that you don't have to repeat the effort. I'm not sure > whether we want to try to convert that into something committable. I'm > afraid that the extra round trips involved in doing row locking this way > will be so expensive that no one really wants it for postgres_fdw. It's > more credible that FDWs operating against local storage would have use > for it. Of course, if we don't do that then we still have your original gripe about ctid not being correct in EvalPlanQual results. I poked at this a bit and it seems like we could arrange to pass ctid through even in the ROW_MARK_COPY case, if we define the t_ctid field of a composite Datum as being the thing to use. The attached WIP, mostly-comment-free patch seems to fix the original test case. It would likely result in ctid coming out as (0,0) not (4294967295,0) for FDWs that don't take any special steps to return a valid ctid, as a result of the fact that heap_form_tuple just leaves that field zeroed. We could fix that by adding an ItemPointerSetInvalid(&(td->t_ctid)) call to heap_form_tuple, but I dunno if we want to add even a small number of cycles for this purpose to such a core function. regards, tom lane diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 0c44260..2350de3 100644 *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** make_tuple_from_result_row(PGresult *res *** 2965,2971 --- 2965,2974 tuple = heap_form_tuple(tupdesc, values, nulls); if (ctid) + { tuple->t_self = *ctid; + tuple->t_data->t_ctid = *ctid; + } /* Clean up */ MemoryContextReset(temp_context); diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 43d3c44..4c39e06 100644 *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** EvalPlanQualFetchRowMarks(EPQState *epqs *** 2613,2621 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); - ItemPointerSetInvalid(&(tuple.t_self)); /* relation might be a foreign table, if so provide tableoid */ tuple.t_tableOid = erm->relid; tuple.t_data = td; /* copy and store tuple */ --- 2613,2622 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); /* relation might be a foreign table, if so provide tableoid */ tuple.t_tableOid = erm->relid; + /* also copy t_ctid in case somebody supplied it */ + tuple.t_self = td->t_ctid; tuple.t_data = td; /* copy and store tuple */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > On 2015/05/12 7:42, Tom Lane wrote: >> I don't much like the division of labor between LockForeignRow and >> FetchForeignRow. In principle that would lead to not one but two >> extra remote accesses per locked row in SELECT FOR UPDATE, at least >> in the case that an EvalPlanQual recheck is required. (I see that >> in your prototype patch for postgres_fdw you attempt to avoid that >> by saving a retrieved tuple in LockForeignRow and then returning it >> in FetchForeignRow, but that seems extremely ugly and bug-prone, >> since there is nothing in the API spec insisting that those calls match >> up one-to-one.) The fact that locking and fetching a tuple are separate >> operations in the heapam API is a historical artifact that probably >> doesn't make sense to duplicate in the FDW API. > I understand your concern about the postgres_fdw patch. However, I > think we should divide this into the two routines as the core patch > does, because I think that would allow an FDW author to implement these > routines so as to improve the efficiency for scenarios that seldom need > fetching, by not retrieving and saving locked tuples in LockForeignRow. I find it hard to envision a situation where an FDW could lock a row without being able to fetch its contents more or less for free. We have IIRC discussed changing the way that works even for heapam, since the current design requires multiple buffer lock/unlock cycles which aren't free either. In any case, I think that the temptation to do probably- buggy stuff like what you did in your prototype would be too strong for most people, and that we're much better off with a simpler one-step API. An additional advantage of the way I changed this is that it makes the logic in nodeLockRows simpler too; we no longer need the very messy hack added by commit 2db576ba8c449fca. >> Another problem is that we fail to detect whether an EvalPlanQual recheck >> is required during a SELECT FOR UPDATE on a remote table, which we >> certainly ought to do if the objective is to make postgres_fdw semantics >> match the local ones. > I think that is interesting in theory, but I'm not sure that is worth > much in practice. Hm, well, AFAICS the entire point of this patch is to make it possible for FDWs to duplicate the semantics used for local tables, so I'm not sure why you'd want to ignore that aspect of it. Anyway, I redid the patch along those lines, improved the documentation, and have committed it. I did a very basic update of your postgres_fdw patch to test this with, and attach that so that you don't have to repeat the effort. I'm not sure whether we want to try to convert that into something committable. I'm afraid that the extra round trips involved in doing row locking this way will be so expensive that no one really wants it for postgres_fdw. It's more credible that FDWs operating against local storage would have use for it. regards, tom lane diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 0c44260..a122c9e 100644 *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** typedef struct PgFdwRelationInfo *** 88,93 --- 88,95 * * 1) SELECT statement text to be sent to the remote server * 2) Integer list of attribute numbers retrieved by the SELECT + * 3) SELECT statement text to be sent to the remote server + * 4) Integer list of attribute numbers retrieved by the SELECT * * These items are indexed with the enum FdwScanPrivateIndex, so an item * can be fetched with list_nth(). For example, to get the SELECT statement: *** enum FdwScanPrivateIndex *** 98,104 /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ ! FdwScanPrivateRetrievedAttrs }; /* --- 100,110 /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ ! FdwScanPrivateRetrievedAttrs, ! /* SQL statement to execute remotely (as a String node) */ ! FdwScanPrivateSelectSql2, ! /* Integer list of attribute numbers retrieved by SELECT */ ! FdwScanPrivateRetrievedAttrs2 }; /* *** typedef struct PgFdwModifyState *** 186,191 --- 192,221 } PgFdwModifyState; /* + * Execution state for fetching/locking foreign rows. + */ + typedef struct PgFdwFetchState + { + Relation rel; /* relcache entry for the foreign table */ + AttInMetadata *attinmeta; /* attribute datatype conversion metadata */ + + /* for remote query execution */ + PGconn *conn; /* connection for the fetch */ + char *p_name; /* name of prepared statement, if created */ + + /* extracted fdw_private data */ + char *query; /* text of SELECT command */ + List *retrieved_attrs; /* attr numbers retriev
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/12 7:42, Tom Lane wrote: > On further consideration ... Thanks for the consideration! > I don't much like the division of labor between LockForeignRow and > FetchForeignRow. In principle that would lead to not one but two > extra remote accesses per locked row in SELECT FOR UPDATE, at least > in the case that an EvalPlanQual recheck is required. (I see that > in your prototype patch for postgres_fdw you attempt to avoid that > by saving a retrieved tuple in LockForeignRow and then returning it > in FetchForeignRow, but that seems extremely ugly and bug-prone, > since there is nothing in the API spec insisting that those calls match > up one-to-one.) The fact that locking and fetching a tuple are separate > operations in the heapam API is a historical artifact that probably > doesn't make sense to duplicate in the FDW API. I understand your concern about the postgres_fdw patch. However, I think we should divide this into the two routines as the core patch does, because I think that would allow an FDW author to implement these routines so as to improve the efficiency for scenarios that seldom need fetching, by not retrieving and saving locked tuples in LockForeignRow. > Another problem is that we fail to detect whether an EvalPlanQual recheck > is required during a SELECT FOR UPDATE on a remote table, which we > certainly ought to do if the objective is to make postgres_fdw semantics > match the local ones. I think that is interesting in theory, but I'm not sure that is worth much in practice. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On further consideration ... I don't much like the division of labor between LockForeignRow and FetchForeignRow. In principle that would lead to not one but two extra remote accesses per locked row in SELECT FOR UPDATE, at least in the case that an EvalPlanQual recheck is required. (I see that in your prototype patch for postgres_fdw you attempt to avoid that by saving a retrieved tuple in LockForeignRow and then returning it in FetchForeignRow, but that seems extremely ugly and bug-prone, since there is nothing in the API spec insisting that those calls match up one-to-one.) The fact that locking and fetching a tuple are separate operations in the heapam API is a historical artifact that probably doesn't make sense to duplicate in the FDW API. Another problem is that we fail to detect whether an EvalPlanQual recheck is required during a SELECT FOR UPDATE on a remote table, which we certainly ought to do if the objective is to make postgres_fdw semantics match the local ones. What I'm inclined to do is merge LockForeignRow and FetchForeignRow into one operation, which would have the semantics of returning a palloc'd tuple, or NULL on a SKIP LOCKED failure, and it would be expected to acquire a lock according to erm->markType (in particular, no lock just a re-fetch for ROW_MARK_REFERENCE). In addition it needs some way of reporting that the returned row is an updated version rather than the original. Probably just an extra output boolean would do for that. This'd require some refactoring in nodeLockRows, because we'd want to be able to hang onto the returned tuple without necessarily provoking an EvalPlanQual cycle, but it doesn't look too bad. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
I wrote: > Etsuro Fujita writes: >> On 2015/05/11 8:50, Tom Lane wrote: >>> I wonder if we should instead add a "ScanState*" field and expect the >>> core code to set that up (ExecOpenScanRelation could do it with minor >>> API changes...). >> Sorry, I don't understand clearly what you mean, but that (the idea of >> expecting the core to set it up) sounds inconsistent with your comment >> on the earlier version of the API "BeginForeignFetch" [1]. > Well, the other way that it could work would be for the FDW's > BeginForeignScan routine to search for a relevant ExecRowMark entry > (which, per that previous discussion, it'd likely need to do anyway) and > then plug a back-link to the ForeignScanState into the ExecRowMark. > But I don't see any very good reason to make every FDW that's concerned > with this do that, rather than doing it once in the core code. I'm also > thinking that having a link to the source scan node there might be useful > someday for regular tables as well as FDWs. And on the third hand ... in view of the custom/foreign join pushdown stuff that just went in, it would be a mistake to assume that there necessarily is a distinct source scan node associated with each ExecRowMark. The FDW can presumably find all the ExecRowMarks associated with the rels that a join ForeignScan is scanning, but we can't assume that ExecOpenScanRelation will be able to set up those links, and the FDW might not want a simple link to the join scan node anyway. So it's probably best to leave it as an unspecified void* instead of trying to nail down the meaning more precisely. I still don't much like calling it "fdw_state" though, because I don't think it should be documented as only for the use of FDWs. (Custom scans might have a use for a pointer field here too, for example.) Maybe just call it "void *extra" and document it as being for the use of whatever plan node is sourcing the relation's tuples. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > On 2015/05/11 8:50, Tom Lane wrote: >> In particular, I find the addition of "void *fdw_state" to ExecRowMark >> to be pretty questionable. That does not seem like a good place to keep >> random state. (I realize that WHERE CURRENT OF keeps some state in >> ExecRowMark, but that's a crock not something to emulate.) ISTM that in >> most scenarios, the state that LockForeignRow/FetchForeignRow would like >> to get at is probably the FDW state associated with the ForeignScanState >> that the tuple came from. Which this API doesn't help with particularly. >> I wonder if we should instead add a "ScanState*" field and expect the >> core code to set that up (ExecOpenScanRelation could do it with minor >> API changes...). > Sorry, I don't understand clearly what you mean, but that (the idea of > expecting the core to set it up) sounds inconsistent with your comment > on the earlier version of the API "BeginForeignFetch" [1]. Well, the other way that it could work would be for the FDW's BeginForeignScan routine to search for a relevant ExecRowMark entry (which, per that previous discussion, it'd likely need to do anyway) and then plug a back-link to the ForeignScanState into the ExecRowMark. But I don't see any very good reason to make every FDW that's concerned with this do that, rather than doing it once in the core code. I'm also thinking that having a link to the source scan node there might be useful someday for regular tables as well as FDWs. >> Also, as I mentioned, I'd be a whole lot happier if we had a way to test >> this... > Attached is a postgres_fdw patch that I used for the testing. If you > try it, edit postgresGetForeignRowMarkType as necessary. I have to > confess that I did the testing only in the normal conditions by the patch. Thanks, this is helpful. However, it pretty much proves my point about wanting the back-link --- it seems like init_foreign_fetch_state, for example, is uselessly repeating a lot of the setup already done for the scan node itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/05/11 8:50, Tom Lane wrote: > Etsuro Fujita writes: >> [ EvalPlanQual-v6.patch ] > > I've started to study this in a little more detail, and I'm not terribly > happy with some of the API decisions in it. Thanks for taking the time to review the patch! > In particular, I find the addition of "void *fdw_state" to ExecRowMark > to be pretty questionable. That does not seem like a good place to keep > random state. (I realize that WHERE CURRENT OF keeps some state in > ExecRowMark, but that's a crock not something to emulate.) ISTM that in > most scenarios, the state that LockForeignRow/FetchForeignRow would like > to get at is probably the FDW state associated with the ForeignScanState > that the tuple came from. Which this API doesn't help with particularly. > I wonder if we should instead add a "ScanState*" field and expect the > core code to set that up (ExecOpenScanRelation could do it with minor > API changes...). Sorry, I don't understand clearly what you mean, but that (the idea of expecting the core to set it up) sounds inconsistent with your comment on the earlier version of the API "BeginForeignFetch" [1]. > I'm also a bit tempted to pass the TIDs to LockForeignRow and > FetchForeignRow as Datums not ItemPointers. We have the Datum format > available already at the call sites, so this is free as far as the core > code is concerned, and would only cost another line or so for the FDWs. > This is by no means sufficient to allow FDWs to use some other type than > "tid" for row identifiers; but it would be a down payment on that problem, > and at least would avoid nailing the rowids-are-tids assumption into yet > another global API. That is a good idea. > Also, as I mentioned, I'd be a whole lot happier if we had a way to test > this... Attached is a postgres_fdw patch that I used for the testing. If you try it, edit postgresGetForeignRowMarkType as necessary. I have to confess that I did the testing only in the normal conditions by the patch. Sorry for the delay. I took a vacation until yesterday. Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/14504.1428446...@sss.pgh.pa.us *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 88,93 typedef struct PgFdwRelationInfo --- 88,95 * * 1) SELECT statement text to be sent to the remote server * 2) Integer list of attribute numbers retrieved by the SELECT + * 3) SELECT statement text to be sent to the remote server + * 4) Integer list of attribute numbers retrieved by the SELECT * * These items are indexed with the enum FdwScanPrivateIndex, so an item * can be fetched with list_nth(). For example, to get the SELECT statement: *** *** 98,104 enum FdwScanPrivateIndex /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ ! FdwScanPrivateRetrievedAttrs }; /* --- 100,110 /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ ! FdwScanPrivateRetrievedAttrs, ! /* SQL statement to execute remotely (as a String node) */ ! FdwScanPrivateSelectSql2, ! /* Integer list of attribute numbers retrieved by SELECT */ ! FdwScanPrivateRetrievedAttrs2 }; /* *** *** 186,191 typedef struct PgFdwModifyState --- 192,223 } PgFdwModifyState; /* + * Execution state for fetching/locking foreign rows. + */ + typedef struct PgFdwFetchState + { + Relation rel; /* relcache entry for the foreign table */ + AttInMetadata *attinmeta; /* attribute datatype conversion metadata */ + + /* for remote query execution */ + PGconn *conn; /* connection for the fetch */ + char *p_name; /* name of prepared statement, if created */ + + /* extracted fdw_private data */ + char *query; /* text of SELECT command */ + List *retrieved_attrs; /* attr numbers retrieved by SELECT */ + + /* info about parameters for prepared statement */ + int p_nums; /* number of parameters to transmit */ + FmgrInfo *p_flinfo; /* output conversion functions for them */ + + HeapTuple locked_tuple; + + /* working memory context */ + MemoryContext temp_cxt; /* context for per-tuple temporary data */ + } PgFdwFetchState; + + /* * Workspace for analyzing a foreign table. */ typedef struct PgFdwAnalyzeState *** *** 276,281 static TupleTableSlot *postgresExecForeignDelete(EState *estate, --- 308,320 static void postgresEndForeignModify(EState *estate, ResultRelInfo *resultRelInfo); static int postgresIsForeignRelUpdatable(Relation rel); + static RowMarkType postgresGetForeignRowMarkType(LockClauseStrength strength); + static bool postgresLockForeignRow(EState *estate, + ExecRowMark *erm, + ItemPointer tupleid); + static HeapTuple
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > [ EvalPlanQual-v6.patch ] I've started to study this in a little more detail, and I'm not terribly happy with some of the API decisions in it. In particular, I find the addition of "void *fdw_state" to ExecRowMark to be pretty questionable. That does not seem like a good place to keep random state. (I realize that WHERE CURRENT OF keeps some state in ExecRowMark, but that's a crock not something to emulate.) ISTM that in most scenarios, the state that LockForeignRow/FetchForeignRow would like to get at is probably the FDW state associated with the ForeignScanState that the tuple came from. Which this API doesn't help with particularly. I wonder if we should instead add a "ScanState*" field and expect the core code to set that up (ExecOpenScanRelation could do it with minor API changes...). I'm also a bit tempted to pass the TIDs to LockForeignRow and FetchForeignRow as Datums not ItemPointers. We have the Datum format available already at the call sites, so this is free as far as the core code is concerned, and would only cost another line or so for the FDWs. This is by no means sufficient to allow FDWs to use some other type than "tid" for row identifiers; but it would be a down payment on that problem, and at least would avoid nailing the rowids-are-tids assumption into yet another global API. Thoughts? Also, as I mentioned, I'd be a whole lot happier if we had a way to test this... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > BTW, I revised docs a bit. Attached is an updated version of the patch. I started to look at this and realized that it only touches the core code and not postgres_fdw, which seems odd --- doesn't that mean the new feature can't be tested? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Robert Haas writes: > Tom, you're listed as the committer for this in the CF app. Are you > still planning to take care of this? > It seems that time is beginning to run short. Yeah, I will address this (and start looking at GROUPING SETS) next week. I'm out of town right now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On Thu, Apr 16, 2015 at 2:55 AM, Etsuro Fujita wrote: > Ah, you are right. FOR NO KEY UPDATE and FOR KEY SHARE would be useful in > the Postgres FDW if we assume the user performs those properly based on > information about keys for a remote table. > > Sorry, my explanation was not correct, but I want to make it clear that the > proposed patch also allows the FDW to change the behavior of FOR NO KEY > UPDATE and/or FOR KEY SHARE row locking so as to match the local semantics > exactly. > > BTW, I revised docs a bit. Attached is an updated version of the patch. Tom, you're listed as the committer for this in the CF app. Are you still planning to take care of this? It seems that time is beginning to run short. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/04/15 2:27, Jim Nasby wrote: On 4/14/15 1:05 AM, Kyotaro HORIGUCHI wrote: As an example, the following operations cause an "unexpected" result. Those results are indeed surprising, but since we allow it in a direct connection I don't see why we wouldn't allow it in the Postgres FDW... As for the FDW not knowing about keys, why would it need to? If you try to do something illegal it's the remote side that should throw the error, not the FDW. Of course, if you try to do a locking operation on an FDW that doesn't support it, the FDW should throw an error... but that's different. Ah, you are right. FOR NO KEY UPDATE and FOR KEY SHARE would be useful in the Postgres FDW if we assume the user performs those properly based on information about keys for a remote table. Sorry, my explanation was not correct, but I want to make it clear that the proposed patch also allows the FDW to change the behavior of FOR NO KEY UPDATE and/or FOR KEY SHARE row locking so as to match the local semantics exactly. BTW, I revised docs a bit. Attached is an updated version of the patch. Best regards, Etsuro Fujita *** a/doc/src/sgml/fdwhandler.sgml --- b/doc/src/sgml/fdwhandler.sgml *** *** 211,216 BeginForeignScan (ForeignScanState *node, --- 211,232 + If the FDW changes handling SELECT FOR UPDATE/SHARE row + locking, this routine should perform any initialization needed prior to + the actual row locking if the foreign table is referenced in a + SELECT FOR UPDATE/SHARE. To decide whether the foreign + table is referenced in the command or not, it's recommended to use + ExecFindRowMark with the missing_ok argument set to true, + which returns an ExecRowMark struct if the foreign table + is referenced in the command and otherwise returns a NULL. Information + about the row locking is accessible through the + ExecRowMark struct. + (The fdw_state field of ExecRowMark is + available for the FDW to store any private state it needs for the row + locking.) + + + Note that when (eflags & EXEC_FLAG_EXPLAIN_ONLY) is true, this function should not perform any externally-visible actions; it should only do the minimum required to make the node state valid *** *** 598,603 IsForeignRelUpdatable (Relation rel); --- 614,701 + + FDW Routines For SELECT FOR UPDATE/SHARE row locking + + + + If an FDW supports SELECT FOR UPDATE/SHARE row locking, + it should provide the following callback functions: + + + + + RowMarkType + GetForeignRowMarkType (LockClauseStrength strength); + + + Report which row-marking option to support for a lock strength + associated with a SELECT FOR UPDATE/SHARE request. + This is called at the beginning of query planning. + strength is a member of the LockClauseStrength + enum type. + The return value should be a member of the RowMarkType + enum type. See + src/include/nodes/lockoptions.h and + src/include/nodes/plannodes.h for information about + these enum types. + + + + If the GetForeignRowMarkType pointer is set to + NULL, the default option is selected for any lock strength, + and both LockForeignRow and FetchForeignRow + described below will not be called at query execution time. + + + + + bool + LockForeignRow (EState *estate, + ExecRowMark *erm, + ItemPointer tupleid); + + + Lock one tuple in the foreign table. + estate is global execution state for the query. + erm is the ExecRowMark struct describing + the target foreign table. + tupleid identifies the tuple to be locked. + This function should return true, if the FDW lock the tuple + successfully. Otherwise, return false. + + + + If the LockForeignRow pointer is set to + NULL, attempts to lock rows will fail + with an error message. + + + + + HeapTuple + FetchForeignRow (EState *estate, + ExecRowMark *erm, + ItemPointer tupleid); + + + Fetch one tuple from the foreign table. + estate is global execution state for the query. + erm is the ExecRowMark struct describing + the target foreign table. + tupleid identifies the tuple to be fetched. + This function should return the fetched tuple, if the FDW fetch the + tuple successfully. Otherwise, return NULL. + + + + If the FetchForeignRow pointer is set to + NULL, attempts to lock rows will fail + with an error message. + + + + FDW Routines for EXPLAIN *** *** 1011,1017 GetForeignServerByName(const char *name, bool missing_ok); join conditions. However, matching the local semantics exactly would require an additional remote access f
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 4/14/15 1:05 AM, Kyotaro HORIGUCHI wrote: As an example, the following operations cause an "unexpected" result. Ex. 1 Session A=# create table t (a int primary key, b int); Session A=# insert into t (select a, 1 from generate_series(0, 99) a); Session A=# begin; Session A=# select * from t where a = 1 for no key update; Session B=# begin; Session B=# select * from t where a = 1 for key share; Session B=# update t set a = -a where a = 1; Ex. 2 Session A=# create table t (a int, b int); -- a is the key in mind. Session A=# insert into t (select a, 1 from generate_series(0, 99) a); Session A=# begin; Session A=# select * from t where a = 1 for no key update; Session B=# begin; Session B=# select * from t where a = 1 for key share; Session A=# update t set a = -a where a = 1; UPDATE 1 Session A=# commit; Session B=# select * from t where a = 1; (0 rows) -- Woops. Those results are indeed surprising, but since we allow it in a direct connection I don't see why we wouldn't allow it in the Postgres FDW... As for the FDW not knowing about keys, why would it need to? If you try to do something illegal it's the remote side that should throw the error, not the FDW. Of course, if you try to do a locking operation on an FDW that doesn't support it, the FDW should throw an error... but that's different. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Hello, At Tue, 14 Apr 2015 12:10:35 +0900, Etsuro Fujita wrote in <552c852b.2050...@lab.ntt.co.jp> > On 2015/04/13 23:25, Jim Nasby wrote: > > On 4/13/15 4:58 AM, Etsuro Fujita wrote: > >> On 2015/04/10 21:40, Etsuro Fujita wrote: > >>> On 2015/04/09 12:07, Etsuro Fujita wrote: > I'll update the patch, which will contain only an infrastructure for > this in the PG core, and will not contain any postgres_fdw change. > >>> > >>> I updated the patch based on your comments. Updated patch attached. In > >>> the patch the following FDW APIs have been proposed: > >>> > >>> + RowMarkType > >>> + GetForeignRowMarkType (LockClauseStrength strength); > >>> > >>> + bool > >>> + LockForeignRow (EState *estate, > >>> + ExecRowMark *erm, > >>> + ItemPointer tupleid); > >>> > >>> + HeapTuple > >>> + FetchForeignRow (EState *estate, > >>> + ExecRowMark *erm, > >>> + ItemPointer tupleid); > >>> > >>> I think that these APIs allow the FDW that has TIDs to use the rowmark > >>> options such as ROW_MARK_REFERENCE, ROW_MARK_SHARE and > >>> ROW_MARK_EXCLUSIVE for its foreign tables so as to match the local > >>> semantics exactly, for example. > >>> > >>> As you mentioned, it would be better to add helper functions to see > >>> whether the foreign table is referenced by any ExecRowMarks. ISTM that > >>> an easy way to do that is to modify ExecFindRowMark() so that it allows > >>> for the missing case. I didn't contain such functions in the patch, > >>> though. > >> > >> I added that function and modified docs a bit. Please find attached an > >> updated version of the patch. > > > > Why aren't we allowing SELECT FOR KEY SHARE? > > I omitted that mode (and the FOR NO KEY UPDATE mode) for simplicity, but > both modes have been allowed. However, I'm not sure if those modes are > useful because we don't have information about keys for a remote table. If I understand this correctly, the two lock modes are no relation with key column definitions, and are provided as a weaker lock in exchange for some risks. Like advisory locks. we can FOR_NO_KEY_UPDATE in the trunsactions that evetually update "key columns" but also should accept the unexpected result. In other words, the "key" in the context of "FOR NO KEY UPDATE" and "FOR KEY SHARE" is only in the programmers' minds. Apart from feasibility, I don't see no resason to omit them if this is correct. As an example, the following operations cause an "unexpected" result. Ex. 1 Session A=# create table t (a int primary key, b int); Session A=# insert into t (select a, 1 from generate_series(0, 99) a); Session A=# begin; Session A=# select * from t where a = 1 for no key update; Session B=# begin; Session B=# select * from t where a = 1 for key share; Session B=# update t set a = -a where a = 1; Ex. 2 Session A=# create table t (a int, b int); -- a is the key in mind. Session A=# insert into t (select a, 1 from generate_series(0, 99) a); Session A=# begin; Session A=# select * from t where a = 1 for no key update; Session B=# begin; Session B=# select * from t where a = 1 for key share; Session A=# update t set a = -a where a = 1; UPDATE 1 Session A=# commit; Session B=# select * from t where a = 1; (0 rows) -- Woops. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/04/13 23:25, Jim Nasby wrote: > On 4/13/15 4:58 AM, Etsuro Fujita wrote: >> On 2015/04/10 21:40, Etsuro Fujita wrote: >>> On 2015/04/09 12:07, Etsuro Fujita wrote: I'll update the patch, which will contain only an infrastructure for this in the PG core, and will not contain any postgres_fdw change. >>> >>> I updated the patch based on your comments. Updated patch attached. In >>> the patch the following FDW APIs have been proposed: >>> >>> + RowMarkType >>> + GetForeignRowMarkType (LockClauseStrength strength); >>> >>> + bool >>> + LockForeignRow (EState *estate, >>> + ExecRowMark *erm, >>> + ItemPointer tupleid); >>> >>> + HeapTuple >>> + FetchForeignRow (EState *estate, >>> + ExecRowMark *erm, >>> + ItemPointer tupleid); >>> >>> I think that these APIs allow the FDW that has TIDs to use the rowmark >>> options such as ROW_MARK_REFERENCE, ROW_MARK_SHARE and >>> ROW_MARK_EXCLUSIVE for its foreign tables so as to match the local >>> semantics exactly, for example. >>> >>> As you mentioned, it would be better to add helper functions to see >>> whether the foreign table is referenced by any ExecRowMarks. ISTM that >>> an easy way to do that is to modify ExecFindRowMark() so that it allows >>> for the missing case. I didn't contain such functions in the patch, though. >> >> I added that function and modified docs a bit. Please find attached an >> updated version of the patch. > > Why aren't we allowing SELECT FOR KEY SHARE? I omitted that mode (and the FOR NO KEY UPDATE mode) for simplicity, but both modes have been allowed. However, I'm not sure if those modes are useful because we don't have information about keys for a remote table. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 4/13/15 4:58 AM, Etsuro Fujita wrote: > On 2015/04/10 21:40, Etsuro Fujita wrote: >> On 2015/04/09 12:07, Etsuro Fujita wrote: >>> I'll update the patch, which will contain only an infrastructure for >>> this in the PG core, and will not contain any postgres_fdw change. >> >> I updated the patch based on your comments. Updated patch attached. In >> the patch the following FDW APIs have been proposed: >> >> + RowMarkType >> + GetForeignRowMarkType (LockClauseStrength strength); >> >> + bool >> + LockForeignRow (EState *estate, >> + ExecRowMark *erm, >> + ItemPointer tupleid); >> >> + HeapTuple >> + FetchForeignRow (EState *estate, >> + ExecRowMark *erm, >> + ItemPointer tupleid); >> >> I think that these APIs allow the FDW that has TIDs to use the rowmark >> options such as ROW_MARK_REFERENCE, ROW_MARK_SHARE and >> ROW_MARK_EXCLUSIVE for its foreign tables so as to match the local >> semantics exactly, for example. >> >> As you mentioned, it would be better to add helper functions to see >> whether the foreign table is referenced by any ExecRowMarks. ISTM that >> an easy way to do that is to modify ExecFindRowMark() so that it allows >> for the missing case. I didn't contain such functions in the patch, though. > > I added that function and modified docs a bit. Please find attached an > updated version of the patch. Why aren't we allowing SELECT FOR KEY SHARE? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/04/10 21:40, Etsuro Fujita wrote: > On 2015/04/09 12:07, Etsuro Fujita wrote: >> I'll update the patch, which will contain only an infrastructure for >> this in the PG core, and will not contain any postgres_fdw change. > > I updated the patch based on your comments. Updated patch attached. In > the patch the following FDW APIs have been proposed: > > + RowMarkType > + GetForeignRowMarkType (LockClauseStrength strength); > > + bool > + LockForeignRow (EState *estate, > + ExecRowMark *erm, > + ItemPointer tupleid); > > + HeapTuple > + FetchForeignRow (EState *estate, > + ExecRowMark *erm, > + ItemPointer tupleid); > > I think that these APIs allow the FDW that has TIDs to use the rowmark > options such as ROW_MARK_REFERENCE, ROW_MARK_SHARE and > ROW_MARK_EXCLUSIVE for its foreign tables so as to match the local > semantics exactly, for example. > > As you mentioned, it would be better to add helper functions to see > whether the foreign table is referenced by any ExecRowMarks. ISTM that > an easy way to do that is to modify ExecFindRowMark() so that it allows > for the missing case. I didn't contain such functions in the patch, though. I added that function and modified docs a bit. Please find attached an updated version of the patch. Best regards, Etsuro Fujita *** a/doc/src/sgml/fdwhandler.sgml --- b/doc/src/sgml/fdwhandler.sgml *** *** 211,216 BeginForeignScan (ForeignScanState *node, --- 211,230 + If the FDW supports SELECT FOR UPDATE/SHARE row locking, + this routine should perform any initialization needed prior to the + actual row locking if the foreign table is referenced in a + SELECT FOR UPDATE/SHARE. To decide whether the foreign + table is referenced in a SELECT FOR UPDATE/SHARE or not, + it's recommended to use ExecFindRowMark with the missing_ok + argument allowing a NULL return if the structure is not. Information + about the row locking is accessible through its return value + (ExecRowMark struct). (The fdw_state + field of ExecRowMark is available for the FDW to store + any private state it needs for the operation.) + + + Note that when (eflags & EXEC_FLAG_EXPLAIN_ONLY) is true, this function should not perform any externally-visible actions; it should only do the minimum required to make the node state valid *** *** 598,603 IsForeignRelUpdatable (Relation rel); --- 612,699 + + FDW Routines For SELECT FOR UPDATE/SHARE row locking + + + + If an FDW supports SELECT FOR UPDATE/SHARE row locking, + it should provide the following callback functions: + + + + + RowMarkType + GetForeignRowMarkType (LockClauseStrength strength); + + + Report which row-marking option to support for a lock strength + associated with a SELECT FOR UPDATE/SHARE request. + This is called at the beginning of query planning. + strength is a member of the LockClauseStrength + enum type. + The return value should be a member of the RowMarkType + enum type. See + src/include/nodes/lockoptions.h and + src/include/nodes/plannodes.h for information about + these enum types. + + + + If the GetForeignRowMarkType pointer is set to + NULL, the default option is selected for any lock strength, + and both LockForeignRow and FetchForeignRow + described below will not be called at query execution time. + + + + + bool + LockForeignRow (EState *estate, + ExecRowMark *erm, + ItemPointer tupleid); + + + Lock one tuple in the foreign table. + estate is global execution state for the query. + erm is the ExecRowMark struct describing + the target foreign table. + tupleid identifies the tuple to be locked. + This function should return true, if the FDW lock the tuple + successfully. Otherwise, return false. + + + + If the LockForeignRow pointer is set to + NULL, attempts to lock rows will fail + with an error message. + + + + + HeapTuple + FetchForeignRow (EState *estate, + ExecRowMark *erm, + ItemPointer tupleid); + + + Fetch one tuple from the foreign table. + estate is global execution state for the query. + erm is the ExecRowMark struct describing + the target foreign table. + tupleid identifies the tuple to be fetched. + This function should return the fetched tuple, if the FDW fetch the + tuple successfully. Otherwise, return NULL. + + + + If the FetchForeignRow pointer is set to + NULL, attempts to lock rows will fail + with an error message. + + + + FDW Routines for EXPLAIN *** *** 1011,1017 GetForeignServerByNa
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/04/08 7:44, Tom Lane wrote: > Etsuro Fujita writes: >> To support ROW_MARK_REFERENCE on (postgres_fdw) foreign tables, I'd like >> to propose the following FDW APIs: > >> RowMarkType >> GetForeignRowMarkType(Oid relid, >> LockClauseStrength strength); > >> Decide which rowmark type to use for a foreign table (that has strength >> = LCS_NONE), ie, ROW_MARK_REFERENCE or ROW_MARK_COPY. (For now, the >> second argument takes LCS_NONE only, but is intended to be used for the >> possible extension to the other cases.) This is called during >> select_rowmark_type() in the planner. > > Why would you restrict that to LCS_NONE? Seems like the point is to give > the FDW control of the rowmark behavior, not only partial control. The reason is because I think it's comparatively more promissing to customize the ROW_MARK type for LCS_NONE than other cases since in many workloads no re-fetch is needed, and because I think other cases would need more discussions. So, as a first step, I restricted that to LCS_NONE. But I've got the point, and agree that we should give the full control to the FDW. > (For the same reason I disagree with the error check in the patch that > restricts which ROW_MARK options this function can return. If the FDW has > TIDs, seems like it could reasonably use whatever options tables can use.) Will fix. >> void >> BeginForeignFetch(EState *estate, >> ExecRowMark *erm, >> List *fdw_private, >> int eflags); > >> Begin a remote fetch. This is called during InitPlan() in the executor. > > The begin/end functions seem like useless extra mechanism. Why wouldn't > the FDW just handle this during its regular scan setup? It could look to > see whether the foreign table is referenced by any ExecRowMarks (possibly > there's room to add some convenience functions to help with that). What's > more, that design would make it simpler if the basic row fetch needs to be > modified. The reason is just because it's easy to understand the structure at least to me since the begin/exec/end are all done in a higher level of the executor. What's more, the begin/end can be done once per foreign scan node for the multi-table update case. But I feel inclined to agree with you on this point also. >> And I'd also like to propose to add a table/server option, >> row_mark_reference, to postgres_fdw. When a user sets the option to >> true for eg a foreign table, ROW_MARK_REFERENCE will be used for the >> table, not ROW_MARK_COPY. > > Why would we leave that in the hands of the user? Hardly anybody is going > to know what to do with the option, or even that they should do something > with it. It's basically only of value for debugging AFAICS, Agreed. (When begining to update postgres_fdw docs, I also started to think so.) I'll update the patch, which will contain only an infrastructure for this in the PG core, and will not contain any postgres_fdw change. Thank you for taking the time to review the patch! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > To support ROW_MARK_REFERENCE on (postgres_fdw) foreign tables, I'd like > to propose the following FDW APIs: > RowMarkType > GetForeignRowMarkType(Oid relid, >LockClauseStrength strength); > Decide which rowmark type to use for a foreign table (that has strength > = LCS_NONE), ie, ROW_MARK_REFERENCE or ROW_MARK_COPY. (For now, the > second argument takes LCS_NONE only, but is intended to be used for the > possible extension to the other cases.) This is called during > select_rowmark_type() in the planner. Why would you restrict that to LCS_NONE? Seems like the point is to give the FDW control of the rowmark behavior, not only partial control. (For the same reason I disagree with the error check in the patch that restricts which ROW_MARK options this function can return. If the FDW has TIDs, seems like it could reasonably use whatever options tables can use.) > void > BeginForeignFetch(EState *estate, >ExecRowMark *erm, >List *fdw_private, >int eflags); > Begin a remote fetch. This is called during InitPlan() in the executor. The begin/end functions seem like useless extra mechanism. Why wouldn't the FDW just handle this during its regular scan setup? It could look to see whether the foreign table is referenced by any ExecRowMarks (possibly there's room to add some convenience functions to help with that). What's more, that design would make it simpler if the basic row fetch needs to be modified. > And I'd also like to propose to add a table/server option, > row_mark_reference, to postgres_fdw. When a user sets the option to > true for eg a foreign table, ROW_MARK_REFERENCE will be used for the > table, not ROW_MARK_COPY. Why would we leave that in the hands of the user? Hardly anybody is going to know what to do with the option, or even that they should do something with it. It's basically only of value for debugging AFAICS, but if we expose an option we're going to be stuck with supporting it forever. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/13 0:50, Tom Lane wrote: I think the real fix as far as postgres_fdw is concerned is in fact to let it adopt a different ROW_MARK strategy, since it has meaningful ctid values. However, that is not a one-size-fits-all answer. The tableoid problem can be fixed much less invasively as per the attached patch. I think that we should continue to assume that ctid is not meaningful (and hence should read as (4294967295,0)) in FDWs that use ROW_MARK_COPY, and press forward on fixing the locking issues for postgres_fdw by letting it use ROW_MARK_REFERENCE or something close to that. That would also cause ctid to read properly for rows from postgres_fdw. To support ROW_MARK_REFERENCE on (postgres_fdw) foreign tables, I'd like to propose the following FDW APIs: RowMarkType GetForeignRowMarkType(Oid relid, LockClauseStrength strength); Decide which rowmark type to use for a foreign table (that has strength = LCS_NONE), ie, ROW_MARK_REFERENCE or ROW_MARK_COPY. (For now, the second argument takes LCS_NONE only, but is intended to be used for the possible extension to the other cases.) This is called during select_rowmark_type() in the planner. void BeginForeignFetch(EState *estate, ExecRowMark *erm, List *fdw_private, int eflags); Begin a remote fetch. This is called during InitPlan() in the executor. HeapTuple ExecForeignFetch(EState *estate, ExecRowMark *erm, ItemPointer tupleid); Re-fetch the specified tuple. This is called during EvalPlanQualFetchRowMarks() in the executor. void EndForeignFetch(EState *estate, ExecRowMark *erm); End a remote fetch. This is called during ExecEndPlan() in the executor. And I'd also like to propose to add a table/server option, row_mark_reference, to postgres_fdw. When a user sets the option to true for eg a foreign table, ROW_MARK_REFERENCE will be used for the table, not ROW_MARK_COPY. Attached is a WIP patch, which contains no docs/regression tests. It'd be appreciated if anyone could send back any comments earlier. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/option.c --- b/contrib/postgres_fdw/option.c *** *** 105,111 postgres_fdw_validator(PG_FUNCTION_ARGS) * Validate option value, when we can do so without any context. */ if (strcmp(def->defname, "use_remote_estimate") == 0 || ! strcmp(def->defname, "updatable") == 0) { /* these accept only boolean values */ (void) defGetBoolean(def); --- 105,112 * Validate option value, when we can do so without any context. */ if (strcmp(def->defname, "use_remote_estimate") == 0 || ! strcmp(def->defname, "updatable") == 0 || ! strcmp(def->defname, "row_mark_reference") == 0) { /* these accept only boolean values */ (void) defGetBoolean(def); *** *** 153,158 InitPgFdwOptions(void) --- 154,162 /* updatable is available on both server and table */ {"updatable", ForeignServerRelationId, false}, {"updatable", ForeignTableRelationId, false}, + /* row_mark_reference is available on both server and table */ + {"row_mark_reference", ForeignServerRelationId, false}, + {"row_mark_reference", ForeignTableRelationId, false}, {NULL, InvalidOid, false} }; *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 124,129 enum FdwModifyPrivateIndex --- 124,144 }; /* + * Similarly, this enum describes what's kept in the fdw_private list for + * a PlanRowMark node referencing a postgres_fdw foreign table. We store: + * + * 1) SELECT statement text to be sent to the remote server + * 2) Integer list of attribute numbers retrieved by SELECT + */ + enum FdwFetchPrivateIndex + { + /* SQL statement to execute remotely (as a String node) */ + FdwFetchPrivateSelectSql, + /* Integer list of attribute numbers retrieved by SELECT */ + FdwFetchPrivateRetrievedAttrs + }; + + /* * Execution state of a foreign scan using postgres_fdw. */ typedef struct PgFdwScanState *** *** 186,191 typedef struct PgFdwModifyState --- 201,230 } PgFdwModifyState; /* + * Execution state of a foreign fetch operation. + */ + typedef struct PgFdwFetchState + { + Relation rel; /* relcache entry for the foreign table */ + AttInMetadata *attinmeta; /* attribute datatype conversion metadata */ + + /* for remote query execution */ + PGconn *conn; /* connection for the fetch */ + char *p_name; /* name of prepared statement, if created */ + + /* extracted fdw_private data */ + char *query; /* text of SELECT command */ + List *retrieved_attrs; /* attr numbers retrieved by SELECT */ + + /* info about parameters for prepared statement */ + int p_nums; /* number of parameters to transmit */ + FmgrInfo *p_flinfo; /* output conversion funct
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/25 4:56, Tom Lane wrote: > Etsuro Fujita writes: >> Let me explain further. Here is the comment in ExecOpenScanRelation: > >>* Determine the lock type we need. First, scan to see if target >> relation >>* is a result relation. If not, check if it's a FOR UPDATE/FOR SHARE >>* relation. In either of those cases, we got the lock already. > >> I think this is not true for foreign tables selected FOR UPDATE/SHARE, >> which have markType = ROW_MARK_COPY, because such foreign tables don't >> get opened/locked by InitPlan. Then such foreign tables don't get >> locked by neither of InitPlan nor ExecOpenScanRelation. I think this is >> a bug. > > You are right. I think it may not matter in practice, but if the executor > is taking its own locks here then it should not overlook ROW_MARK_COPY > cases. > >> To fix it, I think we should open/lock such foreign tables at >> InitPlan as the original patch does. > > I still don't like that; InitPlan is not doing something that would > require physical table access. The right thing is to fix > ExecOpenScanRelation's idea of whether InitPlan took a lock or not, > which I've now done. OK, thanks. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > Let me explain further. Here is the comment in ExecOpenScanRelation: > * Determine the lock type we need. First, scan to see if target > relation > * is a result relation. If not, check if it's a FOR UPDATE/FOR SHARE > * relation. In either of those cases, we got the lock already. > I think this is not true for foreign tables selected FOR UPDATE/SHARE, > which have markType = ROW_MARK_COPY, because such foreign tables don't > get opened/locked by InitPlan. Then such foreign tables don't get > locked by neither of InitPlan nor ExecOpenScanRelation. I think this is > a bug. You are right. I think it may not matter in practice, but if the executor is taking its own locks here then it should not overlook ROW_MARK_COPY cases. > To fix it, I think we should open/lock such foreign tables at > InitPlan as the original patch does. I still don't like that; InitPlan is not doing something that would require physical table access. The right thing is to fix ExecOpenScanRelation's idea of whether InitPlan took a lock or not, which I've now done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/13 11:46, Etsuro Fujita wrote: BTW, what do you think about opening/locking foreign tables selected for update at InitPlan, which the original patch does? As I mentioned in [1], ISTM that ExecOpenScanRelation called from ExecInitForeignScan is assuming that. [1] http://www.postgresql.org/message-id/54bcbbf8.3020...@lab.ntt.co.jp Let me explain further. Here is the comment in ExecOpenScanRelation: * Determine the lock type we need. First, scan to see if target relation * is a result relation. If not, check if it's a FOR UPDATE/FOR SHARE * relation. In either of those cases, we got the lock already. I think this is not true for foreign tables selected FOR UPDATE/SHARE, which have markType = ROW_MARK_COPY, because such foreign tables don't get opened/locked by InitPlan. Then such foreign tables don't get locked by neither of InitPlan nor ExecOpenScanRelation. I think this is a bug. To fix it, I think we should open/lock such foreign tables at InitPlan as the original patch does. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/13 0:50, Tom Lane wrote: The tableoid problem can be fixed much less invasively as per the attached patch. I think that we should continue to assume that ctid is not meaningful (and hence should read as (4294967295,0)) in FDWs that use ROW_MARK_COPY, and press forward on fixing the locking issues for postgres_fdw by letting it use ROW_MARK_REFERENCE or something close to that. That would also cause ctid to read properly for rows from postgres_fdw. OK, thanks! BTW, what do you think about opening/locking foreign tables selected for update at InitPlan, which the original patch does? As I mentioned in [1], ISTM that ExecOpenScanRelation called from ExecInitForeignScan is assuming that. Best regards, Etsuro Fujita [1] http://www.postgresql.org/message-id/54bcbbf8.3020...@lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita writes: > On 2015/03/12 13:35, Tom Lane wrote: >> I don't like the execMain.c changes much at all. They look somewhat >> like they're intended to allow foreign tables to adopt a different >> locking strategy, > I didn't intend such a thing. My intention is, foreign tables have > markType = ROW_MARK_COPY as ever, but I might not have correctly > understood what you pointed out. Could you elaborate on that? I think the real fix as far as postgres_fdw is concerned is in fact to let it adopt a different ROW_MARK strategy, since it has meaningful ctid values. However, that is not a one-size-fits-all answer. The fundamental problem I've got with this patch is that it's trying to impose some one-size-fits-all assumptions on all FDWs about whether ctids are meaningful; which is wrong, not to mention not backwards compatible. >> I don't see the need for the change in nodeForeignscan.c. If the FDW has >> returned a physical tuple, it can fix that for itself, while if it has >> returned a virtual tuple, the ctid will be garbage in any case. > If you leave nodeForeignscan.c unchanged, file_fdw would cause the > problem that I pointed out upthread. Here is an example. But that's self-inflicted damage, because in fact ctid correctly shows as invalid in this case in HEAD, without this patch. The tableoid problem can be fixed much less invasively as per the attached patch. I think that we should continue to assume that ctid is not meaningful (and hence should read as (4294967295,0)) in FDWs that use ROW_MARK_COPY, and press forward on fixing the locking issues for postgres_fdw by letting it use ROW_MARK_REFERENCE or something close to that. That would also cause ctid to read properly for rows from postgres_fdw. regards, tom lane diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 33b172b..5a1c3b3 100644 *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** EvalPlanQualFetchRowMarks(EPQState *epqs *** 2438,2444 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); ItemPointerSetInvalid(&(tuple.t_self)); ! tuple.t_tableOid = InvalidOid; tuple.t_data = td; /* copy and store tuple */ --- 2438,2445 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); ItemPointerSetInvalid(&(tuple.t_self)); ! tuple.t_tableOid = getrelid(erm->rti, ! epqstate->estate->es_range_table); tuple.t_data = td; /* copy and store tuple */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/12 13:35, Tom Lane wrote: > I don't like the execMain.c changes much at all. They look somewhat > like they're intended to allow foreign tables to adopt a different > locking strategy, I didn't intend such a thing. My intention is, foreign tables have markType = ROW_MARK_COPY as ever, but I might not have correctly understood what you pointed out. Could you elaborate on that? > The changes are not all consistent > either, eg this: > > ! if (erm->relation && > ! erm->relation->rd_rel->relkind != RELKIND_FOREIGN_TABLE) > > is not necessary if this Assert is accurate: > > ! if (erm->relation) > ! { > ! Assert(erm->relation->rd_rel->relkind == > RELKIND_FOREIGN_TABLE); I modified InitPlan so that relations get opened for foreign tables as well as local tables. So, I think the change would be necessary. > I don't see the need for the change in nodeForeignscan.c. If the FDW has > returned a physical tuple, it can fix that for itself, while if it has > returned a virtual tuple, the ctid will be garbage in any case. If you leave nodeForeignscan.c unchanged, file_fdw would cause the problem that I pointed out upthread. Here is an example. [Create a test environment] postgres=# create foreign table foo (a int) server file_server options (filename '/home/pgsql/foo.csv', format 'csv'); CREATE FOREIGN TABLE postgres=# select tableoid, ctid, * from foo; tableoid | ctid | a --++--- 16459 | (4294967295,0) | 1 (1 row) postgres=# create table tab (a int, b int); CREATE TABLE postgres=# insert into tab values (1, 1); INSERT 0 1 [Run concurrent transactions] In terminal1: postgres=# begin; BEGIN postgres=# update tab set b = b * 2; UPDATE 1 In terminal2: postgres=# select foo.tableoid, foo.ctid, foo.* from foo, tab where foo.a = tab.a for update; In terminal1: postgres=# commit; COMMIT In terminal2:(When committing the UPDATE query in terminal1, psql in terminal2 will display the result for the SELECT-FOR-UPDATE query as shown below.) tableoid | ctid | a --+---+--- 16459 | (0,0) | 1 (1 row) Note the value of the ctid has changed! Rather than changing nodeForeignscan.c, it might be better to change heap_form_tuple to set the t_ctid of a formed tuple to be invalid. Thanks for the review! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Ashutosh Bapat writes: > I will leave this issue for the committer to judge. Changed the status to > "ready for committer". I don't like the execMain.c changes much at all. They look somewhat like they're intended to allow foreign tables to adopt a different locking strategy, but if so they belong in a different patch that actually implements such a thing. The changes are not all consistent either, eg this: ! if (erm->relation && ! erm->relation->rd_rel->relkind != RELKIND_FOREIGN_TABLE) is not necessary if this Assert is accurate: ! if (erm->relation) ! { ! Assert(erm->relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE); I don't see the need for the change in nodeForeignscan.c. If the FDW has returned a physical tuple, it can fix that for itself, while if it has returned a virtual tuple, the ctid will be garbage in any case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On Wed, Mar 11, 2015 at 5:10 PM, Etsuro Fujita wrote: > On 2015/03/11 17:37, Ashutosh Bapat wrote: > >> Now I can reproduce the problem. >> >> Sanity >> >> Patch compiles cleanly and make check passes. The tests in file_fdw and >> postgres_fdw contrib modules pass. >> >> The patch works as expected in the test case reported. >> > > Thanks for the testing! > > I have only one doubt. >> In EvalPlanQualFetchRowMarks(). tuple->t_self is assigned from >> td->t_ctid. CTID or even t_self may be valid for foreign tables based on >> postgres_fdw but may not be valid for other FDWs. So, this assignment >> might put some garbage in t_self, rather we should set it to invalid as >> done prior to the patch. I might have missed some previous thread, we >> decided to go this route, so ignore the comment, in that case. >> > > Good point. As the following code and comment I added to ForeignNext, I > think that FDW authors should initialize the tup->t_data->t_ctid of each > scan tuple with its own TID. If the authors do that, the t_self is > guaranteed to be assigned the right TID from the whole-row Var (ie, > td->t_ctid) in EvalPlanQualFetchRowMarks. > > /* We assume that t_ctid is initialized with its own TID */ > tup->t_self = tup->t_data->t_ctid; > > IMHO, I'm not sure it's worth complicating the code as you mentioned. (I > don't know whether there are any discussions about this before.) > > Note that file_fdw needs no treatment. In that case, in ForeignNext, the > tup->t_data->t_ctid of each scan tuple is initialized with (0,0) (if > necessary), and then the t_self will be correctly assigned (0,0) throguh > the whole-row Var in EvalPlanQualFetchRowMarks. So, no inconsistency! > > I will leave this issue for the committer to judge. Changed the status to "ready for committer". > Apart from this, I do not have any comments here. >> > > Thanks again. > > Best regards, > Etsuro Fujita > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/11 17:37, Ashutosh Bapat wrote: Now I can reproduce the problem. Sanity Patch compiles cleanly and make check passes. The tests in file_fdw and postgres_fdw contrib modules pass. The patch works as expected in the test case reported. Thanks for the testing! I have only one doubt. In EvalPlanQualFetchRowMarks(). tuple->t_self is assigned from td->t_ctid. CTID or even t_self may be valid for foreign tables based on postgres_fdw but may not be valid for other FDWs. So, this assignment might put some garbage in t_self, rather we should set it to invalid as done prior to the patch. I might have missed some previous thread, we decided to go this route, so ignore the comment, in that case. Good point. As the following code and comment I added to ForeignNext, I think that FDW authors should initialize the tup->t_data->t_ctid of each scan tuple with its own TID. If the authors do that, the t_self is guaranteed to be assigned the right TID from the whole-row Var (ie, td->t_ctid) in EvalPlanQualFetchRowMarks. /* We assume that t_ctid is initialized with its own TID */ tup->t_self = tup->t_data->t_ctid; IMHO, I'm not sure it's worth complicating the code as you mentioned. (I don't know whether there are any discussions about this before.) Note that file_fdw needs no treatment. In that case, in ForeignNext, the tup->t_data->t_ctid of each scan tuple is initialized with (0,0) (if necessary), and then the t_self will be correctly assigned (0,0) throguh the whole-row Var in EvalPlanQualFetchRowMarks. So, no inconsistency! Apart from this, I do not have any comments here. Thanks again. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Now I can reproduce the problem. Sanity Patch compiles cleanly and make check passes. The tests in file_fdw and postgres_fdw contrib modules pass. The patch works as expected in the test case reported. I have only one doubt. In EvalPlanQualFetchRowMarks(). tuple->t_self is assigned from td->t_ctid. CTID or even t_self may be valid for foreign tables based on postgres_fdw but may not be valid for other FDWs. So, this assignment might put some garbage in t_self, rather we should set it to invalid as done prior to the patch. I might have missed some previous thread, we decided to go this route, so ignore the comment, in that case. Apart from this, I do not have any comments here. On Mon, Mar 9, 2015 at 4:05 PM, Etsuro Fujita wrote: > On 2015/03/09 16:02, Ashutosh Bapat wrote: > >> I tried reproducing the issue with the steps summarised. >> > > Thanks for the review! > > Here's my setup >> > > Sorry, my explanation was not enough, but such was not my intention. I'll > re-summarize the steps below: > > [Create a test environment] > > $ createdb mydatabase > $ psql mydatabase > mydatabase=# create table mytable (a int); > > $ psql postgres > postgres=# create extension postgres_fdw; > postgres=# create server loopback foreign data wrapper postgres_fdw > options (dbname 'mydatabase'); > postgres=# create user mapping for current_user server loopback; > postgres=# create foreign table ftable (a int) server loopback options > (table_name 'mytable'); > postgres=# insert into ftable values (1); > postgres=# create table ltable (a int, b int); > postgres=# insert into ltable values (1, 1); > > [Run concurrent transactions] > > In terminal1: > $ psql postgres > postgres=# begin; > BEGIN > postgres=# update ltable set b = b * 2; > UPDATE 1 > > In terminal2: > $ psql postgres > postgres=# select tableoid, ctid, * from ftable; > tableoid | ctid | a > --+---+--- > 16394 | (0,1) | 1 > (1 row) > > postgres=# select f.tableoid, f.ctid, f.* from ftable f, ltable l where > f.a = l.a for update; > > In terminal1: > postgres=# commit; > COMMIT > > In terminal2:(When committing the UPDATE query in terminal1, psql in > terminal2 will display the result for the SELECT-FOR-UPDATE query as shown > below.) > tableoid | ctid | a > --++--- > 0 | (4294967295,0) | 1 > (1 row) > > Best regards, > Etsuro Fujita > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/03/09 16:02, Ashutosh Bapat wrote: I tried reproducing the issue with the steps summarised. Thanks for the review! Here's my setup Sorry, my explanation was not enough, but such was not my intention. I'll re-summarize the steps below: [Create a test environment] $ createdb mydatabase $ psql mydatabase mydatabase=# create table mytable (a int); $ psql postgres postgres=# create extension postgres_fdw; postgres=# create server loopback foreign data wrapper postgres_fdw options (dbname 'mydatabase'); postgres=# create user mapping for current_user server loopback; postgres=# create foreign table ftable (a int) server loopback options (table_name 'mytable'); postgres=# insert into ftable values (1); postgres=# create table ltable (a int, b int); postgres=# insert into ltable values (1, 1); [Run concurrent transactions] In terminal1: $ psql postgres postgres=# begin; BEGIN postgres=# update ltable set b = b * 2; UPDATE 1 In terminal2: $ psql postgres postgres=# select tableoid, ctid, * from ftable; tableoid | ctid | a --+---+--- 16394 | (0,1) | 1 (1 row) postgres=# select f.tableoid, f.ctid, f.* from ftable f, ltable l where f.a = l.a for update; In terminal1: postgres=# commit; COMMIT In terminal2:(When committing the UPDATE query in terminal1, psql in terminal2 will display the result for the SELECT-FOR-UPDATE query as shown below.) tableoid | ctid | a --++--- 0 | (4294967295,0) | 1 (1 row) Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Hi Fujita-san, I tried reproducing the issue with the steps summarised. Here's my setup postgres=# \d ft Foreign table "public.ft" Column | Type | Modifiers | FDW Options +-+---+- a | integer | | Server: loopback FDW Options: (table_name 'lbt') postgres=# \d lbt Table "public.lbt" Column | Type | Modifiers +-+--- a | integer | The select (without for update) returns me two rows (one inserted in lbt and one in ft), whereas in your description there is only one row. For me, I am getting following error postgres=# select ft.tableoid, ft.ctid, ft.* from lbt, ft where lbt.a = ft.a for update; ERROR: could not serialize access due to concurrent update CONTEXT: Remote SQL command: SELECT a, ctid FROM public.lbt FOR UPDATE postgres=# after commit on terminal 1. Am I missing something? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Hi Ashutosh, On 2015/02/03 16:44, Ashutosh Bapat wrote: I am having some minor problems running this repro [Terminal 2] postgres=# create foreign table ft (a int) server loopback options (table_name 'lbt'); There isn't any table "lbt" mentioned here. Do you mean "t" here? Sorry, my explanation was not enough. "lbt" means a foreign table named "lbt" defined on a foreign server named "loopback". It'd be defined eg, in the following manner: $ createdb efujita efujita=# create table lbt (a int); CREATE TABLE postgres=# create server loopback foreign data wrapper postgres_fdw options (dbname 'efujita'); CREATE SERVER postgres=# create user mapping for current_user server loopback; CREATE USER MAPPING postgres=# create foreign table ft (a int) server loopback options (table_name 'lbt'); CREATE FOREIGN TABLE Thanks for the review! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Hi Fujita-san, I am having some minor problems running this repro On Thu, Jan 15, 2015 at 12:45 PM, Etsuro Fujita wrote: > Here is an example using postgres_fdw. > > [Terminal 1] > postgres=# create table t (a int, b int); > CREATE TABLE > postgres=# insert into t values (1, 1); > INSERT 0 1 > postgres=# begin; > BEGIN > postgres=# update t set b = b * 2; > UPDATE 1 > > [Terminal 2] > postgres=# create foreign table ft (a int) server loopback options > (table_name 'lbt'); > There isn't any table "lbt" mentioned here. Do you mean "t" here? > CREATE FOREIGN TABLE > postgres=# insert into ft values (1); > INSERT 0 1 > postgres=# select tableoid, ctid, * from ft; > tableoid | ctid | a > --+---+--- > 25092 | (0,1) | 1 > (1 row) > Shouldn't we see two values here one inserted in 't' and one in "ft" > > postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a > for update; > > [Terminal 1] > postgres=# commit; > COMMIT > > [Terminal 2] > postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a > for update; > tableoid | ctid | a > --++--- > 0 | (4294967295,0) | 1 > (1 row) > > Instead of this result, I got following error ERROR: could not serialize access due to concurrent update CONTEXT: Remote SQL command: SELECT a, ctid FROM public.t FOR UPDATE Am I missing something while reproducing the problem? > Note that tableoid and ctid have been changed! > > I think the reason for that is because EvalPlanQualFetchRowMarks doesn't > properly set tableoid and ctid for foreign tables, IIUC. I think this > should be fixed. Please find attached a patch. The patch slightly > relates to [1], so if it is reasonable, I'll update [1] on top of this. > > [1] https://commitfest.postgresql.org/action/patch_view?id=1386 > > Best regards, > Etsuro Fujita > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/01/19 17:10, Etsuro Fujita wrote: Attached is an updated version of the patch. I'll add this to the next CF. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
On 2015/01/16 1:24, Alvaro Herrera wrote: Etsuro Fujita wrote: *** 817,826 InitPlan(QueryDesc *queryDesc, int eflags) --- 818,833 break; case ROW_MARK_COPY: /* there's no real table here ... */ + relkind = rt_fetch(rc->rti, rangeTable)->relkind; + if (relkind == RELKIND_FOREIGN_TABLE) + relid = getrelid(rc->rti, rangeTable); + else + relid = InvalidOid; relation = NULL; break; --- 2326,2342 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); ! /* if relid is valid, rel is a foreign table; set system columns */ ! if (OidIsValid(erm->relid)) ! { ! tuple.t_self = td->t_ctid; ! tuple.t_tableOid = erm->relid; ! } ! else ! { ! ItemPointerSetInvalid(&(tuple.t_self)); ! tuple.t_tableOid = InvalidOid; ! } tuple.t_data = td; I find this arrangement confusing and unnecessary -- surely if you have access to the ExecRowMark here, you could just obtain the relid with RelationGetRelid instead of saving the OID beforehand? I don't think so because we don't have the Relation (ie, erm->relation = NULL) here since InitPlan don't open/lock relations having markType = ROW_MARK_COPY as shown above, which include foreign tables selected for update/share. But I noticed we should open/lock such foreign tables as well in InitPlan because I think ExecOpenScanRelation is assuming that, IIUC. > And if you have the Relation, you could just consult the relkind at that point instead of relying on the relid being set or not as a flag to indicate whether the rel is foreign. Followed your revision. Attached is an updated version of the patch. Thanks for the comment! Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 2947,2953 make_tuple_from_result_row(PGresult *res, tuple = heap_form_tuple(tupdesc, values, nulls); if (ctid) ! tuple->t_self = *ctid; /* Clean up */ MemoryContextReset(temp_context); --- 2947,2953 tuple = heap_form_tuple(tupdesc, values, nulls); if (ctid) ! tuple->t_self = tuple->t_data->t_ctid = *ctid; /* Clean up */ MemoryContextReset(temp_context); *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** *** 795,800 InitPlan(QueryDesc *queryDesc, int eflags) --- 795,801 { PlanRowMark *rc = (PlanRowMark *) lfirst(l); Oid relid; + char relkind; Relation relation; ExecRowMark *erm; *** *** 817,823 InitPlan(QueryDesc *queryDesc, int eflags) break; case ROW_MARK_COPY: /* there's no real table here ... */ ! relation = NULL; break; default: elog(ERROR, "unrecognized markType: %d", rc->markType); --- 818,831 break; case ROW_MARK_COPY: /* there's no real table here ... */ ! relkind = rt_fetch(rc->rti, rangeTable)->relkind; ! if (relkind == RELKIND_FOREIGN_TABLE) ! { ! relid = getrelid(rc->rti, rangeTable); ! relation = heap_open(relid, AccessShareLock); ! } ! else ! relation = NULL; break; default: elog(ERROR, "unrecognized markType: %d", rc->markType); *** *** 1115,1125 CheckValidRowMarkRel(Relation rel, RowMarkType markType) RelationGetRelationName(rel; break; case RELKIND_FOREIGN_TABLE: ! /* Should not get here; planner should have used ROW_MARK_COPY */ ! ereport(ERROR, ! (errcode(ERRCODE_WRONG_OBJECT_TYPE), ! errmsg("cannot lock rows in foreign table \"%s\"", ! RelationGetRelationName(rel; break; default: ereport(ERROR, --- 1123,1134 RelationGetRelationName(rel; break; case RELKIND_FOREIGN_TABLE: ! /* Allow opening/locking a foreign table only if ROW_MARK_COPY */ ! if (markType != ROW_MARK_COPY) ! ereport(ERROR, ! (errcode(ERRCODE_WRONG_OBJECT_TYPE), ! errmsg("cannot lock rows in foreign table \"%s\"", ! RelationGetRelationName(rel; break; default: ereport(ERROR, *** *** 1792,1798 ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist) aerm->rowmark = erm; /* Look up the resjunk columns associated with this rowmark */ ! if (erm->relation) { Assert(erm->markType != ROW_MA
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita wrote: > *** > *** 817,826 InitPlan(QueryDesc *queryDesc, int eflags) > --- 818,833 > break; > case ROW_MARK_COPY: > /* there's no real table here ... */ > + relkind = rt_fetch(rc->rti, > rangeTable)->relkind; > + if (relkind == RELKIND_FOREIGN_TABLE) > + relid = getrelid(rc->rti, rangeTable); > + else > + relid = InvalidOid; > relation = NULL; > break; > default: > elog(ERROR, "unrecognized markType: %d", > rc->markType); > + relid = InvalidOid; > relation = NULL;/* keep compiler quiet > */ > break; > } [ ... ] > --- 2326,2342 > > /* build a temporary HeapTuple control structure */ > tuple.t_len = HeapTupleHeaderGetDatumLength(td); > ! /* if relid is valid, rel is a foreign table; set > system columns */ > ! if (OidIsValid(erm->relid)) > ! { > ! tuple.t_self = td->t_ctid; > ! tuple.t_tableOid = erm->relid; > ! } > ! else > ! { > ! ItemPointerSetInvalid(&(tuple.t_self)); > ! tuple.t_tableOid = InvalidOid; > ! } > tuple.t_data = td; > > /* copy and store tuple */ I find this arrangement confusing and unnecessary -- surely if you have access to the ExecRowMark here, you could just obtain the relid with RelationGetRelid instead of saving the OID beforehand? And if you have the Relation, you could just consult the relkind at that point instead of relying on the relid being set or not as a flag to indicate whether the rel is foreign. I didn't look at anything else in the patch so I can't comment more on it, but the change to ExecRowMark caught my attention. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Here is an example using postgres_fdw. [Terminal 1] postgres=# create table t (a int, b int); CREATE TABLE postgres=# insert into t values (1, 1); INSERT 0 1 postgres=# begin; BEGIN postgres=# update t set b = b * 2; UPDATE 1 [Terminal 2] postgres=# create foreign table ft (a int) server loopback options (table_name 'lbt'); CREATE FOREIGN TABLE postgres=# insert into ft values (1); INSERT 0 1 postgres=# select tableoid, ctid, * from ft; tableoid | ctid | a --+---+--- 25092 | (0,1) | 1 (1 row) postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a for update; [Terminal 1] postgres=# commit; COMMIT [Terminal 2] postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a for update; tableoid | ctid | a --++--- 0 | (4294967295,0) | 1 (1 row) Note that tableoid and ctid have been changed! I think the reason for that is because EvalPlanQualFetchRowMarks doesn't properly set tableoid and ctid for foreign tables, IIUC. I think this should be fixed. Please find attached a patch. The patch slightly relates to [1], so if it is reasonable, I'll update [1] on top of this. [1] https://commitfest.postgresql.org/action/patch_view?id=1386 Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 2947,2953 make_tuple_from_result_row(PGresult *res, tuple = heap_form_tuple(tupdesc, values, nulls); if (ctid) ! tuple->t_self = *ctid; /* Clean up */ MemoryContextReset(temp_context); --- 2947,2953 tuple = heap_form_tuple(tupdesc, values, nulls); if (ctid) ! tuple->t_self = tuple->t_data->t_ctid = *ctid; /* Clean up */ MemoryContextReset(temp_context); *** a/src/backend/executor/execMain.c --- b/src/backend/executor/execMain.c *** *** 795,800 InitPlan(QueryDesc *queryDesc, int eflags) --- 795,801 { PlanRowMark *rc = (PlanRowMark *) lfirst(l); Oid relid; + char relkind; Relation relation; ExecRowMark *erm; *** *** 817,826 InitPlan(QueryDesc *queryDesc, int eflags) --- 818,833 break; case ROW_MARK_COPY: /* there's no real table here ... */ + relkind = rt_fetch(rc->rti, rangeTable)->relkind; + if (relkind == RELKIND_FOREIGN_TABLE) + relid = getrelid(rc->rti, rangeTable); + else + relid = InvalidOid; relation = NULL; break; default: elog(ERROR, "unrecognized markType: %d", rc->markType); + relid = InvalidOid; relation = NULL; /* keep compiler quiet */ break; } *** *** 831,836 InitPlan(QueryDesc *queryDesc, int eflags) --- 838,844 erm = (ExecRowMark *) palloc(sizeof(ExecRowMark)); erm->relation = relation; + erm->relid = relid; erm->rti = rc->rti; erm->prti = rc->prti; erm->rowmarkId = rc->rowmarkId; *** *** 2318,2325 EvalPlanQualFetchRowMarks(EPQState *epqstate) /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); ! ItemPointerSetInvalid(&(tuple.t_self)); ! tuple.t_tableOid = InvalidOid; tuple.t_data = td; /* copy and store tuple */ --- 2326,2342 /* build a temporary HeapTuple control structure */ tuple.t_len = HeapTupleHeaderGetDatumLength(td); ! /* if relid is valid, rel is a foreign table; set system columns */ ! if (OidIsValid(erm->relid)) ! { ! tuple.t_self = td->t_ctid; ! tuple.t_tableOid = erm->relid; ! } ! else ! { ! ItemPointerSetInvalid(&(tuple.t_self)); ! tuple.t_tableOid = InvalidOid; ! } tuple.t_data = td; /* copy and store tuple */ *** a/src/backend/executor/nodeForeignscan.c --- b/src/backend/executor/nodeForeignscan.c *** *** 22,27 --- 22,28 */ #include "postgres.h" + #include "access/htup_details.h" #include "executor/executor.h" #include "executor/nodeForeignscan.h" #include "foreign/fdwapi.h" *** *** 53,65 ForeignNext(ForeignScanState *node) /* * If any system columns are requested, we have to force the tuple into * physical-tuple form to avoid "cannot extract system attribute from ! * virtual tuple" errors later. We also insert a valid value for ! * tableoid, which is the only actually-useful system column. */ if (plan->fsSystemCol && !TupIsNull(slot)) { HeapTuple tup = ExecMaterializeSlot(slot); tup->t_tableOid = RelationGetRelid(node->ss.ss_currentRelation); } --- 54,69 /* * If any system columns are requested, we have to force the tuple into * physical-tuple form to avoid "cannot extract system attribute from ! * virtual tuple" errors later. We also insert a valid value for TID ! * and tableoid, which are the only actually-useful system columns. */ if (plan->fsSystemCol && !TupIsNull(slot)) { HeapTuple