Re: [HACKERS] Triggers on foreign tables
Le dimanche 23 mars 2014 02:44:26 Noah Misch a écrit : On Tue, Mar 18, 2014 at 09:31:06AM +0100, Ronan Dunklau wrote: Le mardi 18 mars 2014 03:54:19 Kouhei Kaigai a écrit : (1) To acquire the old tuple for UPDATE/DELETE operations, the patch closely parallels our handling for INSTEAD OF triggers on views. It adds a wholerow resjunk attribute, from which it constructs a HeapTuple before calling a trigger function. This loses the system columns, an irrelevant consideration for views but meaningful for foreign tables. postgres_fdw maintains the ctid system column (t_self), but triggers will always see an invalid t_self for the old tuple. One way to fix this is to pass around the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax, tableoid, wholerow). That's fairly close to sufficient, but it doesn't cover t_ctid. Frankly, I would like to find a principled excuse to not worry about making foreign table system columns accessible from triggers. Supporting system columns dramatically affects the mechanism, and what trigger is likely to care? Unfortunately, that argument seems too weak. Does anyone have a cleaner idea for keeping track of the system column values or a stronger argument for not bothering? Regarding to the first suggestion, I think, it is better not to care about system columns on foreign tables, because it fully depends on driver's implementation whether FDW fetches ctid from its data source, or not. Usually, system columns of foreign table, except for tableoid, are nonsense. Because of implementation reason, postgres_fdw fetches ctid of remote tables on UPDATE / DELETE, it is not a common nature for all FDW drivers. For example, we can assume an implementation that uses primary key of remote table to identify the record to be updated or deleted. In this case, local ctid does not have meaningful value. So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid or other system columns. I agree, I think it is somewhat clunky to have postgres_fdw use a feature that is basically meaningless for other FDWs. Looking at some threads in this list, it confused many people. My own reasoning for accepting omission of system columns is more along the lines of Robert's argument. Regardless, three folks voting to do so and none against suffices for me. I documented the system columns limitation, made the returningList change I mentioned, and committed the patch. Thank you, I'm glad the patch found its way to the repository ! This is off-topic, but maybe we could devise an API allowing for local system attributes on foreign table. This would allow FDWs to carry attributes that weren't declared as part of the table definition. This could then be used for postgres_fdw ctid, as well as others foreign data wrappers equivalent of an implicit tuple id. We could, but I discourage it. System columns are a legacy feature; I doubt we'd choose that design afresh today. On the off-chance that you need the value of a remote system column, you can already declare an ordinary foreign table column for it. I raised the issue because it's inconsistent for RETURNING to convey system columns while tg_trigtuple/tg_newtuple do not, not because acquiring system columns from foreign tables is notably useful. The idea here was to allow an FDW author to add columns which are not part of the table definition, for example colums which are required to identify the tuple remotely. Without system columns, a postgres_fdw user would have to declare the ctid column on every table for a tuble to be identifiable. The proposal would allow postgres_fdw to automatically inject an hidden (system ?) column on every table for this ctid. -- Ronan Dunklau http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
On Tue, Mar 18, 2014 at 09:31:06AM +0100, Ronan Dunklau wrote: Le mardi 18 mars 2014 03:54:19 Kouhei Kaigai a écrit : (1) To acquire the old tuple for UPDATE/DELETE operations, the patch closely parallels our handling for INSTEAD OF triggers on views. It adds a wholerow resjunk attribute, from which it constructs a HeapTuple before calling a trigger function. This loses the system columns, an irrelevant consideration for views but meaningful for foreign tables. postgres_fdw maintains the ctid system column (t_self), but triggers will always see an invalid t_self for the old tuple. One way to fix this is to pass around the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax, tableoid, wholerow). That's fairly close to sufficient, but it doesn't cover t_ctid. Frankly, I would like to find a principled excuse to not worry about making foreign table system columns accessible from triggers. Supporting system columns dramatically affects the mechanism, and what trigger is likely to care? Unfortunately, that argument seems too weak. Does anyone have a cleaner idea for keeping track of the system column values or a stronger argument for not bothering? Regarding to the first suggestion, I think, it is better not to care about system columns on foreign tables, because it fully depends on driver's implementation whether FDW fetches ctid from its data source, or not. Usually, system columns of foreign table, except for tableoid, are nonsense. Because of implementation reason, postgres_fdw fetches ctid of remote tables on UPDATE / DELETE, it is not a common nature for all FDW drivers. For example, we can assume an implementation that uses primary key of remote table to identify the record to be updated or deleted. In this case, local ctid does not have meaningful value. So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid or other system columns. I agree, I think it is somewhat clunky to have postgres_fdw use a feature that is basically meaningless for other FDWs. Looking at some threads in this list, it confused many people. My own reasoning for accepting omission of system columns is more along the lines of Robert's argument. Regardless, three folks voting to do so and none against suffices for me. I documented the system columns limitation, made the returningList change I mentioned, and committed the patch. This is off-topic, but maybe we could devise an API allowing for local system attributes on foreign table. This would allow FDWs to carry attributes that weren't declared as part of the table definition. This could then be used for postgres_fdw ctid, as well as others foreign data wrappers equivalent of an implicit tuple id. We could, but I discourage it. System columns are a legacy feature; I doubt we'd choose that design afresh today. On the off-chance that you need the value of a remote system column, you can already declare an ordinary foreign table column for it. I raised the issue because it's inconsistent for RETURNING to convey system columns while tg_trigtuple/tg_newtuple do not, not because acquiring system columns from foreign tables is notably useful. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
Le mardi 18 mars 2014 03:54:19 Kouhei Kaigai a écrit : I hacked on this for awhile, but there remain two matters on which I'm uncertain about the right way forward. (1) To acquire the old tuple for UPDATE/DELETE operations, the patch closely parallels our handling for INSTEAD OF triggers on views. It adds a wholerow resjunk attribute, from which it constructs a HeapTuple before calling a trigger function. This loses the system columns, an irrelevant consideration for views but meaningful for foreign tables. postgres_fdw maintains the ctid system column (t_self), but triggers will always see an invalid t_self for the old tuple. One way to fix this is to pass around the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax, tableoid, wholerow). That's fairly close to sufficient, but it doesn't cover t_ctid. Frankly, I would like to find a principled excuse to not worry about making foreign table system columns accessible from triggers. Supporting system columns dramatically affects the mechanism, and what trigger is likely to care? Unfortunately, that argument seems too weak. Does anyone have a cleaner idea for keeping track of the system column values or a stronger argument for not bothering? Regarding to the first suggestion, I think, it is better not to care about system columns on foreign tables, because it fully depends on driver's implementation whether FDW fetches ctid from its data source, or not. Usually, system columns of foreign table, except for tableoid, are nonsense. Because of implementation reason, postgres_fdw fetches ctid of remote tables on UPDATE / DELETE, it is not a common nature for all FDW drivers. For example, we can assume an implementation that uses primary key of remote table to identify the record to be updated or deleted. In this case, local ctid does not have meaningful value. So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid or other system columns. I agree, I think it is somewhat clunky to have postgres_fdw use a feature that is basically meaningless for other FDWs. Looking at some threads in this list, it confused many people. This is off-topic, but maybe we could devise an API allowing for local system attributes on foreign table. This would allow FDWs to carry attributes that weren't declared as part of the table definition. This could then be used for postgres_fdw ctid, as well as others foreign data wrappers equivalent of an implicit tuple id. (2) When a foreign table has an AFTER ROW trigger, the FDW's ExecForeign{Insert,Update,Delete} callbacks must return a slot covering all columns. Current FDW API documentation says things like this: The data in the returned slot is used only if the INSERT query has a RETURNING clause. Hence, the FDW could choose to optimize away returning some or all columns depending on the contents of the RETURNING clause. Consistent with that, postgres_fdw inspects the returningLists of the ModifyTable node to decide which columns are necessary. This patch has rewriteTargetListIU() add a resjunk wholerow var to the returningList of any query that will fire an AFTER ROW trigger on a foreign table. That avoids the need to change the FDW API contract or any postgres_fdw code. I was pleased about that for a time, but on further review, I'm doubting that the benefit for writable FDWs justifies muddying the definition of returningList; until now, returningList has been free from resjunk TLEs. For example, callers of FetchStatementTargetList() may be unprepared to see an all-resjunk list, instead of NIL, for a data modification statement that returns nothing. If we do keep the patch's approach, I'm inclined to rename returningList. However, I more lean toward adding a separate flag to indicate the need to return a complete tuple regardless of the RETURNING list. The benefits of overloading returningList are all short-term benefits. We know that the FDW API is still converging, so changing it seems eventually-preferable to, and safer than, changing the name or meaning of returningList. Thoughts? On Thu, Mar 06, 2014 at 09:11:19AM +0100, Ronan Dunklau wrote: Le mercredi 5 mars 2014 22:36:44 Noah Misch a écrit : Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that? I don't see any problem with that. I don't know how this would be implemented, but
Re: [HACKERS] Triggers on foreign tables
On Mon, Mar 17, 2014 at 11:54 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: I hacked on this for awhile, but there remain two matters on which I'm uncertain about the right way forward. (1) To acquire the old tuple for UPDATE/DELETE operations, the patch closely parallels our handling for INSTEAD OF triggers on views. It adds a wholerow resjunk attribute, from which it constructs a HeapTuple before calling a trigger function. This loses the system columns, an irrelevant consideration for views but meaningful for foreign tables. postgres_fdw maintains the ctid system column (t_self), but triggers will always see an invalid t_self for the old tuple. One way to fix this is to pass around the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax, tableoid, wholerow). That's fairly close to sufficient, but it doesn't cover t_ctid. Frankly, I would like to find a principled excuse to not worry about making foreign table system columns accessible from triggers. Supporting system columns dramatically affects the mechanism, and what trigger is likely to care? Unfortunately, that argument seems too weak. Does anyone have a cleaner idea for keeping track of the system column values or a stronger argument for not bothering? Regarding to the first suggestion, I think, it is better not to care about system columns on foreign tables, because it fully depends on driver's implementation whether FDW fetches ctid from its data source, or not. Usually, system columns of foreign table, except for tableoid, are nonsense. Because of implementation reason, postgres_fdw fetches ctid of remote tables on UPDATE / DELETE, it is not a common nature for all FDW drivers. For example, we can assume an implementation that uses primary key of remote table to identify the record to be updated or deleted. In this case, local ctid does not have meaningful value. So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid or other system columns. I'm not sure I particularly agree with this reasoning - after all, just because some people might not find a feature useful isn't a reason not to have it. On the other hand, I don't think it's a very useful feature, and I don't feel like we have to have it. Most system columns can't be updated or indexed, and none of them can be dropped or renamed, so it's not like they aren't second-class citizens to some degree already. By way of comparison, the first version of the index-only scan patch gave up when it saw an expression index, instead of making an effort to figure out whether a matching expression was present in the query. Somebody could have looked at that patch and said, oh, well, that's an ugly and unacceptable limitation, and we ought to reject the patch until it's fixed. Well, instead, Tom committed the patch, and we still have that limitation today, and it's still something we really ought to fix some day, but in the meantime we have the feature. Obviously, the line between your patch is only part of a feature, please finish it and try again and your patch implements a nice self-contained feature and there are some more things that we could build on top of it later is to some extent a matter of judgement; but for what it's worth, I can't get too excited about this particular limitation of this particular patch. I just don't think that very many people are going to miss the functionality in question. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
I hacked on this for awhile, but there remain two matters on which I'm uncertain about the right way forward. (1) To acquire the old tuple for UPDATE/DELETE operations, the patch closely parallels our handling for INSTEAD OF triggers on views. It adds a wholerow resjunk attribute, from which it constructs a HeapTuple before calling a trigger function. This loses the system columns, an irrelevant consideration for views but meaningful for foreign tables. postgres_fdw maintains the ctid system column (t_self), but triggers will always see an invalid t_self for the old tuple. One way to fix this is to pass around the old tuple data as ROW(ctid, oid, xmin, cmin, xmax, cmax, tableoid, wholerow). That's fairly close to sufficient, but it doesn't cover t_ctid. Frankly, I would like to find a principled excuse to not worry about making foreign table system columns accessible from triggers. Supporting system columns dramatically affects the mechanism, and what trigger is likely to care? Unfortunately, that argument seems too weak. Does anyone have a cleaner idea for keeping track of the system column values or a stronger argument for not bothering? Regarding to the first suggestion, I think, it is better not to care about system columns on foreign tables, because it fully depends on driver's implementation whether FDW fetches ctid from its data source, or not. Usually, system columns of foreign table, except for tableoid, are nonsense. Because of implementation reason, postgres_fdw fetches ctid of remote tables on UPDATE / DELETE, it is not a common nature for all FDW drivers. For example, we can assume an implementation that uses primary key of remote table to identify the record to be updated or deleted. In this case, local ctid does not have meaningful value. So, fundamentally, we cannot guarantee FDW driver returns meaningful ctid or other system columns. (2) When a foreign table has an AFTER ROW trigger, the FDW's ExecForeign{Insert,Update,Delete} callbacks must return a slot covering all columns. Current FDW API documentation says things like this: The data in the returned slot is used only if the INSERT query has a RETURNING clause. Hence, the FDW could choose to optimize away returning some or all columns depending on the contents of the RETURNING clause. Consistent with that, postgres_fdw inspects the returningLists of the ModifyTable node to decide which columns are necessary. This patch has rewriteTargetListIU() add a resjunk wholerow var to the returningList of any query that will fire an AFTER ROW trigger on a foreign table. That avoids the need to change the FDW API contract or any postgres_fdw code. I was pleased about that for a time, but on further review, I'm doubting that the benefit for writable FDWs justifies muddying the definition of returningList; until now, returningList has been free from resjunk TLEs. For example, callers of FetchStatementTargetList() may be unprepared to see an all-resjunk list, instead of NIL, for a data modification statement that returns nothing. If we do keep the patch's approach, I'm inclined to rename returningList. However, I more lean toward adding a separate flag to indicate the need to return a complete tuple regardless of the RETURNING list. The benefits of overloading returningList are all short-term benefits. We know that the FDW API is still converging, so changing it seems eventually-preferable to, and safer than, changing the name or meaning of returningList. Thoughts? On Thu, Mar 06, 2014 at 09:11:19AM +0100, Ronan Dunklau wrote: Le mercredi 5 mars 2014 22:36:44 Noah Misch a écrit : Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that? I don't see any problem with that. I don't know how this would be implemented, but it would make sense to avoid those scans, as long as a fresh copy is passed to the trigger: modifications to a tuple performed in an after trigger should not be visible to the next one. Trigger functions are not permitted to modify tg_trigtuple or tg_newtuple; notice that, for non-foreign triggers, we pass shared_buffers-backed tuples in those fields. Therefore, no copy is necessary. I was again somewhat tempted to remove ate_tupleindex, perhaps by defining the four flag bits this way: #define AFTER_TRIGGER_DONE0x1000 #define AFTER_TRIGGER_IN_PROGRESS 0x2000 /* two bits describing the size of and
Re: [HACKERS] Triggers on foreign tables
Le mercredi 5 mars 2014 22:36:44 Noah Misch a écrit : Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that? I don't see any problem with that. I don't know how this would be implemented, but it would make sense to avoid those scans, as long as a fresh copy is passed to the trigger: modifications to a tuple performed in an after trigger should not be visible to the next one. I was again somewhat tempted to remove ate_tupleindex, perhaps by defining the four flag bits this way: #define AFTER_TRIGGER_DONE0x1000 #define AFTER_TRIGGER_IN_PROGRESS 0x2000 /* two bits describing the size of and tuple sources for this event */ #define AFTER_TRIGGER_TUP_BITS0xC000 #define AFTER_TRIGGER_FDW_REUSE 0x #define AFTER_TRIGGER_FDW_FETCH 0x4000 #define AFTER_TRIGGER_1CTID 0x8000 #define AFTER_TRIGGER_2CTID 0xC000 AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to the aforementioned scenarios one and two, respectively. I think, though, I'll rate this as needless micro-optimization and not bother; opinions welcome. (The savings is four bytes per foreign table trigger event.) I was already happy with having a lower footprint for foreign table trigger events than for regular trigger events, but if we remove the need for seeking in the tuplestore entirely, it would make sense to get rid of the index. Thanks, nm Thanks to you. -- Ronan Dunklau http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
This version looks basically good. I have some cosmetic things to sweep up before commit. One point is a bit more substantial: On Tue, Feb 04, 2014 at 01:16:22PM +0100, Ronan Dunklau wrote: Le lundi 3 février 2014 23:28:45 Noah Misch a écrit : On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't go away. Consider for example the case of a foreign table with more than one AFTER UPDATE triggers. Unless we store the tuples once for each trigger, we will have to rescan the tuplestore. Will we? Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? Let me explain what I had in mind. Looking at the code in AfterTriggerSaveEvent: - we build a template AfterTriggerEvent, and store the tuple(s) - for each suitable after trigger that matches the trigger type, as well as the WHEN condition if any, a copy of the previously built AfterTriggerEvent is queued Later, those events are fired in order. This means that more than one event can be fired for one tuple. Take this example: [snip] Thanks; that illuminated the facts I was missing. On a side note, this made me realize that it is better to avoid storing a tuple entirely if there is no enabled trigger (the f1 = 4 case above). The attached patch does that, so the previous sequence becomes: 0-0-2-2-4-6-6 It also prevents from initalizing a tuplestore at all if its not needed. That's a sensible improvement. To mitigate the effects of this behaviour, I added the option to perform a reverse_seek when the looked-up tuple is nearer from the current index than from the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. I think the only case when seeking is still needed is when there are more than one after trigger that need to be fired, since the abovementioned change prevents from seeking to skip tuples. Agreed. More specifically, I see only two scenarios for retrieving tuples from the tuplestore. Scenario one is that we need the next tuple (or pair of tuples, depending on the TriggerEvent). Scenario two is that we need the tuple(s) most recently retrieved. If that's correct, I'm inclined to rearrange afterTriggerInvokeEvents() and AfterTriggerExecute() to remember the tuple or pair of tuples most recently retrieved. They'll then never call tuplestore_advance() just to reposition. Do you see a problem with that? I was again somewhat tempted to remove ate_tupleindex, perhaps by defining the four flag bits this way: #define AFTER_TRIGGER_DONE 0x1000 #define AFTER_TRIGGER_IN_PROGRESS 0x2000 /* two bits describing the size of and tuple sources for this event */ #define AFTER_TRIGGER_TUP_BITS 0xC000 #define AFTER_TRIGGER_FDW_REUSE 0x #define AFTER_TRIGGER_FDW_FETCH 0x4000 #define AFTER_TRIGGER_1CTID 0x8000 #define AFTER_TRIGGER_2CTID 0xC000 AFTER_TRIGGER_FDW_FETCH and AFTER_TRIGGER_FDW_REUSE correspond to the aforementioned scenarios one and two, respectively. I think, though, I'll rate this as needless micro-optimization and not bother; opinions welcome. (The savings is four bytes per foreign table trigger event.) Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
I tried to check the latest (v8) patch again, then could not find problem in your design change from the v7. As Noah pointed out, it uses per query-depth tuplestore being released on AfterTriggerEndSubXact. So, may I mark it as ready for committer? 2014-03-03 15:48 GMT+09:00 Ronan Dunklau ronan.dunk...@dalibo.com: Hello. Did you have time to review the latest version of this patch ? Is there anything I can do to get this ready for commiter ? Thank you for all the work performed so far. Le mardi 4 février 2014 13:16:22 Ronan Dunklau a écrit : Le lundi 3 février 2014 23:28:45 Noah Misch a écrit : On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit : On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: What do you think about this approach ? Is there something I missed which would make it not sustainable ? Seems basically reasonable. I foresee multiple advantages from having one tuplestore per query level as opposed to one for the entire transaction. You would remove the performance trap of backing up the tuplestore by rescanning. It permits reclaiming memory and disk space in AfterTriggerEndQuery() rather than at end of transaction. You could remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two tuples from the tuplestore. Using work_mem per AfterTriggerBeginQuery() instead of per transaction is no problem. What do you think of that design change? I agree that this design is better, but I have some objections. We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't go away. Consider for example the case of a foreign table with more than one AFTER UPDATE triggers. Unless we store the tuples once for each trigger, we will have to rescan the tuplestore. Will we? Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? Let me explain what I had in mind. Looking at the code in AfterTriggerSaveEvent: - we build a template AfterTriggerEvent, and store the tuple(s) - for each suitable after trigger that matches the trigger type, as well as the WHEN condition if any, a copy of the previously built AfterTriggerEvent is queued Later, those events are fired in order. This means that more than one event can be fired for one tuple. Take this example: CREATE TRIGGER trig_row_after1 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 3) EXECUTE PROCEDURE trigger_func('TRIG1'); CREATE TRIGGER trig_row_after2 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 4) EXECUTE PROCEDURE trigger_func('TRIG2'); UPDATE rem2 set f2 = 'something'; Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvent's ate_tupleindex will be, in that order. Ass 0-0-2-2-4-8-8 So, at least a backward seek is required for trig_row_after2 to be able to retrieve a tuple that was already consumed when firing trig_row_after1. On a side note, this made me realize that it is better to avoid storing a tuple entirely if there is no enabled trigger (the f1 = 4 case above). The attached patch does that, so the previous sequence becomes: 0-0-2-2-4-6-6 It also prevents from initalizing a tuplestore at all if its not needed. To mitigate the effects of this behaviour, I added the option to perform a reverse_seek when the looked-up tuple is nearer from the current index than from the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. I think the only case when seeking is still needed is when there are more than one after trigger that need to be fired, since the abovementioned change prevents from seeking to skip tuples. If you do pursue that change, make sure the code still does the right thing when it drops queued entries during subxact abort. I don't really understand what should be done at that stage. Since triggers on foreign tables are not allowed to be deferred, everything should be cleaned up at the end of each query, right ? So, there shouldn't be any queued entries. I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(), please do so and ensure all its actions still make sense in the context of this new kind of trigger storage. You're right, I missed something here. When aborting a subxact, the tuplestores for queries below the subxact query depth should be cleaned, if any, because AfterTriggerEndQuery has not been called for the failing query. The attached patch fixes that. The attached
Re: [HACKERS] Triggers on foreign tables
On Mon, Mar 03, 2014 at 11:10:30PM +0900, Kohei KaiGai wrote: I tried to check the latest (v8) patch again, then could not find problem in your design change from the v7. As Noah pointed out, it uses per query-depth tuplestore being released on AfterTriggerEndSubXact. So, may I mark it as ready for committer? Yes. Re-reviewing this is next on my list. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
Hello. Did you have time to review the latest version of this patch ? Is there anything I can do to get this ready for commiter ? Thank you for all the work performed so far. Le mardi 4 février 2014 13:16:22 Ronan Dunklau a écrit : Le lundi 3 février 2014 23:28:45 Noah Misch a écrit : On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit : On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: What do you think about this approach ? Is there something I missed which would make it not sustainable ? Seems basically reasonable. I foresee multiple advantages from having one tuplestore per query level as opposed to one for the entire transaction. You would remove the performance trap of backing up the tuplestore by rescanning. It permits reclaiming memory and disk space in AfterTriggerEndQuery() rather than at end of transaction. You could remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two tuples from the tuplestore. Using work_mem per AfterTriggerBeginQuery() instead of per transaction is no problem. What do you think of that design change? I agree that this design is better, but I have some objections. We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't go away. Consider for example the case of a foreign table with more than one AFTER UPDATE triggers. Unless we store the tuples once for each trigger, we will have to rescan the tuplestore. Will we? Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? Let me explain what I had in mind. Looking at the code in AfterTriggerSaveEvent: - we build a template AfterTriggerEvent, and store the tuple(s) - for each suitable after trigger that matches the trigger type, as well as the WHEN condition if any, a copy of the previously built AfterTriggerEvent is queued Later, those events are fired in order. This means that more than one event can be fired for one tuple. Take this example: CREATE TRIGGER trig_row_after1 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 3) EXECUTE PROCEDURE trigger_func('TRIG1'); CREATE TRIGGER trig_row_after2 AFTER UPDATE ON rem2 FOR EACH ROW WHEN (NEW.f1 % 5 4) EXECUTE PROCEDURE trigger_func('TRIG2'); UPDATE rem2 set f2 = 'something'; Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvent's ate_tupleindex will be, in that order. Ass 0-0-2-2-4-8-8 So, at least a backward seek is required for trig_row_after2 to be able to retrieve a tuple that was already consumed when firing trig_row_after1. On a side note, this made me realize that it is better to avoid storing a tuple entirely if there is no enabled trigger (the f1 = 4 case above). The attached patch does that, so the previous sequence becomes: 0-0-2-2-4-6-6 It also prevents from initalizing a tuplestore at all if its not needed. To mitigate the effects of this behaviour, I added the option to perform a reverse_seek when the looked-up tuple is nearer from the current index than from the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. I think the only case when seeking is still needed is when there are more than one after trigger that need to be fired, since the abovementioned change prevents from seeking to skip tuples. If you do pursue that change, make sure the code still does the right thing when it drops queued entries during subxact abort. I don't really understand what should be done at that stage. Since triggers on foreign tables are not allowed to be deferred, everything should be cleaned up at the end of each query, right ? So, there shouldn't be any queued entries. I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(), please do so and ensure all its actions still make sense in the context of this new kind of trigger storage. You're right, I missed something here. When aborting a subxact, the tuplestores for queries below the subxact query depth should be cleaned, if any, because AfterTriggerEndQuery has not been called for the failing query. The attached patch fixes that. The attached patch checks this, and add documentation for this limitation. I'm not really sure about how to phrase that correctly in the error message and the documentation. One can store at most INT_MAX foreign tuples, which means that at most INT_MAX insert or delete or
Re: [HACKERS] Triggers on foreign tables
On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit : On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: What do you think about this approach ? Is there something I missed which would make it not sustainable ? Seems basically reasonable. I foresee multiple advantages from having one tuplestore per query level as opposed to one for the entire transaction. You would remove the performance trap of backing up the tuplestore by rescanning. It permits reclaiming memory and disk space in AfterTriggerEndQuery() rather than at end of transaction. You could remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two tuples from the tuplestore. Using work_mem per AfterTriggerBeginQuery() instead of per transaction is no problem. What do you think of that design change? I agree that this design is better, but I have some objections. We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't go away. Consider for example the case of a foreign table with more than one AFTER UPDATE triggers. Unless we store the tuples once for each trigger, we will have to rescan the tuplestore. Will we? Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? To mitigate the effects of this behaviour, I added the option to perform a reverse_seek when the looked-up tuple is nearer from the current index than from the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. If you do pursue that change, make sure the code still does the right thing when it drops queued entries during subxact abort. I don't really understand what should be done at that stage. Since triggers on foreign tables are not allowed to be deferred, everything should be cleaned up at the end of each query, right ? So, there shouldn't be any queued entries. I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(), please do so and ensure all its actions still make sense in the context of this new kind of trigger storage. The attached patch checks this, and add documentation for this limitation. I'm not really sure about how to phrase that correctly in the error message and the documentation. One can store at most INT_MAX foreign tuples, which means that at most INT_MAX insert or delete or half-updates can occur. By half-updates, I mean that for update two tuples are stored whereas in contrast to only one for insert and delete trigger. Besides, I don't know where this disclaimer should be in the documentation. Any advice here ? I wouldn't mention that limitation. Maybe it shouldn't be so prominent, but I still think a note somewhere couldn't hurt. Perhaps. There's not much documentation of such implementation upper limits, and there's no usage of INT_MAX outside of parts that discuss writing C code. I'm not much of a visionary when it comes to the documentation; I try to document new things with an amount of detail similar to old features. Should the use of work_mem be documented somewhere, too ? I wouldn't. Most uses of work_mem are undocumented, even relatively major ones like count(DISTINCT ...) and CTEs. So, while I'd generally favor a patch documenting all/most of the things that use work_mem, it would be odd to document one new consumer apart from the others. This is the performance trap I mentioned above; it brings potential O(n^2) complexity to certain AFTER trigger execution scenarios. What scenarios do you have in mind ? I fixed the problem when there are multiple triggers on a foreign table, do you have any other one ? I'm not aware of such a performance trap in your latest design. For what it's worth, I don't think multiple triggers were ever a problem. In the previous design, a problem arose if you had a scenario like this: Query level 1: queue one million events ... Repeat this section many times: Query level 2: queue one event Query level 3: queue one event Query level 3: execute events Query level 2: execute events -- had to advance through the 1M stored events ... Query level 1: execute events Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: - for after triggers, the whole queuing mechanism is bypassed for foreign tables. This is IMO acceptable, since foreign tables cannot have constraints or constraints triggers, and thus have not need for deferrable execution. This design avoids the need for storing and retrieving/identifying remote tuples until the query or transaction end. Whether an AFTER ROW trigger is deferred determines whether it runs at the end of the firing query or at the end of the firing query's transaction. In all cases, every BEFORE ROW trigger of a given query fires before any AFTER ROW trigger of the same query. SQL requires that. This proposal would give foreign table AFTER ROW triggers a novel firing time; let's not do that. I think the options going forward are either (a) design a way to queue foreign table AFTER ROW triggers such that we can get the old and/or new rows at the end of the query or (b) not support AFTER ROW triggers on foreign tables for the time being. I did not know this was mandated by the standard. The attached patch tries to solve this problem by allocating a tuplestore in the global afterTriggers structure. This tuplestore is used for the whole transaction, and uses only work_mem per transaction. Both old and new tuples are stored in this tuplestore. Some additional bookkeeping is done on the afterTriggers global structure, to keep track of the number of inserted tuples, and the current read pointer position. The tuples are identified by their order of insertion during the transaction. I think this could benefit from some support in the tuplestore API, by allowing arbitrary seek without the need to store more ReadPointers. I initially tried to keep track of them by allocating read pointers on the tuple store, but it turned out to be so expensive that I had to find another way (24bytes per stored tuple, which are not reclaimable until the end of the transaction). What do you think about this approach ? Is there something I missed which would make it not sustainable ? Seems basically reasonable. I foresee multiple advantages from having one tuplestore per query level as opposed to one for the entire transaction. You would remove the performance trap of backing up the tuplestore by rescanning. It permits reclaiming memory and disk space in AfterTriggerEndQuery() rather than at end of transaction. You could remove ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or the next two tuples from the tuplestore. Using work_mem per AfterTriggerBeginQuery() instead of per transaction is no problem. What do you think of that design change? If you do pursue that change, make sure the code still does the right thing when it drops queued entries during subxact abort. I do not have access to the standard specification, any advice regarding specs compliance would be welcomed. http://wiki.postgresql.org/wiki/Developer_FAQ#Where_can_I_get_a_copy_of_the_SQL_standards.3F There is still one small issue with the attached patch: modifications to the tuple performed by the foreign data wrapper (via the returned TupleTableSlot in ExecForeignUpdate and ExecForeignInsert hooks) are not visible to the AFTER trigger. This could be fixed by merging the planslot containing the resjunk columns with the returned slot before calling the trigger, but I'm not really sure how to safely perform that. Any advice ? Currently, FDWs are permitted to skip returning columns not actually referenced by any RETURNING clause. I would change that part of the API contract to require returning all columns when an AFTER ROW trigger is involved. You can't get around doing that by merging old column values, because, among other reasons, an INSERT does not have those values at all. I'm not sure this should be part of the API contract: it would make implementing a FDW more complicated than it is now. The attached patch hooks on rewriteTargetListIU to add the missing targets to the returning clause, when needed. You're effectively faking the presence of a RETURNING list so today's conforming FDWs will already do the right thing? Clever. Please don't change unrelated whitespace. Please use pgindent to fix the formatting of your new code. It's fine to introduce occasional whitespace errors, but they're unusually-plentiful here. I think its done now. One problem I have with running pgindent is that I accidentally add chunks that were modified only by pgindent. Yep; that is a pain. Note that pgindent can't fix many unrelated whitespace changes. For example, if you add or remove a blank line, pgindent won't interfere. (We would not want it to interfere, because the use of blank lines is up to the code author.) You will still need to read through your
Re: [HACKERS] Triggers on foreign tables
Hello, I initially tried to keep track of them by allocating read pointers on the tuple store, but it turned out to be so expensive that I had to find another way (24bytes per stored tuple, which are not reclaimable until the end of the transaction). What do you think about this approach ? Is there something I missed which would make it not sustainable ? It seems to me reasonable approach to track them. Just a corner case, it may cause an unexpected problem if someone tried to update a foreign table with 2^31 of tuples because of int index. It may make sense to put a check fdw_nextwrite is less than INT_MAX. :-) I have a few other minor comments: +static HeapTuple +ExtractOldTuple(TupleTableSlot *planSlot, + ResultRelInfo *relinfo) +{ + boolisNull; + JunkFilter *junkfilter = relinfo-ri_junkFilter; + HeapTuple oldtuple = palloc0(sizeof(HeapTupleData)); + HeapTupleHeader td; + Datum datum = ExecGetJunkAttribute(planSlot, +junkfilter-jf_junkAttNo, +isNull); + + /* shouldn't ever get a null result... */ + if (isNull) + elog(ERROR, wholerow is NULL); + td = DatumGetHeapTupleHeader(datum); + (*oldtuple).t_len = HeapTupleHeaderGetDatumLength(td); + (*oldtuple).t_data = td; + return oldtuple; +} + Why not usual coding manner as: oldtuple-t_len = HeapTupleHeaderGetDatumLength(td); oldtuple-t_data = td; Also, it don't put tableOid on the tuple. oldtuple-t_tableOid = RelationGetRelid(relinfo-ri_RelationDesc); @@ -730,6 +738,45 @@ rewriteTargetListIU(Query *parsetree, Relation target_relation, + /* +* For foreign tables, build a similar array for returning tlist. +*/ + if (need_full_returning_tlist) + { + returning_tles = (TargetEntry **) palloc0(numattrs * sizeof(TargetEntry *)); + foreach(temp, parsetree-returningList) + { + TargetEntry *old_rtle = (TargetEntry *) lfirst(temp); + + if (IsA(old_rtle-expr, Var)) + { + Var*var = (Var *) old_rtle-expr; + + if (var-varno == parsetree-resultRelation) + { + attrno = var-varattno; + if (attrno 1 || attrno numattrs) + elog(ERROR, bogus resno %d in targetlist, attrno); This checks caused an error when returning list contains a reference to system column; that has negative attribute number. Probably, it should be continue;, instead of elog(). BTW, isn't it sufficient to inhibit optimization by putting whole-row-reference here, rather than whole-row-reference. Postgres_fdw extracts whole-row-reference into individual columns reference. + att_tup = target_relation-rd_att-attrs[attrno - 1]; + /* We can (and must) ignore deleted attributes */ + if (att_tup-attisdropped) + continue; + returning_tles[attrno - 1] = old_rtle; + } + } + } + } + Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com -Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ronan Dunklau Sent: Thursday, January 23, 2014 11:18 PM To: Noah Misch Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Triggers on foreign tables Thank you very much for this review. The need here is awfully similar to a need of INSTEAD OF triggers on views. For them, we add a single wholerow resjunk TLE. Is there a good reason to do it differently for triggers on foreign tables? I wasn't aware of that, it makes for some much cleaner code IMO. Thanks. - for after triggers, the whole queuing mechanism is bypassed for foreign tables. This is IMO acceptable, since foreign tables cannot have constraints or constraints triggers, and thus have not need for deferrable execution. This design avoids the need for storing and retrieving/identifying remote tuples until the query or transaction end. Whether an AFTER ROW trigger is deferred determines whether it runs at the end of the firing query or at the end of the firing query's transaction. In all cases, every BEFORE ROW trigger of a given query fires before any AFTER ROW trigger of the same query. SQL requires that. This proposal would give foreign table AFTER ROW triggers a novel firing time; let's not do that. I think the options going forward are either (a) design a way to queue foreign table AFTER ROW triggers such that we can get the old and/or new rows at the end of the query or (b) not support AFTER ROW triggers on foreign tables for the time being. I did not know this was mandated by the standard. The attached patch tries to solve this problem by allocating a tuplestore in the global afterTriggers structure. This tuplestore is used
Re: [HACKERS] Triggers on foreign tables
On Tue, Jan 07, 2014 at 12:11:28PM +0100, Ronan Dunklau wrote: Since the last discussion about it (http://www.postgresql.org/message-id/cadyhksugp6ojb1pybtimop3s5fg_yokguto-7rbcltnvaj5...@mail.gmail.com), I finally managed to implement row triggers for foreign tables. For statement-level triggers, nothing has changed since the last patch. Statement-level triggers are just allowed in the command checking. For row-level triggers, it works as follows: - during rewrite phase, every attribute of the foreign table is duplicated as a resjunk entry if a trigger is defined on the relation. These entries are then used to store the old values for a tuple. There is room for improvement here: we could check if any trigger is in fact a row-level trigger The need here is awfully similar to a need of INSTEAD OF triggers on views. For them, we add a single wholerow resjunk TLE. Is there a good reason to do it differently for triggers on foreign tables? - for after triggers, the whole queuing mechanism is bypassed for foreign tables. This is IMO acceptable, since foreign tables cannot have constraints or constraints triggers, and thus have not need for deferrable execution. This design avoids the need for storing and retrieving/identifying remote tuples until the query or transaction end. Whether an AFTER ROW trigger is deferred determines whether it runs at the end of the firing query or at the end of the firing query's transaction. In all cases, every BEFORE ROW trigger of a given query fires before any AFTER ROW trigger of the same query. SQL requires that. This proposal would give foreign table AFTER ROW triggers a novel firing time; let's not do that. I think the options going forward are either (a) design a way to queue foreign table AFTER ROW triggers such that we can get the old and/or new rows at the end of the query or (b) not support AFTER ROW triggers on foreign tables for the time being. It's not clear to me whether SQL/MED contemplates triggers on foreign tables. Its drop basic column definition General Rules do mention the possibility of a reference from a trigger column list. On the other hand, I see nothing overriding the fact that CREATE TRIGGER only targets base tables. Is this clearer to anyone else? (This is a minor point, mainly bearing on the Compatibility section of the CREATE TRIGGER documentation.) - the duplicated resjunk attributes are identified by being: - marked as resjunk in the targetlist - not being system or whole-row attributes (varno 0) There is still one small issue with the attached patch: modifications to the tuple performed by the foreign data wrapper (via the returned TupleTableSlot in ExecForeignUpdate and ExecForeignInsert hooks) are not visible to the AFTER trigger. This could be fixed by merging the planslot containing the resjunk columns with the returned slot before calling the trigger, but I'm not really sure how to safely perform that. Any advice ? Currently, FDWs are permitted to skip returning columns not actually referenced by any RETURNING clause. I would change that part of the API contract to require returning all columns when an AFTER ROW trigger is involved. You can't get around doing that by merging old column values, because, among other reasons, an INSERT does not have those values at all. On Tue, Jan 07, 2014 at 05:31:10PM +0100, Ronan Dunklau wrote: --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out +NOTICE: TG_NAME: trig_row_after +NOTICE: TG_WHEN: AFTER +NOTICE: TG_LEVEL: ROW +NOTICE: TG_OP: DELETE +NOTICE: TG_RELID::regclass: rem1 +NOTICE: TG_RELNAME: rem1 +NOTICE: TG_TABLE_NAME: rem1 +NOTICE: TG_TABLE_SCHEMA: public +NOTICE: TG_NARGS: 2 +NOTICE: TG_ARGV: [23, skidoo] +NOTICE: OLD: (2,bye) +NOTICE: TG_NAME: trig_row_before +NOTICE: TG_WHEN: BEFORE +NOTICE: TG_LEVEL: ROW +NOTICE: TG_OP: DELETE +NOTICE: TG_RELID::regclass: rem1 +NOTICE: TG_RELNAME: rem1 +NOTICE: TG_TABLE_NAME: rem1 +NOTICE: TG_TABLE_SCHEMA: public +NOTICE: TG_NARGS: 2 +NOTICE: TG_ARGV: [23, skidoo] +NOTICE: OLD: (11,bye remote) +NOTICE: TG_NAME: trig_row_after +NOTICE: TG_WHEN: AFTER +NOTICE: TG_LEVEL: ROW +NOTICE: TG_OP: DELETE +NOTICE: TG_RELID::regclass: rem1 +NOTICE: TG_RELNAME: rem1 +NOTICE: TG_TABLE_NAME: rem1 +NOTICE: TG_TABLE_SCHEMA: public +NOTICE: TG_NARGS: 2 +NOTICE: TG_ARGV: [23, skidoo] +NOTICE: OLD: (11,bye remote) +insert into rem1 values(1,'insert'); Would you trim the verbosity a bit? Maybe merge several of the TG_ fields onto one line, and remove the low-importance ones. Perhaps issue one line like this in place of all the current TG_ lines: NOTICE: trig_row_after(23, skidoo) AFTER ROW DELETE ON rem1 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -384,3 +384,188 @@ insert into loc1(f2) values('bye'); insert into rem1(f2) values('bye
[HACKERS] Triggers on foreign tables
Hello. Since the last discussion about it (http://www.postgresql.org/message-id/cadyhksugp6ojb1pybtimop3s5fg_yokguto-7rbcltnvaj5...@mail.gmail.com), I finally managed to implement row triggers for foreign tables. The attached patch does not contain regression tests or documentation yet, a revised patch will include those sometime during the week. I'll add it to the commitfest then. A simple test scenario using postgres_fdw is however included as an attachment. The attached patch implements triggers for foreign tables according to the design discussed in the link above. For statement-level triggers, nothing has changed since the last patch. Statement-level triggers are just allowed in the command checking. For row-level triggers, it works as follows: - during rewrite phase, every attribute of the foreign table is duplicated as a resjunk entry if a trigger is defined on the relation. These entries are then used to store the old values for a tuple. There is room for improvement here: we could check if any trigger is in fact a row-level trigger - during execution phase, the before triggers are fired exactly like triggers on regular tables, except that old tuples are not fetched using their ctid, but rebuilt using the previously-stored resjunk attributes. - for after triggers, the whole queuing mechanism is bypassed for foreign tables. This is IMO acceptable, since foreign tables cannot have constraints or constraints triggers, and thus have not need for deferrable execution. This design avoids the need for storing and retrieving/identifying remote tuples until the query or transaction end. - the duplicated resjunk attributes are identified by being: - marked as resjunk in the targetlist - not being system or whole-row attributes (varno 0) There is still one small issue with the attached patch: modifications to the tuple performed by the foreign data wrapper (via the returned TupleTableSlot in ExecForeignUpdate and ExecForeignInsert hooks) are not visible to the AFTER trigger. This could be fixed by merging the planslot containing the resjunk columns with the returned slot before calling the trigger, but I'm not really sure how to safely perform that. Any advice ? Many thanks to Kohei Kaigai for taking the time to help with the design. -- Ronan Dunklau http://dalibo.com - http://dalibo.org test_with_postgres_fdw.sql Description: application/sql diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index b9cd88d..a509595 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -3150,13 +3150,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, /* No command-specific prep needed */ break; case AT_EnableTrig: /* ENABLE TRIGGER variants */ - case AT_EnableAlwaysTrig: - case AT_EnableReplicaTrig: case AT_EnableTrigAll: case AT_EnableTrigUser: case AT_DisableTrig: /* DISABLE TRIGGER variants */ case AT_DisableTrigAll: case AT_DisableTrigUser: + ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE); + pass = AT_PASS_MISC; + break; + case AT_EnableReplicaTrig: + case AT_EnableAlwaysTrig: case AT_EnableRule: /* ENABLE/DISABLE RULE variants */ case AT_EnableAlwaysRule: case AT_EnableReplicaRule: diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index 4eff184..12870eb 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -75,6 +75,14 @@ static HeapTuple GetTupleForTrigger(EState *estate, ItemPointer tid, LockTupleMode lockmode, TupleTableSlot **newSlot); + +static HeapTuple ExtractOldTuple(TupleTableSlot * mixedtupleslot, + ResultRelInfo * relinfo); +static void ExecARForeignTrigger(EState * estate, +TriggerData LocTriggerData, +ResultRelInfo *relinfo, +int triggerEvent, int triggerType); + static bool TriggerEnabled(EState *estate, ResultRelInfo *relinfo, Trigger *trigger, TriggerEvent event, Bitmapset *modifiedCols, @@ -184,12 +192,22 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString, RelationGetRelationName(rel)), errdetail(Views cannot have TRUNCATE triggers.))); } + else if (rel-rd_rel-relkind == RELKIND_FOREIGN_TABLE) + { + if(stmt-isconstraint) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg(\%s\ is a foreign table, + RelationGetRelationName(rel)), + errdetail(Foreign Tables cannot have constraint triggers.))); + } else + { ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg(\%s\ is not a table or view, RelationGetRelationName(rel; - + } if (!allowSystemTableMods IsSystemRelation(rel)) ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), @@ -1062,10 +1080,11 @@ RemoveTriggerById(Oid trigOid) rel = heap_open(relid, AccessExclusiveLock); if (rel-rd_rel-relkind != RELKIND_RELATION - rel-rd_rel-relkind != RELKIND_VIEW) + rel-rd_rel-relkind !=
Re: [HACKERS] Triggers on foreign tables
Le mardi 15 octobre 2013 09:47:31 Robert Haas a écrit : On Mon, Oct 14, 2013 at 5:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: And, I also want some comments from committers, not only from mine. +1 +1 /me pokes head up. I know I'm going to annoy people with this comment, but I feel like it's going to have to be made at some point by somebody, so here goes: I don't see the point of this feature. If you want a trigger on a table, why not set it on the remote side? A trigger on the foreign table won't be enforced consistently; it'll only work when the update is routed through the foreign table, not when people access the underlying table on the remote side through any other mechanism. The number of useful things you can do this way seems fairly small. Perhaps you could use a row-level trigger for RLS, to allow only certain rows on the foreign side to be updated, but even that seems like a slightly strange design: generally it'd be better to enforce the security as close to the target object as possible. There's another issue that concerns me here also: performance. IIUC, an update of N tuples on the remote side currently requires N+1 server round-trips. That is unspeakably awful, and we really oughta be looking for ways to make that number go down, by pushing the whole update to the remote side. But obviously that won't be possible if there's a per-row trigger that has to be evaluated on the local side. Now, assuming somebody comes up with code that implements that optimization, we can just disable it when there are local-side triggers. But, then you're back to having terrible performance. So even if the use case for this seemed really broad, I tend to think performance concerns would sink most of the possible real-world uses. I could, of course, be all wet Even if the row-level trigger functionality is controversial, in terms of provided features VS performance, the original patch I submitted enables statement-level triggers. Although my goal was to implement row-level triggers and I hit a wall pretty quickly, would it make sense to have statement-level triggers without their row counterparts ? I also think that pushing the whole update to the remote side will not be possible in all cases, and like Kohei wrote, it may be an acceptable loss to not be able to push it when a trigger is present. If we want to push the whole update, we will have to cope with local joins and function evaluations in all cases. IMO, triggers are just a special case of these limiting factors. signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
Sorry, I might call it something like primary key, instead of 'tupleid'. Apart from whether we can uniquely identify a particular remote record with an attribute, what FDW needs here is something to identify a remote record. So, we were talking about same concept with different names. Ah, that makes sense: I was understanding tupleid as a synonym for ctid. Does the incomplete tuple mean a tuple image but some of columns are replaced with NULL due to optimization, as postgres_fdw is doing, doesn't it? If so, a solution is to enforce FDW driver to fetch all the columns if its managing foreign table has row level triggers for update / delete. What I meant is that a DELETE statement, for example, does not build a scan- stage reltargetlist consisting of the whole set of attributes: the only attributes that are fetched are the ones built by addForeignUpdateTargets, and whatever additional attributes are involved in the DELETE statement (in the WHERE clause, for example). DELETE statement indeed does not (need to) construct a complete tuple image on scan stage, however, it does not prohibit FDW driver to keep the latest record being fetched from remote server. FDW driver easily knows whether relation has row-level trigger preliminary, so here is no matter to keep a complete image internally if needed. Or, it is perhaps available to add additional target-entries that is needed to construct a complete tuple using AddForeignUpdateTargets() callback. I think that the additional target-entries should be added in core, because that would require additional work from FDW drivers, and the code would be duplicated amongst all FDW drivers that wish to support triggers. I like this idea, but I don't really see all the implications of such a design. Where would this tuple be stored ? From what I understand, after-triggers are queued for later execution, using the old/new tupleid for later retrieval (in the AfterTriggerEventData struct). This would need a major change to work with foreign tables. Would that involve a special path for executing those triggers ASAP ? Ops, I oversight here. AfterTriggerSaveEvent() indeed saves only ctid of tuples in regular tables, because it can re-construct a complete tuple image from a particular ctid any time. On the other hand, it is not available on foreign table due to its nature. All we can do seems to me, probably, to save copy of before/after tuple image on local memory, even though it consumes much memory than regular tables. Or, as suggested above, add a special code path for foreign tables which would execute the trigger as soon as possible instead of queuing it. The feature we need here might become a bit clear little by little. A complete tuple image shall be fetched on the scan stage, if foreign table has row-level trigger. The planSlot may make sense if it contains (junk) attributes that is sufficient to re-construct an old tuple image. Target-list of DELETE statement contains a junk ctid attribute. Here is no reason why we cannot add junk attributes that reference each regular attribute, isn't it? Also, target-list of UPDATE statement contains new tuple image, however, it is available to add junk attributes that just reference old value, as a carrier of old values from scan stage to modify stage. I want core PostgreSQL to support a part of these jobs. For example, it may be able to add junk attribute to re-construct old tuple image on rewriteTargetListUD(), if target relation has row-level triggers. Also, it may be able to extract these old values from planSlot to construct an old tuple image on GetTupleForTrigger(), if relation is foreign table. So, if I understand you, the target list would be built as follow: - core builds the target list, including every attribute - this target list is updated by the FDW to add any junk attributes it wishes to use - in the case of an UPDATE, core duplicates the whole list of attributes (including the added junk attributes), as another set of junk attributes. Maybe we could duplicate only the updated attributes. Unfortunately, I have no special idea to handle old/new remote tuple on AfterTriggerSaveEvent(), except for keeping it as copy, but it is straightforward. Maybe avoid it altogether in the case of a trigger on a remote foreign table ? And, I also want some comments from committers, not only from mine. +1 +1 Thanks, signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
On Tue, Oct 15, 2013 at 4:17 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: One reason we should support local triggers is that not all the data source of FDW support remote trigger. It required FDW drivers to have RDBMS as its backend, but no realistic assumption. For example, file_fdw is unavailable to implement remote triggers. True, but gosh, updates via file_fdw are gonna be so slow I can't believe anybody'd use it for something real One thing I'd like to know is, where is the goal of FDW feature. It seems to me, FDW goes into a feature to manage external data set as if regular tables. If it is right understanding, things we try to support on foreign table is things we're supporting on regular tables, such as triggers. I generally agree with that. We often have some case that we cannot apply fully optimized path because of some reasons, like view has security-barrier, qualifier contained volatile functions, and so on... Trigger may be a factor to prevent fully optimized path, however, it depends on the situation which one shall be prioritized; performance or functionality. Sure. I mean, I guess if there are enough people that want this, I suppose I ought not stand in the way. It just seems like a lot of work for a feature of very marginal utility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
2013/10/16 Robert Haas robertmh...@gmail.com: On Tue, Oct 15, 2013 at 4:17 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: One reason we should support local triggers is that not all the data source of FDW support remote trigger. It required FDW drivers to have RDBMS as its backend, but no realistic assumption. For example, file_fdw is unavailable to implement remote triggers. True, but gosh, updates via file_fdw are gonna be so slow I can't believe anybody'd use it for something real How about another example? I have implemented a column-oriented data store with read/write capability, using FDW APIs. The role of FDW driver here is to translate its data format between column-oriented and row-oriented, but no trigger capability itself. One thing I'd like to know is, where is the goal of FDW feature. It seems to me, FDW goes into a feature to manage external data set as if regular tables. If it is right understanding, things we try to support on foreign table is things we're supporting on regular tables, such as triggers. I generally agree with that. We often have some case that we cannot apply fully optimized path because of some reasons, like view has security-barrier, qualifier contained volatile functions, and so on... Trigger may be a factor to prevent fully optimized path, however, it depends on the situation which one shall be prioritized; performance or functionality. Sure. I mean, I guess if there are enough people that want this, I suppose I ought not stand in the way. It just seems like a lot of work for a feature of very marginal utility. The reason why I'm interested in this feature is, row-level triggers on foreign tables will become a piece to implement table partitioning that contains multiple foreign tables. Probably, it also connects to the effort of custom-plan node (in the future) that enables to replace Append node by custom logic, to kick multiple concurrent remote queries on behalf of partitioned foreign table. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Wed, Oct 16, 2013 at 2:20 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: True, but gosh, updates via file_fdw are gonna be so slow I can't believe anybody'd use it for something real How about another example? I have implemented a column-oriented data store with read/write capability, using FDW APIs. The role of FDW driver here is to translate its data format between column-oriented and row-oriented, but no trigger capability itself. OK, that's a believable example. Call me convinced. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Mon, Oct 14, 2013 at 5:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: And, I also want some comments from committers, not only from mine. +1 +1 /me pokes head up. I know I'm going to annoy people with this comment, but I feel like it's going to have to be made at some point by somebody, so here goes: I don't see the point of this feature. If you want a trigger on a table, why not set it on the remote side? A trigger on the foreign table won't be enforced consistently; it'll only work when the update is routed through the foreign table, not when people access the underlying table on the remote side through any other mechanism. The number of useful things you can do this way seems fairly small. Perhaps you could use a row-level trigger for RLS, to allow only certain rows on the foreign side to be updated, but even that seems like a slightly strange design: generally it'd be better to enforce the security as close to the target object as possible. There's another issue that concerns me here also: performance. IIUC, an update of N tuples on the remote side currently requires N+1 server round-trips. That is unspeakably awful, and we really oughta be looking for ways to make that number go down, by pushing the whole update to the remote side. But obviously that won't be possible if there's a per-row trigger that has to be evaluated on the local side. Now, assuming somebody comes up with code that implements that optimization, we can just disable it when there are local-side triggers. But, then you're back to having terrible performance. So even if the use case for this seemed really broad, I tend to think performance concerns would sink most of the possible real-world uses. I could, of course, be all wet -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
Robert, * Robert Haas (robertmh...@gmail.com) wrote: /me pokes head up. I know I'm going to annoy people with this comment, but I feel like it's going to have to be made at some point Perhaps some folks will be annoyed- I'm not annoyed, but I don't really agree. :) by somebody, so here goes: I don't see the point of this feature. If you want a trigger on a table, why not set it on the remote side? A trigger on the foreign table won't be enforced consistently; it'll only work when the update is routed through the foreign table, not when people access the underlying table on the remote side through any other mechanism. The number of useful things you can do this way seems fairly small. Perhaps you could use a row-level trigger for RLS, to allow only certain rows on the foreign side to be updated, but even that seems like a slightly strange design: generally it'd be better to enforce the security as close to the target object as possible. I can certainly see use-cases for this, a very simple one being a way to keep track of what's been updated/inserted/whatever through this particular foreign table (essentially, an auditing table). The *remote* side might not be ideal for tracking that information and you might want the info locally and remotely anyway. There's another issue that concerns me here also: performance. IIUC, an update of N tuples on the remote side currently requires N+1 server round-trips. That is unspeakably awful, and we really oughta be looking for ways to make that number go down, by pushing the whole update to the remote side. But obviously that won't be possible if there's a per-row trigger that has to be evaluated on the local side. Now, assuming somebody comes up with code that implements that optimization, we can just disable it when there are local-side triggers. But, then you're back to having terrible performance. So even if the use case for this seemed really broad, I tend to think performance concerns would sink most of the possible real-world uses. Performance, while a concern, should probably be secondary when there are valid use-cases for this where the performance wouldn't be a problem for users. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Triggers on foreign tables
2013/10/15 Robert Haas robertmh...@gmail.com: On Mon, Oct 14, 2013 at 5:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: And, I also want some comments from committers, not only from mine. +1 +1 /me pokes head up. I know I'm going to annoy people with this comment, but I feel like it's going to have to be made at some point by somebody, so here goes: I don't see the point of this feature. If you want a trigger on a table, why not set it on the remote side? A trigger on the foreign table won't be enforced consistently; it'll only work when the update is routed through the foreign table, not when people access the underlying table on the remote side through any other mechanism. The number of useful things you can do this way seems fairly small. Perhaps you could use a row-level trigger for RLS, to allow only certain rows on the foreign side to be updated, but even that seems like a slightly strange design: generally it'd be better to enforce the security as close to the target object as possible. One reason we should support local triggers is that not all the data source of FDW support remote trigger. It required FDW drivers to have RDBMS as its backend, but no realistic assumption. For example, file_fdw is unavailable to implement remote triggers. One thing I'd like to know is, where is the goal of FDW feature. It seems to me, FDW goes into a feature to manage external data set as if regular tables. If it is right understanding, things we try to support on foreign table is things we're supporting on regular tables, such as triggers. There's another issue that concerns me here also: performance. IIUC, an update of N tuples on the remote side currently requires N+1 server round-trips. That is unspeakably awful, and we really oughta be looking for ways to make that number go down, by pushing the whole update to the remote side. But obviously that won't be possible if there's a per-row trigger that has to be evaluated on the local side. Now, assuming somebody comes up with code that implements that optimization, we can just disable it when there are local-side triggers. But, then you're back to having terrible performance. So even if the use case for this seemed really broad, I tend to think performance concerns would sink most of the possible real-world uses. We often have some case that we cannot apply fully optimized path because of some reasons, like view has security-barrier, qualifier contained volatile functions, and so on... Trigger may be a factor to prevent fully optimized path, however, it depends on the situation which one shall be prioritized; performance or functionality. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
2013/10/13 Ronan Dunklau rdunk...@gmail.com: Le samedi 12 octobre 2013 07:30:35 Kohei KaiGai a écrit : 2013/10/10 Ronan Dunklau rdunk...@gmail.com: Sorry, I'm uncertain the point above. Are you saying FDW driver may be able to handle well a case when a remote tuple to be updated is different from a remote tuple being fetched on the first stage? Or, am I misunderstanding? I was not clear in the point above: what I meant was that, from my understanding, the FDW driver has no obligation to use the system-attribute tupleid to identify the remote tuple. IIRC, one of the early API proposal involved had the tupleid passed as an argument to the ExecForeign* hooks. Now, these hooks receive the whole planslot instead. This slot can store additional resjunks attributes (thanks to the AddForeignUpdateTargets) which shouldn't be altered during the execution and are carried on until modify stage. So, these additional attributes can be used as identifiers instead of the tupleid. In fact, this is the approach I implemented for the multicorn fdw, and I believe it to be used in other FDWs as well (HadoopFDW does that, if I understand it correctly). So, it doesn't make sense to me to assume that the tupleid will be filled as valid remote identifier in the context of triggers. Sorry, I might call it something like primary key, instead of 'tupleid'. Apart from whether we can uniquely identify a particular remote record with an attribute, what FDW needs here is something to identify a remote record. So, we were talking about same concept with different names. Does the incomplete tuple mean a tuple image but some of columns are replaced with NULL due to optimization, as postgres_fdw is doing, doesn't it? If so, a solution is to enforce FDW driver to fetch all the columns if its managing foreign table has row level triggers for update / delete. What I meant is that a DELETE statement, for example, does not build a scan- stage reltargetlist consisting of the whole set of attributes: the only attributes that are fetched are the ones built by addForeignUpdateTargets, and whatever additional attributes are involved in the DELETE statement (in the WHERE clause, for example). DELETE statement indeed does not (need to) construct a complete tuple image on scan stage, however, it does not prohibit FDW driver to keep the latest record being fetched from remote server. FDW driver easily knows whether relation has row-level trigger preliminary, so here is no matter to keep a complete image internally if needed. Or, it is perhaps available to add additional target-entries that is needed to construct a complete tuple using AddForeignUpdateTargets() callback. As I noted above, 2nd round trip is not a suitable design to support after-row trigger. Likely, GetTupleForTrigger() hook shall perform to return a cached older image being fetched on the first round of remote scan. So, I guess every FDW driver will have similar implementation to handle these the situation, thus it makes sense that PostgreSQL core has a feature to support this handling; that keeps a tuple being fetched last and returns older image for row-level triggers. How about your opinion? I like this idea, but I don't really see all the implications of such a design. Where would this tuple be stored ? From what I understand, after-triggers are queued for later execution, using the old/new tupleid for later retrieval (in the AfterTriggerEventData struct). This would need a major change to work with foreign tables. Would that involve a special path for executing those triggers ASAP ? Ops, I oversight here. AfterTriggerSaveEvent() indeed saves only ctid of tuples in regular tables, because it can re-construct a complete tuple image from a particular ctid any time. On the other hand, it is not available on foreign table due to its nature. All we can do seems to me, probably, to save copy of before/after tuple image on local memory, even though it consumes much memory than regular tables. The feature we need here might become a bit clear little by little. A complete tuple image shall be fetched on the scan stage, if foreign table has row-level trigger. The planSlot may make sense if it contains (junk) attributes that is sufficient to re-construct an old tuple image. Target-list of DELETE statement contains a junk ctid attribute. Here is no reason why we cannot add junk attributes that reference each regular attribute, isn't it? Also, target-list of UPDATE statement contains new tuple image, however, it is available to add junk attributes that just reference old value, as a carrier of old values from scan stage to modify stage. I want core PostgreSQL to support a part of these jobs. For example, it may be able to add junk attribute to re-construct old tuple image on rewriteTargetListUD(), if target relation has row-level triggers. Also, it may be able to extract these old values from planSlot to construct an old tuple image on
Re: [HACKERS] Triggers on foreign tables
Le samedi 12 octobre 2013 07:30:35 Kohei KaiGai a écrit : 2013/10/10 Ronan Dunklau rdunk...@gmail.com: Sorry, I'm uncertain the point above. Are you saying FDW driver may be able to handle well a case when a remote tuple to be updated is different from a remote tuple being fetched on the first stage? Or, am I misunderstanding? I was not clear in the point above: what I meant was that, from my understanding, the FDW driver has no obligation to use the system-attribute tupleid to identify the remote tuple. IIRC, one of the early API proposal involved had the tupleid passed as an argument to the ExecForeign* hooks. Now, these hooks receive the whole planslot instead. This slot can store additional resjunks attributes (thanks to the AddForeignUpdateTargets) which shouldn't be altered during the execution and are carried on until modify stage. So, these additional attributes can be used as identifiers instead of the tupleid. In fact, this is the approach I implemented for the multicorn fdw, and I believe it to be used in other FDWs as well (HadoopFDW does that, if I understand it correctly). So, it doesn't make sense to me to assume that the tupleid will be filled as valid remote identifier in the context of triggers. From another standpoint, I'd like to cancel the above my suggestion, because after-row update or delete trigger tries to fetch an older image of tuple next to the actual update / delete jobs. Even if FDW is responsible to identify a remote tuple using tupleid, the identified tuple we can fetch is the newer image because FDW driver already issues a remote query to update or delete the target record. So, soon or later, FDW shall be responsible to keep a complete tuple image when foreign table has row-level triggers, until writer operation is finished. +1 Does the incomplete tuple mean a tuple image but some of columns are replaced with NULL due to optimization, as postgres_fdw is doing, doesn't it? If so, a solution is to enforce FDW driver to fetch all the columns if its managing foreign table has row level triggers for update / delete. What I meant is that a DELETE statement, for example, does not build a scan- stage reltargetlist consisting of the whole set of attributes: the only attributes that are fetched are the ones built by addForeignUpdateTargets, and whatever additional attributes are involved in the DELETE statement (in the WHERE clause, for example). I apologize if this is not clear, since both my technical and english vocabulary are maybe not precise enough to get my point accross. Or, perform a second round-trip to the foreign data store to fetch the whole tuple. It might be a solution for before-row trigger, but not for after-row trigger, unfortunately. +1 As I noted above, 2nd round trip is not a suitable design to support after-row trigger. Likely, GetTupleForTrigger() hook shall perform to return a cached older image being fetched on the first round of remote scan. So, I guess every FDW driver will have similar implementation to handle these the situation, thus it makes sense that PostgreSQL core has a feature to support this handling; that keeps a tuple being fetched last and returns older image for row-level triggers. How about your opinion? I like this idea, but I don't really see all the implications of such a design. Where would this tuple be stored ? From what I understand, after-triggers are queued for later execution, using the old/new tupleid for later retrieval (in the AfterTriggerEventData struct). This would need a major change to work with foreign tables. Would that involve a special path for executing those triggers ASAP ? And, I also want some comments from committers, not only from mine. +1 Thank you for taking the time to think this through. -- Ronan Dunklau signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
2013/10/10 Ronan Dunklau rdunk...@gmail.com: Le dimanche 6 octobre 2013 22:33:23 Kohei KaiGai a écrit : 2013/9/10 Ronan Dunklau rdunk...@gmail.com: For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? It seems to me the origin of difficulty to support row-level trigger is that foreign table does not have a back-door to see the older tuple to be updated, unlike regular tables. The core-PostgreSQL knows on-disk format to store tuples within regular tables, thus, GetTupleForTrigger() can fetch a tuple being identified by tupleid. On the other hand, writable foreign table is also designed to identify a remote tuple with tupleid, as long as FDW module is responsible. It is my understanding that the tupleid is one way for a FDW to identify a tuple, but the AddForeignUpdateTargets hook allows for arbitrary resjunks columns to be added. It is these columns that are then used to identify the tuple to update. I don't think the tupleid itself can be used to identify a tuple for the trigger. Sorry, I'm uncertain the point above. Are you saying FDW driver may be able to handle well a case when a remote tuple to be updated is different from a remote tuple being fetched on the first stage? Or, am I misunderstanding? From another standpoint, I'd like to cancel the above my suggestion, because after-row update or delete trigger tries to fetch an older image of tuple next to the actual update / delete jobs. Even if FDW is responsible to identify a remote tuple using tupleid, the identified tuple we can fetch is the newer image because FDW driver already issues a remote query to update or delete the target record. So, soon or later, FDW shall be responsible to keep a complete tuple image when foreign table has row-level triggers, until writer operation is finished. So, a straightforward idea is adding a new FDW interface to get an older image of the tuple to be updated. It makes possible to implement something similar to GetTupleForTrigger() on foreign tables, even though it takes a secondary query to fetch a record from the remote server. Probably, it is an headache for us One thing we pay attention is, the tuple to be updated is already fetched from the remote server and the tuple being fetched latest is (always?) the target of update or delete. It is not a heavy job for ForeignScanState node to hold a copy of the latest tuple. Isn't it an available way to reference relevant ForeignScanState to get the older image? It is however possible to have only an incomplete tuple. The FDW may have only fetched the tupleid-equivalent, and we would have to ensure that the reltargetlist contains every attribute that the trigger could need. Does the incomplete tuple mean a tuple image but some of columns are replaced with NULL due to optimization, as postgres_fdw is doing, doesn't it? If so, a solution is to enforce FDW driver to fetch all the columns if its managing foreign table has row level triggers for update / delete. Or, perform a second round-trip to the foreign data store to fetch the whole tuple. It might be a solution for before-row trigger, but not for after-row trigger, unfortunately. If my assumption is right, all we need to enhance are (1) add a store on ForeignScanState to hold the last tuple on the scan stage, (2) add GetForeignTupleForTrigger() to reference the store above on the relevant ForeignScanState. I would add a 3) have a GetTupleForTrigger() hook that would get called with the stored tuple. I personnally don't like this approach: there is already (n+1) round trips to the foreign store (one during the scan phase, and one per tuple during the update/delete phase), we don't need another one per tuple for the triggers. As I noted above, 2nd round trip is not a suitable design to support after-row trigger. Likely, GetTupleForTrigger() hook shall perform to return a cached older image being fetched on the first round of remote scan. So, I guess every FDW driver will have similar implementation to handle these the situation, thus it makes sense that PostgreSQL core has a feature to support this handling; that keeps a tuple being fetched last and returns older image for row-level triggers. How about your opinion? And, I also want some comments from committers, not only from mine. -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
Le dimanche 6 octobre 2013 22:33:23 Kohei KaiGai a écrit : 2013/9/10 Ronan Dunklau rdunk...@gmail.com: For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? It seems to me the origin of difficulty to support row-level trigger is that foreign table does not have a back-door to see the older tuple to be updated, unlike regular tables. The core-PostgreSQL knows on-disk format to store tuples within regular tables, thus, GetTupleForTrigger() can fetch a tuple being identified by tupleid. On the other hand, writable foreign table is also designed to identify a remote tuple with tupleid, as long as FDW module is responsible. It is my understanding that the tupleid is one way for a FDW to identify a tuple, but the AddForeignUpdateTargets hook allows for arbitrary resjunks columns to be added. It is these columns that are then used to identify the tuple to update. I don't think the tupleid itself can be used to identify a tuple for the trigger. So, a straightforward idea is adding a new FDW interface to get an older image of the tuple to be updated. It makes possible to implement something similar to GetTupleForTrigger() on foreign tables, even though it takes a secondary query to fetch a record from the remote server. Probably, it is an headache for us One thing we pay attention is, the tuple to be updated is already fetched from the remote server and the tuple being fetched latest is (always?) the target of update or delete. It is not a heavy job for ForeignScanState node to hold a copy of the latest tuple. Isn't it an available way to reference relevant ForeignScanState to get the older image? It is however possible to have only an incomplete tuple. The FDW may have only fetched the tupleid-equivalent, and we would have to ensure that the reltargetlist contains every attribute that the trigger could need. Or, perform a second round-trip to the foreign data store to fetch the whole tuple. If my assumption is right, all we need to enhance are (1) add a store on ForeignScanState to hold the last tuple on the scan stage, (2) add GetForeignTupleForTrigger() to reference the store above on the relevant ForeignScanState. I would add a 3) have a GetTupleForTrigger() hook that would get called with the stored tuple. I personnally don't like this approach: there is already (n+1) round trips to the foreign store (one during the scan phase, and one per tuple during the update/delete phase), we don't need another one per tuple for the triggers. signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] Triggers on foreign tables
On 10/6/13 3:33 PM, Kohei KaiGai wrote: 2013/9/10 Ronan Dunklau rdunk...@gmail.com: For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? It seems to me the origin of difficulty to support row-level trigger is that foreign table does not have a back-door to see the older tuple to be updated, unlike regular tables. The core-PostgreSQL knows on-disk format to store tuples within regular tables, thus, GetTupleForTrigger() can fetch a tuple being identified by tupleid. On the other hand, writable foreign table is also designed to identify a remote tuple with tupleid, as long as FDW module is responsible. So, a straightforward idea is adding a new FDW interface to get an older image of the tuple to be updated. It makes possible to implement something similar to GetTupleForTrigger() on foreign tables, even though it takes a secondary query to fetch a record from the remote server. Probably, it is an headache for us. One thing we pay attention is, the tuple to be updated is already fetched from the remote server and the tuple being fetched latest is (always?) the target of update or delete. It is not a heavy job for ForeignScanState node to hold a copy of the latest tuple. Isn't it an available way to reference relevant ForeignScanState to get the older image? If my assumption is right, all we need to enhance are (1) add a store on ForeignScanState to hold the last tuple on the scan stage, (2) add GetForeignTupleForTrigger() to reference the store above on the relevant ForeignScanState. Any comment please. What happens if someone changes the record on the foreign side between when we've read it and we do the UPDATE? -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
What happens if someone changes the record on the foreign side between when we've read it and we do the UPDATE? Concurrency control is job of FDW driver. It has to coordinate access to the records to be fetched for update / delete. In fact, postgres_fdw add FOR UPDATE to avoid concurrent update when it issues 1st-stage query to remote server. Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
2013/9/10 Ronan Dunklau rdunk...@gmail.com: For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? It seems to me the origin of difficulty to support row-level trigger is that foreign table does not have a back-door to see the older tuple to be updated, unlike regular tables. The core-PostgreSQL knows on-disk format to store tuples within regular tables, thus, GetTupleForTrigger() can fetch a tuple being identified by tupleid. On the other hand, writable foreign table is also designed to identify a remote tuple with tupleid, as long as FDW module is responsible. So, a straightforward idea is adding a new FDW interface to get an older image of the tuple to be updated. It makes possible to implement something similar to GetTupleForTrigger() on foreign tables, even though it takes a secondary query to fetch a record from the remote server. Probably, it is an headache for us. One thing we pay attention is, the tuple to be updated is already fetched from the remote server and the tuple being fetched latest is (always?) the target of update or delete. It is not a heavy job for ForeignScanState node to hold a copy of the latest tuple. Isn't it an available way to reference relevant ForeignScanState to get the older image? If my assumption is right, all we need to enhance are (1) add a store on ForeignScanState to hold the last tuple on the scan stage, (2) add GetForeignTupleForTrigger() to reference the store above on the relevant ForeignScanState. Any comment please. -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Thursday 12 September 2013 12:10:01 Peter Eisentraut wrote: The documentation build fails: openjade:trigger.sgml:72:9:E: end tag for ACRONYM omitted, but OMITTAG NO was specified openjade:trigger.sgml:70:56: start tag was here Thank you, I took the time to install a working doc-building environment. Please find attached a patch that corrects this.diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml index e5ec738..08d133c 100644 --- a/doc/src/sgml/ref/create_trigger.sgml +++ b/doc/src/sgml/ref/create_trigger.sgml @@ -93,7 +93,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable para The following table summarizes which types of triggers may be used on - tables and views: + tables, views and foreign tables: /para informaltable id=supported-trigger-types @@ -111,7 +111,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable entry align=center morerows=1literalBEFORE//entry entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry entry align=centerTables/entry - entry align=centerTables and views/entry + entry align=centerTables, views and foreign tables/entry /row row entry align=centercommandTRUNCATE//entry @@ -122,7 +122,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable entry align=center morerows=1literalAFTER//entry entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry entry align=centerTables/entry - entry align=centerTables and views/entry + entry align=centerTables, views and foreign tables/entry /row row entry align=centercommandTRUNCATE//entry @@ -244,7 +244,7 @@ UPDATE OF replaceablecolumn_name1/replaceable [, replaceablecolumn_name2/ termreplaceable class=parametertable_name/replaceable/term listitem para - The name (optionally schema-qualified) of the table or view the trigger + The name (optionally schema-qualified) of the table, view or foreign table the trigger is for. /para /listitem diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml index f579340..37c0a35 100644 --- a/doc/src/sgml/trigger.sgml +++ b/doc/src/sgml/trigger.sgml @@ -33,7 +33,7 @@ para A trigger is a specification that the database should automatically execute a particular function whenever a certain type of operation is -performed. Triggers can be attached to both tables and views. +performed. Triggers can be attached to tables, views and foreign tables. /para para @@ -64,6 +64,15 @@ /para para +On foreign tables, trigger can be defined to execute either before or after any +commandINSERT/command, +commandUPDATE/command or +commandDELETE/command operation, only once per acronymSQL/acronym statement. +This means that row-level triggers are not supported for foreign tables. + /para + + + para The trigger function must be defined before the trigger itself can be created. The trigger function must be declared as a function taking no arguments and returning type literaltrigger/. @@ -96,6 +105,7 @@ after may only be defined at statement level, while triggers that fire instead of an commandINSERT/command, commandUPDATE/command, or commandDELETE/command may only be defined at row level. +On foreign tables, triggers can not be defined at row level. /para para diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index adc74dd..68bfa9c 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -3141,13 +3141,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, pass = AT_PASS_MISC; break; case AT_EnableTrig: /* ENABLE TRIGGER variants */ - case AT_EnableAlwaysTrig: - case AT_EnableReplicaTrig: case AT_EnableTrigAll: case AT_EnableTrigUser: case AT_DisableTrig: /* DISABLE TRIGGER variants */ case AT_DisableTrigAll: case AT_DisableTrigUser: +ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE); +pass = AT_PASS_MISC; +break; + case AT_EnableReplicaTrig: + case AT_EnableAlwaysTrig: case AT_EnableRule: /* ENABLE/DISABLE RULE variants */ case AT_EnableAlwaysRule: case AT_EnableReplicaRule: diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index d86e9ad..d84b61b 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -184,12 +184,23 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString, RelationGetRelationName(rel)), errdetail(Views cannot have TRUNCATE triggers.))); } - else + else if (rel-rd_rel-relkind == RELKIND_FOREIGN_TABLE) + { + if(stmt-row){ + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg(\%s\ is a foreign table, +
Re: [HACKERS] Triggers on foreign tables
On 9/11/13 10:14 AM, Ronan Dunklau wrote: On Wednesday 11 September 2013 06:27:24 Michael Paquier wrote: As your patch is targeting the implementation of a new feature, please consider adding this patch to the next commit fest that is going to begin in a couple of days: https://commitfest.postgresql.org/action/commitfest_view?id=19 I intended to do that in the next couple of days if I don't get any feedback. Thank you for the reminder, its done. The documentation build fails: openjade:trigger.sgml:72:9:E: end tag for ACRONYM omitted, but OMITTAG NO was specified openjade:trigger.sgml:70:56: start tag was here -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Tue, Sep 10, 2013 at 5:08 AM, Ronan Dunklau rdunk...@gmail.com wrote: Hello. I wanted to know what it would take to implement triggers on foreign tables. It seems that statement-level triggers can work provided they are allowed in the code. Please find attached a simple POC patch that implement just that. For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? As your patch is targeting the implementation of a new feature, please consider adding this patch to the next commit fest that is going to begin in a couple of days: https://commitfest.postgresql.org/action/commitfest_view?id=19 More general information on how patches are processed is findable here: http://wiki.postgresql.org/wiki/Submitting_a_Patch Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Triggers on foreign tables
On Wednesday 11 September 2013 06:27:24 Michael Paquier wrote: As your patch is targeting the implementation of a new feature, please consider adding this patch to the next commit fest that is going to begin in a couple of days: https://commitfest.postgresql.org/action/commitfest_view?id=19 I intended to do that in the next couple of days if I don't get any feedback. Thank you for the reminder, its done. signature.asc Description: This is a digitally signed message part.
[HACKERS] Triggers on foreign tables
Hello. I wanted to know what it would take to implement triggers on foreign tables. It seems that statement-level triggers can work provided they are allowed in the code. Please find attached a simple POC patch that implement just that. For row-level triggers, it seems more complicated. From what I understand, OLD/NEW tuples are fetched from the heap using their ctid (except for BEFORE INSERT triggers). How could this be adapted for foreign tables ? -- Ronan Dunklaudiff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml index e5ec738..08d133c 100644 --- a/doc/src/sgml/ref/create_trigger.sgml +++ b/doc/src/sgml/ref/create_trigger.sgml @@ -93,7 +93,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable para The following table summarizes which types of triggers may be used on - tables and views: + tables, views and foreign tables: /para informaltable id=supported-trigger-types @@ -111,7 +111,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable entry align=center morerows=1literalBEFORE//entry entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry entry align=centerTables/entry - entry align=centerTables and views/entry + entry align=centerTables, views and foreign tables/entry /row row entry align=centercommandTRUNCATE//entry @@ -122,7 +122,7 @@ CREATE [ CONSTRAINT ] TRIGGER replaceable class=PARAMETERname/replaceable entry align=center morerows=1literalAFTER//entry entry align=centercommandINSERT//commandUPDATE//commandDELETE//entry entry align=centerTables/entry - entry align=centerTables and views/entry + entry align=centerTables, views and foreign tables/entry /row row entry align=centercommandTRUNCATE//entry @@ -244,7 +244,7 @@ UPDATE OF replaceablecolumn_name1/replaceable [, replaceablecolumn_name2/ termreplaceable class=parametertable_name/replaceable/term listitem para - The name (optionally schema-qualified) of the table or view the trigger + The name (optionally schema-qualified) of the table, view or foreign table the trigger is for. /para /listitem diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml index f579340..37c0a35 100644 --- a/doc/src/sgml/trigger.sgml +++ b/doc/src/sgml/trigger.sgml @@ -33,7 +33,7 @@ para A trigger is a specification that the database should automatically execute a particular function whenever a certain type of operation is -performed. Triggers can be attached to both tables and views. +performed. Triggers can be attached to tables, views and foreign tables. /para para @@ -64,6 +64,15 @@ /para para +On foreign tables, trigger can be defined to execute either before or after any +commandINSERT/command, +commandUPDATE/command or +commandDELETE/command operation, only onece per acronymSQL statement. +This means that row-level triggers are not supported for foreign tables. + /para + + + para The trigger function must be defined before the trigger itself can be created. The trigger function must be declared as a function taking no arguments and returning type literaltrigger/. @@ -96,6 +105,7 @@ after may only be defined at statement level, while triggers that fire instead of an commandINSERT/command, commandUPDATE/command, or commandDELETE/command may only be defined at row level. +On foreign tables, triggers can not be defined at row level. /para para diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index adc74dd..68bfa9c 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -3141,13 +3141,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, pass = AT_PASS_MISC; break; case AT_EnableTrig: /* ENABLE TRIGGER variants */ - case AT_EnableAlwaysTrig: - case AT_EnableReplicaTrig: case AT_EnableTrigAll: case AT_EnableTrigUser: case AT_DisableTrig: /* DISABLE TRIGGER variants */ case AT_DisableTrigAll: case AT_DisableTrigUser: +ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE); +pass = AT_PASS_MISC; +break; + case AT_EnableReplicaTrig: + case AT_EnableAlwaysTrig: case AT_EnableRule: /* ENABLE/DISABLE RULE variants */ case AT_EnableAlwaysRule: case AT_EnableReplicaRule: diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index d86e9ad..d84b61b 100644 --- a/src/backend/commands/trigger.c +++ b/src/backend/commands/trigger.c @@ -184,12 +184,23 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString, RelationGetRelationName(rel)), errdetail(Views cannot have TRUNCATE triggers.))); } - else + else if (rel-rd_rel-relkind == RELKIND_FOREIGN_TABLE) + { + if(stmt-row){ + ereport(ERROR, +