On Wed, Jun 7, 2017 at 9:42 AM, Robert Haas <robertmh...@gmail.com> wrote: > I think I'd like to walk back my earlier statements about reverting > this patch just a little bit. Although putting the tuplestore at the > wrong level does seem like a fairly significant design mistake, Thomas > more or less convinced me yesterday on a Skype call that relocating it > to the ModifyTable node might not be that much work. If it's a > 150-line patch, it'd probably be less disruptive to install a fix than > to revert the whole thing (and maybe put it back in again, in some > future release).
I spent a couple of hours drafting a proof-of-concept to see if my hunch was right. It seems to work correctly so far and isn't huge (but certainly needs more testing and work): 6 files changed, 156 insertions(+), 109 deletions(-) It applies on top of the other patch. It extends TransitionCaptureState to hold the the new and old tuplestores for each ModifyTable node, instead of using global variables. The result is that tuplestores don't get mixed up, and there aren't any weird assumptions about the order of execution as discussed earlier. Example: with wcte as (insert into table1 values (42)) insert into table2 values ('hello world'); NOTICE: trigger = table2_trig, old table = <NULL>, new table = ("hello world") NOTICE: trigger = table1_trig, old table = <NULL>, new table = (42) Summary of how these patches relate: 1. In the inheritance patch, TransitionCaptureState is introduced. It holds flags that control whether we capture tuples. There is one of these per ModifyTable node. In master we use the flags in TriggerDesc to control transition tuple capture directly, but we needed a way for ModifyTable's result rel's TriggerDesc to affect all child tables that are touched. My proposal is to do that by inventing this new object to activate transition tuple capture while modifying child tables too. It is passed into the ExecAR*Trigger() functions of all relations touched by the ModifyTable node. 2. In the attached patch, that struct is extended to hold the actual tuplestores. They are used for two purposes: ExecAR*Trigger() captures tuples into them (instead of using global variables to find the tuplestores to capture tuples into), and ExecA[RS]*Trigger() keeps hold of the TransitionCaptureState in the after trigger queue so that when the queued event is eventually executed AfterTriggerExecute() can expose the correct tuplestores to triggers. There are a couple of things that definitely need work and I'd welcome any comments: 1. I added a pointer to TransitionCaptureState to AfterTriggerShared, in order to record which tuplestores a queued after trigger event should see. I suspected that enqueuing pointers like that wouldn't be popular, and when I ran the idea past Andres on IRC he just said "yuck" :-) Perhaps there needs to be a way to convert this into an index into some array in EState, ... or something else. The basic requirement is that the AfterTriggerExecute() needs to know *which* tuplestores should be visible to the trigger when it runs. I believe the object lifetime is sound (the TransitionCaptureState lasts until ExecutorEnd(), and triggers are fired before that during ExecutorFinish()). 2. I didn't think about what execReplication.c needs. Although that code apparently doesn't know how to fire AS triggers, it does know how to fire AR triggers (so that RI works?), and in theory those might have transition tables, so I guess that needs to use MakeTransitionCaptureState() -- but it seems to lack a place to keep that around, and I ran out of time thinking about that today. Thoughts?  https://www.postgresql.org/message-id/CAEepm%3D1dGNzh98Gt21fn_Ed6k20sVB-NuAARE1EF693itK6%3DLg%40mail.gmail.com -- Thomas Munro http://www.enterprisedb.com
Description: Binary data
-- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers