Re: [HACKERS] Triggers on foreign tables

Ronan Dunklau Sun, 02 Mar 2014 22:50:33 -0800

Hello.

Did you have time to review the latest version of this patch ? Is there 
anything I can do to get this "ready for commiter" ?


Thank you for all the work performed so far.




Le mardi 4 février 2014 13:16:22 Ronan Dunklau a écrit :
> Le lundi 3 février 2014 23:28:45 Noah Misch a écrit :
> > On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote:
> > > Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit :
> > > > On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote:
> > > > > What do you think about this approach ? Is there something I missed
> > > > > which
> > > > > would make it not sustainable ?
> > > > 
> > > > Seems basically reasonable.  I foresee multiple advantages from having
> > > > one
> > > > tuplestore per query level as opposed to one for the entire
> > > > transaction.
> > > > You would remove the performance trap of backing up the tuplestore by
> > > > rescanning. It permits reclaiming memory and disk space in
> > > > AfterTriggerEndQuery() rather than at end of transaction.  You could
> > > > remove
> > > > ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the
> > > > flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next
> > > > one
> > > > or
> > > > the next two tuples from the tuplestore.  Using work_mem per
> > > > AfterTriggerBeginQuery() instead of per transaction is no problem. 
> > > > What
> > > > do
> > > > you think of that design change?
> > > 
> > > I agree that this design is better, but I have some objections.
> > > 
> > > We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but
> > > the
> > > rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch)
> > > can't go away.
> > > 
> > > Consider for example the case of a foreign table with more than one
> > > AFTER
> > > UPDATE triggers. Unless we store the tuples once for each trigger, we
> > > will
> > > have to rescan the tuplestore.
> > 
> > Will we?  Within a given query level, when do (non-deferred) triggers
> > execute in an order other than the enqueue order?
> 
> Let me explain what I had in mind.
> 
> Looking at the code in AfterTriggerSaveEvent:
> 
> - we build a "template" AfterTriggerEvent, and store the tuple(s)
> - for each suitable after trigger that matches the trigger type, as well as
> the WHEN condition if any, a copy of the previously built AfterTriggerEvent
> is queued
> 
> Later, those events are fired in order.
> 
> This means that more than one event can be fired for one tuple.
> 
> Take this example:
> 
> CREATE TRIGGER trig_row_after1
> AFTER UPDATE ON rem2
> FOR EACH ROW
> WHEN (NEW.f1 % 5 < 3)
> EXECUTE PROCEDURE trigger_func('TRIG1');
> 
> CREATE TRIGGER trig_row_after2
> AFTER UPDATE ON rem2
> FOR EACH ROW
> WHEN (NEW.f1 % 5 < 4)
> EXECUTE PROCEDURE trigger_func('TRIG2');
> 
> UPDATE rem2 set f2 = 'something';
> 
> Assuming 5 rows with f1 as a serial, the fired AfterTriggerEvent's
> ate_tupleindex will be, in that order. Ass
> 
> 0-0-2-2-4-8-8
> 
> So, at least a backward seek is required for trig_row_after2 to be able to
> retrieve a tuple that was already consumed when firing trig_row_after1.
> 
> On a side note, this made me realize that it is better to avoid storing a
> tuple entirely if there is no enabled trigger (the f1 = 4 case above). The
> attached patch does that, so the previous sequence becomes:
> 
> 0-0-2-2-4-6-6
> 
> It also prevents from initalizing a tuplestore at all if its not needed.
> 
> > > To mitigate the effects of this behaviour, I added the option to perform
> > > a
> > > reverse_seek when the looked-up tuple is nearer from the current index
> > > than
> > > from the start.
> > 
> > If there's still a need to seek within the tuplestore, that should get rid
> > of the O(n^2) effect.  I'm hoping that per-query-level tuplestores will
> > eliminate the need to seek entirely.
> 
> I think the only case when seeking is still needed is when there are more
> than one after trigger that need to be fired, since the abovementioned
> change prevents from seeking to skip tuples.
> 
> > > > If you do pursue that change, make sure the code still does the right
> > > > thing
> > > > when it drops queued entries during subxact abort.
> > > 
> > > I don't really understand what should be done at that stage. Since
> > > triggers on foreign tables are not allowed to be deferred, everything
> > > should be cleaned up at the end of each query, right ? So, there
> > > shouldn't be any queued entries.
> > 
> > I suspect that's right.  If you haven't looked over
> > AfterTriggerEndSubXact(), please do so and ensure all its actions still
> > make sense in the context of this new kind of trigger storage.
> 
> You're right, I missed something here. When aborting a subxact, the
> tuplestores for queries below the subxact query depth should be cleaned, if
> any, because AfterTriggerEndQuery has not been called for the failing query.
> 
> The attached patch fixes that.
> 
> > > > > The attached patch checks this, and add documentation for this
> > > > > limitation.
> > > > > I'm not really sure about how to phrase that correctly in the error
> > > > > message
> > > > > and the documentation. One can store at most INT_MAX foreign tuples,
> > > > > which
> > > > > means that at most INT_MAX insert or delete or "half-updates" can
> > > > > occur.
> > > > > By
> > > > > half-updates, I mean that for update two tuples are stored whereas
> > > > > in
> > > > > contrast to only one for insert and delete trigger.
> > > > > 
> > > > > Besides, I don't know where this disclaimer should be in the
> > > > > documentation.
> > > > > Any advice here ?
> > > > 
> > > > I wouldn't mention that limitation.
> > > 
> > > Maybe it shouldn't be so prominent, but I still think a note somewhere
> > > couldn't hurt.
> > 
> > Perhaps.  There's not much documentation of such implementation upper
> > limits, and there's no usage of "INT_MAX" outside of parts that discuss
> > writing C code.  I'm not much of a visionary when it comes to the
> > documentation; I try to document new things with an amount of detail
> > similar to old features.
> 
> Ok, I removed the paragraph documenting the limitation.
> 
> > > Should the use of work_mem be documented somewhere, too ?
> > 
> > I wouldn't.  Most uses of work_mem are undocumented, even relatively major
> > ones like count(DISTINCT ...) and CTEs.  So, while I'd generally favor a
> > patch documenting all/most of the things that use work_mem, it would be
> > odd
> > to document one new consumer apart from the others.
> 
> Ok.
> 
> > > > This is the performance trap I mentioned above; it brings potential
> > > > O(n^2)
> > > > complexity to certain AFTER trigger execution scenarios.
> > > 
> > > What scenarios do you have in mind ? I "fixed" the problem when there
> > > are
> > > multiple triggers on a foreign table, do you have any other one ?
> > 
> > I'm not aware of such a performance trap in your latest design.
> 
> Good !
> 
> > Thanks,
> > nm

-- 
Ronan Dunklau
http://dalibo.com - http://dalibo.org

signature.asc
Description: This is a digitally signed message part.

Re: [HACKERS] Triggers on foreign tables

Reply via email to