Re: [HACKERS] Trigger concurrent execution

2014-05-16 Thread Blagoj Petrushev
Hi,

Thanks all for being patient, apparently I didn't quite understand the
norms of trigger execution.

On 16 May 2014 07:55, Craig Ringer cr...@2ndquadrant.com wrote:
 On 05/16/2014 08:06 AM, Blagoj Petrushev wrote:
 Hi,

 I'm thinking of an extension to trigger functionality like this:

 CREATE TRIGGER trigger_name
 AFTER event
 ON table
 CONCURRENTLY EXECUTE PROCEDURE trigger_fc

 This would call the trigger after the end of the transaction.

 If after the end of the transaction is what you mean by
 concurrently, then that's the wrong word to choose.

 AFTER COMMIT ?

You're right, 'concurrently' is the wrong word.


 The concept of running a trigger concurrently just doesn't make sense
 in PostgreSQL, because the backend is single threaded. You wouldn't be
 able to run any SQL commands until the trigger finished.

 It isn't possible to do anything useful without a transaction, so
 PostgreSQL would need to start a transaction for the trigger and commit
 the transaction at the end, as if you'd run SELECT my_procedure();.
 Because it's outside the scope of the transaction it probably wouldn't
 be possible to do FOR EACH ROW with a NEW and OLD var,

Right. Didn't think of this.

 unless you
 stashed them as materialized rows in the queue of pending AFTER COMMIT
 triggers.

 Finally, because it's after transaction commit, you couldn't easily
 guarantee that the trigger would really run. If the backend crashed /
 the server was shut down  / etc after the commit but before your trigger
 finished, you'd have a committed transaction but the trigger would not
 run. To fix that you'd need to somehow make the trigger queue WAL-logged
 and run it during replay, which from my rather limited understanding of
 this area would be ... interesting to do. It'd also mean the trigger
 couldn't have any session context.

 This isn't easy, if it's practical at all.

 I have a big table with big text column article and a nullable
 tsvector column fts_article. On each insert or update that changes the
 article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
 daemon listener, I catch the notification and update fts_article
 accordingly with my_fts_fc(article). The reason I don't do this
 directly in my trigger is because my_fts_fc is slow for big articles,
 fts_article has a gin index, and also, on heavy load, my listener can
 do these updates concurrently. Now, with a concurrent execution of
 triggers, I can just call my_fts_fc inside the trigger instead of the
 notify roundtrip.

 I don't think that really fits.

 It seems like you want to run the trigger procedure in the background on
 another back-end. That'd be quite cool, but also not trivial to do,
 especially if you wanted to guarantee that it happened reliably and in a
 crash-safe manner.



 --
  Craig Ringer   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services

I'll also try to reply to David G Johnston answer here, since I didn't
actually get the email.

Conceptually, trigger actions run in-transaction and can cause it to
ROLLBACK; so how would after the end of the transaction work?  Since the
easy way is to have COMMIT; block until all the AFTER event concurrent
triggers fire I presume you would want something more like a task queue for
background workers where, at commit, the function call is in place in a FIFO
queue and the calling session is allowed to move onto other activity.

As I see, for my problem, it would be great if there's a way to put a
function call in a queue in a background worker. I don't know how to
do this, however. But, the crash handling and NEW/OLD vars passing
would remain a problem nonetheless.

It is not clear what you mean by my listener can do these updates
concurrently? Concurrently with each other or concurrently with other DML
action on table?I assume you have multiple listeners since the potential
rate of insert of the documents is likely much greater than the rate of
update/indexing.

I meant concurrently with additional inserts to the table, as well as
the fact that my listener is able to receive new notifications while
updating some record.

Also, it would seem you'd typically want the GIN index to be updated once
the corresponding transaction committed and makes the rest of the data
available.  Or does your use case allow for some delay between the article
being in the database physically and it being available in the index?

The latter, namely, the search feature can 'lag' a few seconds after
the content update.

Thanks,
Blagoj Petrushev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Trigger concurrent execution

2014-05-15 Thread Blagoj Petrushev
Hi,

I'm thinking of an extension to trigger functionality like this:

CREATE TRIGGER trigger_name
AFTER event
ON table
CONCURRENTLY EXECUTE PROCEDURE trigger_fc

This would call the trigger after the end of the transaction.

The following is a use-case, please tell me if I'm doing it wrong.

I have a big table with big text column article and a nullable
tsvector column fts_article. On each insert or update that changes the
article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
daemon listener, I catch the notification and update fts_article
accordingly with my_fts_fc(article). The reason I don't do this
directly in my trigger is because my_fts_fc is slow for big articles,
fts_article has a gin index, and also, on heavy load, my listener can
do these updates concurrently. Now, with a concurrent execution of
triggers, I can just call my_fts_fc inside the trigger instead of the
notify roundtrip.

Kind regards,
Blagoj Petrushev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Trigger concurrent execution

2014-05-15 Thread David G Johnston
Blagoj Petrushev wrote
 Hi,
 
 I'm thinking of an extension to trigger functionality like this:
 
 CREATE TRIGGER trigger_name
 AFTER event
 ON table
 CONCURRENTLY EXECUTE PROCEDURE trigger_fc
 
 This would call the trigger after the end of the transaction.
 
 The following is a use-case, please tell me if I'm doing it wrong.
 
 I have a big table with big text column article and a nullable
 tsvector column fts_article. On each insert or update that changes the
 article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
 daemon listener, I catch the notification and update fts_article
 accordingly with my_fts_fc(article). The reason I don't do this
 directly in my trigger is because my_fts_fc is slow for big articles,
 fts_article has a gin index, and also, on heavy load, my listener can
 do these updates concurrently. Now, with a concurrent execution of
 triggers, I can just call my_fts_fc inside the trigger instead of the
 notify roundtrip.

Conceptually, trigger actions run in-transaction and can cause it to
ROLLBACK; so how would after the end of the transaction work?  Since the
easy way is to have COMMIT; block until all the AFTER event concurrent
triggers fire I presume you would want something more like a task queue for
background workers where, at commit, the function call is in place in a FIFO
queue and the calling session is allowed to move onto other activity.

It is not clear what you mean by my listener can do these updates
concurrently? Concurrently with each other or concurrently with other DML
action on table?I assume you have multiple listeners since the potential
rate of insert of the documents is likely much greater than the rate of
update/indexing.

Also, it would seem you'd typically want the GIN index to be updated once
the corresponding transaction committed and makes the rest of the data
available.  Or does your use case allow for some delay between the article
being in the database physically and it being available in the index?

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Trigger-concurrent-execution-tp5804158p5804164.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Trigger concurrent execution

2014-05-15 Thread Craig Ringer
On 05/16/2014 08:06 AM, Blagoj Petrushev wrote:
 Hi,
 
 I'm thinking of an extension to trigger functionality like this:
 
 CREATE TRIGGER trigger_name
 AFTER event
 ON table
 CONCURRENTLY EXECUTE PROCEDURE trigger_fc
 
 This would call the trigger after the end of the transaction.

If after the end of the transaction is what you mean by
concurrently, then that's the wrong word to choose.

AFTER COMMIT ?

The concept of running a trigger concurrently just doesn't make sense
in PostgreSQL, because the backend is single threaded. You wouldn't be
able to run any SQL commands until the trigger finished.

It isn't possible to do anything useful without a transaction, so
PostgreSQL would need to start a transaction for the trigger and commit
the transaction at the end, as if you'd run SELECT my_procedure();.
Because it's outside the scope of the transaction it probably wouldn't
be possible to do FOR EACH ROW with a NEW and OLD var, unless you
stashed them as materialized rows in the queue of pending AFTER COMMIT
triggers.

Finally, because it's after transaction commit, you couldn't easily
guarantee that the trigger would really run. If the backend crashed /
the server was shut down  / etc after the commit but before your trigger
finished, you'd have a committed transaction but the trigger would not
run. To fix that you'd need to somehow make the trigger queue WAL-logged
and run it during replay, which from my rather limited understanding of
this area would be ... interesting to do. It'd also mean the trigger
couldn't have any session context.

This isn't easy, if it's practical at all.

 I have a big table with big text column article and a nullable
 tsvector column fts_article. On each insert or update that changes the
 article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
 daemon listener, I catch the notification and update fts_article
 accordingly with my_fts_fc(article). The reason I don't do this
 directly in my trigger is because my_fts_fc is slow for big articles,
 fts_article has a gin index, and also, on heavy load, my listener can
 do these updates concurrently. Now, with a concurrent execution of
 triggers, I can just call my_fts_fc inside the trigger instead of the
 notify roundtrip.

I don't think that really fits.

It seems like you want to run the trigger procedure in the background on
another back-end. That'd be quite cool, but also not trivial to do,
especially if you wanted to guarantee that it happened reliably and in a
crash-safe manner.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers