Re: [HACKERS] Proposal - asynchronous functions
On 04/26/2011 11:17 PM, Robert Haas wrote:
> IIRC, we kind of got stuck on the prerequisite wamalloc patch, and that sunk the whole thing. :-(

Right, that prerequisite was the largest stumbling block. As I certainly mentioned back then, it should be possible to get rid of the imessages dependency (and thus wamalloc). So whoever really wants to implement asynchronous functions (or autonomous transactions) is more than welcome to try that.

Please keep in mind that you'd need an alternative communication path. Not only for the bgworker infrastructure itself, but for communication between the requesting backend and the bgworker (except for fire-and-forget jobs like autovacuum, of course. OTOH even those could benefit from communicating back their state to the coordinator.. eh.. autovacuum launcher).

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal - asynchronous functions
It sounds like there is interest in this feature, can it get added to the TODO list?
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
> Asynchronous functions
>
> *Problem*
> Postgresql does not have support for asynchronous function calls.

Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.

merlin
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
> Add an Async command for functions ( ASYNC my_func(var1,var2) ) and add an async optional keyword in trigger statements ( CREATE TRIGGER ... EXECUTE ASYNC trig_func() ). This should cause an internal session to be started that the function or trigger function will run in, disconnected from the session it started in.

We've talked about a number of features that could benefit from some kind of worker process facility (e.g. logical replication, parallel query). So far no one has stepped forward to build such a facility, and I think without that this can't even get off the ground.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Proposal - asynchronous functions
* Robert Haas (robertmh...@gmail.com) wrote:
> We've talked about a number of features that could benefit from some kind of worker process facility (e.g. logical replication, parallel query). So far no one has stepped forward to build such a facility, and I think without that this can't even get off the ground.

Well, this specific thing could be done by just having PG close the client connection, not care that it's gone, and have an implied 'commit;' at the end. I'm not saying that I like this approach, but I don't think it'd be hard to implement.

What I don't think we saw was any information about how, exactly, the OP was planning to implement this in the backend.

Thanks,

Stephen
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> We've talked about a number of features that could benefit from some kind of worker process facility (e.g. logical replication, parallel query). So far no one has stepped forward to build such a facility, and I think without that this can't even get off the ground.
>
> Well, this specific thing could be done by just having PG close the client connection, not care that it's gone, and have an implied 'commit;' at the end. I'm not saying that I like this approach, but I don't think it'd be hard to implement.

Maybe, but that introduces a lot of complications with regard to things like authentication. We probably want some API for a backend to say - hey, please spawn a session with the same user ID and database association as me, and also provide some mechanism for data transfer between the two processes.

--
Robert Haas
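
A rough sketch of the shape of the API described above, in Python rather than backend C. Everything in it is hypothetical: `SessionContext`, `spawn_session` and `whoami` are illustrative names, a thread stands in for a separate backend process, and a `queue.Queue` stands in for the data-transfer mechanism. It only shows the request itself (inherit the caller's user and database, get a channel back), not how PostgreSQL would implement it.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionContext:
    """Identity a spawned session would inherit (hypothetical fields)."""
    user: str
    database: str


def spawn_session(parent, task, payload):
    """Run task(ctx, payload) in a worker that inherits the parent's identity.

    Returns a queue the caller can read the result from. The thread is a
    stand-in for a second backend, the queue for the data-transfer path.
    """
    channel = queue.Queue()
    # Identity is copied from the parent rather than authenticated again.
    ctx = SessionContext(parent.user, parent.database)

    def run():
        channel.put(task(ctx, payload))

    threading.Thread(target=run, daemon=True).start()
    return channel


def whoami(ctx, payload):
    """Example task: report which identity the work ran under."""
    return f"{ctx.user}@{ctx.database}: {payload}"


me = SessionContext(user="sim", database="erp")
result = spawn_session(me, whoami, "rebuild summary").get(timeout=5)
print(result)
```

Copying the identity sidesteps the authentication question raised above; whether that is acceptable is exactly the design decision under discussion.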
Re: [HACKERS] Proposal - asynchronous functions
On 04/26/2011 03:15 PM, Merlin Moncure wrote:
> On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
>> Asynchronous functions
>>
>> *Problem*
>> Postgresql does not have support for asynchronous function calls.
>
> Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.
>
> merlin

I am looking for specifically server support and not client support. Part of the proposal is that if the client goes away, it will still continue to finish.

Sim
Re: [HACKERS] Proposal - asynchronous functions
* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:
>> Well, this specific thing could be done by just having PG close the client connection, not care that it's gone, and have an implied 'commit;' at the end. I'm not saying that I like this approach, but I don't think it'd be hard to implement.
>
> Maybe, but that introduces a lot of complications with regards to things like authentication. We probably want some API for a backend to say - hey, please spawn a session with the same user ID and database association as me, and also provide some mechanism for data transfer between the two processes.

The impression I got from the OP is that this function call could be the last (and possibly only) thing done with this connection. I wasn't suggesting that we spawn a new backend to run it (that introduces all kinds of complexities). The approach I was suggesting was to just have the backend close its client connection and then process the function and then 'commit;' and exit.

Might be interesting as a way to prefix anything, ala:

LAST delete from big_table;

poof, client is disconnected, backend keeps running, etc.

I don't know if that would really be useful to very many people or that it's something we'd really want to do but it's an interesting idea to be able to 'background' a process.

I'm certainly all for the bigger projects of having a cron-like capability and/or being able to spawn off multiple backgrounded queries from a single connection.

Thanks,

Stephen
Re: [HACKERS] Proposal - asynchronous functions
On 04/26/2011 03:32 PM, Stephen Frost wrote:
> What I don't think we saw was any information about how, exactly, the OP was planning to implement this in the backend.
>
> Thanks,
>
> Stephen

I'm at stage 1 of this proposal, meaning I know exactly what I want. I am checking with the hackers list to see if this is a desirable feature before going to a postgres developer to talk about actually building the feature.
Re: [HACKERS] Proposal - asynchronous functions
On 04/26/2011 04:22 PM, Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:
>>> Well, this specific thing could be done by just having PG close the client connection, not care that it's gone, and have an implied 'commit;' at the end. I'm not saying that I like this approach, but I don't think it'd be hard to implement.
>>
>> Maybe, but that introduces a lot of complications with regards to things like authentication. We probably want some API for a backend to say - hey, please spawn a session with the same user ID and database association as me, and also provide some mechanism for data transfer between the two processes.
>
> The impression I got from the OP is that this function call could be the last (and possibly only) thing done with this connection. I wasn't suggesting that we spawn a new backend to run it (that introduces all kinds of complexities). The approach I was suggesting was to just have the backend close its client connection and then process the function and then 'commit;' and exit.

My thought was that it actually would require its own process. One use case is a function might be called from within another function, but it does not want to wait for a return. Then the original function would finish processing and return. The second function would be run with the security of the user who called the function, but would be managed as a separate connection without a client (or as a client on the server, to be more precise).

Sim
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
> On 04/26/2011 03:15 PM, Merlin Moncure wrote:
>> On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
>>> Asynchronous functions
>>>
>>> *Problem*
>>> Postgresql does not have support for asynchronous function calls.
>>
>> Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.
>>
>> merlin
>
> I am looking for specifically server support and not client support. Part of the proposal is that if the client goes away, it will still continue to finish.

This is exactly autonomous transactions. Please read this thread to see how.

http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

Cheers,
David.

--
David Fetter da...@fetter.org http://fetter.org/
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:
> On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
>> On 04/26/2011 03:15 PM, Merlin Moncure wrote:
>>> On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
>>>> Asynchronous functions
>>>>
>>>> *Problem*
>>>> Postgresql does not have support for asynchronous function calls.
>>>
>>> Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.
>>>
>>> merlin
>>
>> I am looking for specifically server support and not client support. Part of the proposal is that if the client goes away, it will still continue to finish.
>
> This is exactly autonomous transactions. Please read this thread to see how.
>
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

It's not the same thing at all. An autonomous function is (or appears to be) two simultaneous toplevel transactions within the same backend. This is a request for an *asynchronous* function, which would run concurrently with foreground processing.

--
Robert Haas
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 9:24 AM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:
>> On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
>>> On 04/26/2011 03:15 PM, Merlin Moncure wrote:
>>>> On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
>>>>> Asynchronous functions
>>>>>
>>>>> *Problem*
>>>>> Postgresql does not have support for asynchronous function calls.
>>>>
>>>> Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.
>>>>
>>>> merlin
>>>
>>> I am looking for specifically server support and not client support. Part of the proposal is that if the client goes away, it will still continue to finish.
>>
>> This is exactly autonomous transactions. Please read this thread to see how.
>>
>> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php
>
> It's not the same thing at all. An autonomous function is (or appears to be) two simultaneous toplevel transactions within the same backend. This is a request for an *asynchronous* function, which would run concurrently with foreground processing.

It's not exactly the same, but in the greater spirit of things I think David is correct. If you make an async dblink call, you get parallel processing from a single function entry point. Autonomous transaction implementations I've heard of are basically taking this approach and de-kludging it, and give you a lot of the same stuff, like being able to do work in parallel. I'm curious if the feature meets the OP's requirements.

merlin
Re: [HACKERS] Proposal - asynchronous functions
On 04/26/2011 06:32 PM, Merlin Moncure wrote:
> On Tue, Apr 26, 2011 at 9:24 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:
>>> On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
>>>> On 04/26/2011 03:15 PM, Merlin Moncure wrote:
>>>>> On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
>>>>>> Asynchronous functions
>>>>>>
>>>>>> *Problem*
>>>>>> Postgresql does not have support for asynchronous function calls.
>>>>>
>>>>> Well, there is asynchronous support from the client of course. Thus you can set up an asynchronous call back to the database with dblink. There is some discussion about formalizing this feature -- you might want to read up on autonomous transactions and how they might be used to do what you are proposing.
>>>>>
>>>>> merlin
>>>>
>>>> I am looking for specifically server support and not client support. Part of the proposal is that if the client goes away, it will still continue to finish.
>>>
>>> This is exactly autonomous transactions. Please read this thread to see how.
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php
>>
>> It's not the same thing at all. An autonomous function is (or appears to be) two simultaneous toplevel transactions within the same backend. This is a request for an *asynchronous* function, which would run concurrently with foreground processing.
>
> It's not exactly the same, but in the greater spirit of things I think David is correct. If you make an async dblink call, you get parallel processing from a single function entry point. Autonomous transaction implementations I've heard of are basically taking this approach and de-kludging it, and give you a lot of the same stuff, like being able to do work in parallel. I'm curious if the feature meets the OP's requirements.

We have tried a similar approach, using plpythonu, by calling import pg and then creating a new connection to the database. This does give you an autonomous transaction, but not an asynchronous function.

My use cases are mostly where the function takes longer than the user wants to wait and the result is not as important to the user as it is to the system.

One example is building a summary table (materialized view if you will). Let's say building the table takes 10 seconds and is run on a trigger for every update to a specific table. When the user updates the table he doesn't want to wait 10 seconds before the control returns.

Another example is a plpythonu function that FTPs a file. The file can take X amount of time to send and the user just needs to know that it has been sent. If there is a problem the user will not be informed about it directly. There are ways of having the function tell the system (either email or error table or marking a bool flag, etc) and by using this type of function the user declares that he understands that something might go wrong and he won't get a message about it. The user may also turn off his computer before the file is finished sending.

Sim
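
The fire-and-forget contract described above can be sketched in plain Python. This is only an illustration of the semantics, not of any PostgreSQL facility: `run_async`, `send_file` and `error_table` are made-up names, a thread stands in for the detached server-side session, and a list stands in for the error table.

```python
import threading

error_table = []                # stand-in for an error table checked later
error_lock = threading.Lock()


def run_async(func, *args):
    """Start func in the background and return immediately.

    The caller gets neither a result nor an exception; failures are only
    recorded in error_table for the system to look at afterwards.
    """
    def runner():
        try:
            func(*args)
        except Exception as exc:
            with error_lock:
                error_table.append((func.__name__, repr(exc)))

    worker = threading.Thread(target=runner)
    worker.start()
    return worker  # returned only so this demo can wait for it


def send_file(path):
    # Placeholder for a slow transfer such as the FTP upload in the example;
    # it always fails here so that the error path is exercised.
    raise OSError(f"could not send {path}")


handle = run_async(send_file, "report.csv")   # control returns at once
handle.join()                                 # demo only: wait, then inspect
print(error_table)
```

The caller never sees the `OSError`; it only shows up in the recorded errors, which is the trade-off the user accepts by choosing this kind of call.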
Re: [HACKERS] Proposal - asynchronous functions
On Tue, Apr 26, 2011 at 1:15 PM, Sim Zacks s...@compulab.co.il wrote:
> We have tried a similar approach, using plpythonu, by calling import pg and then creating a new connection to the database. This does give you an autonomous transaction, but not an asynchronous function.
>
> My use cases are mostly where the function takes longer than the user wants to wait and the result is not as important to the user as it is to the system.
>
> One example is building a summary table (materialized view if you will). Let's say building the table takes 10 seconds and is run on a trigger for every update to a specific table. When the user updates the table he doesn't want to wait 10 seconds before the control returns.
>
> Another example is a plpythonu function that FTPs a file. The file can take X amount of time to send and the user just needs to know that it has been sent. If there is a problem the user will not be informed about it directly. There are ways of having the function tell the system (either email or error table or marking a bool flag, etc) and by using this type of function the user declares that he understands that something might go wrong and he won't get a message about it. The user may also turn off his computer before the file is finished sending.

There's a pretty big foot gun there in that there's the potential for each connection coming in from a client to spawn a further connection that *doesn't* go away when the client does. There's a not-inconsiderable risk of having a ballooning set of post-processing connections lurking around.

That doesn't have to be problematic, within the context of a reasonable design.

For such cases, the thing that lurks afterwards shouldn't be the process that does the postprocessing for MY connection, but rather a singleton process (e.g. - it does something to ensure that There Can Only Be One) that does postprocessing of that kind of activity.

The asynchronous bit would consist of something like:
- queueing up My Connection's Object IDs for processing
- trying to start the singleton asynchronous process, failing gracefully (e.g. without terminating any of the client's work) if it cannot be started.

An extra use case for this leaps out at me immediately. It would be a plenty fine idea for a NOTIFY request to cause asynchronous invocation of a specified stored procedure. That would definitely spiff up the usefulness of NOTIFY/LISTEN, by adding a way of having a listener process already available on the server.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
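
The two steps above (queue the IDs, then try to start a singleton worker and carry on if one is already running) can be modelled in a few lines of Python. This is a sketch under loud assumptions: `enqueue`, `worker`, `pending` and `processed` are hypothetical names, threads stand in for server processes, and a non-blocking lock stands in for whatever mechanism would enforce "There Can Only Be One" on the server.

```python
import queue
import threading

pending = queue.Queue()          # object IDs queued by client connections
worker_lock = threading.Lock()   # held by whichever worker is the singleton
processed = []                   # stand-in for the real postprocessing output


def worker():
    """Drain the queue, then release the singleton lock."""
    while True:
        try:
            while True:
                object_id = pending.get_nowait()
                processed.append(object_id)
                pending.task_done()
        except queue.Empty:
            pass
        finally:
            worker_lock.release()
        # An ID may have been queued between the last get and the release;
        # take the lock back if so, unless another worker already has it.
        if pending.empty() or not worker_lock.acquire(blocking=False):
            return


def enqueue(object_id):
    """Queue an ID and try to start the singleton worker.

    Returns True if this call started the worker and False if one was
    already running; the caller's own work is unaffected either way.
    """
    pending.put(object_id)
    if not worker_lock.acquire(blocking=False):
        return False
    threading.Thread(target=worker).start()
    return True


started = [enqueue(oid) for oid in (101, 102, 103)]
pending.join()                   # demo only: wait until everything is handled
print(processed)
```

However many callers queue work, at most one worker runs at a time, which is what keeps the set of lingering post-processing connections from ballooning.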
Re: [HACKERS] Proposal - asynchronous functions
Robert,

On 04/26/2011 02:25 PM, Robert Haas wrote:
> We've talked about a number of features that could benefit from some kind of worker process facility (e.g. logical replication, parallel query). So far no one has stepped forward to build such a facility, and I think without that this can't even get off the ground.

Remember the bgworker patches extracted from Postgres-R?

[ Interestingly enough, one of the complaints I heard back then (not necessarily from you) was that there's no user for bgworkers, yet. Smells a lot like a chicken and egg problem to me. ]

Regards

Markus
Re: [HACKERS] Proposal - asynchronous functions
Markus Wanner mar...@bluegap.ch wrote:
> On 04/26/2011 02:25 PM, Robert Haas wrote:
>> We've talked about a number of features that could benefit from some kind of worker process facility (e.g. logical replication, parallel query). So far no one has stepped forward to build such a facility, and I think without that this can't even get off the ground.
>
> Remember the bgworker patches extracted from Postgres-R?

Yeah, that crossed my mind.

> [ Interestingly enough, one of the complaints I heard back then (not necessarily from you) was that there's no user for bgworkers, yet. Smells a lot like a chicken and egg problem to me. ]

My recollection is that people wanted two or three solid use cases so that what was implemented could be shown to be generalized. Perhaps this brings us to critical mass to re-introduce the idea.

-Kevin
Re: [HACKERS] Proposal - asynchronous functions
On Apr 26, 2011, at 3:32 PM, Markus Wanner mar...@bluegap.ch wrote:
> Remember the bgworker patches extracted from Postgres-R?

Oh, right. I should have remembered that.

> [ Interestingly enough, one of the complaints I heard back then (not necessarily from you) was that there's no user for bgworkers, yet. Smells a lot like a chicken and egg problem to me. ]

IIRC, we kind of got stuck on the prerequisite wamalloc patch, and that sunk the whole thing. :-(

...Robert