Re: [HACKERS] Proposal - asynchronous functions

2011-04-27 Thread Markus Wanner
On 04/26/2011 11:17 PM, Robert Haas wrote:
 IIRC, we kind of got stuck on the prerequisite wamalloc patch, and that sank 
 the whole thing.  :-(

Right, that prerequisite was the largest stumbling block.  As I
certainly mentioned back then, it should be possible to get rid of the
imessages dependency (and thus wamalloc).  So whoever really wants to
implement asynchronous functions (or autonomous transactions) is more
than welcome to try that.

Please keep in mind that you'd need an alternative communication path,
not only for the bgworker infrastructure itself, but also for communication
between the requesting backend and the bgworker.  (Fire-and-forget jobs
like autovacuum are an exception, of course.  OTOH even those could
benefit from communicating their state back to the coordinator.. eh..
autovacuum launcher.)

Regards

Markus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal - asynchronous functions

2011-04-27 Thread Sim Zacks
It sounds like there is interest in this feature. Can it be added to 
the TODO list?




Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Merlin Moncure
On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
 Asynchronous functions

 *Problem*
 Postgresql does not have support for asynchronous function calls.

Well, there is asynchronous support from the client of course.  Thus
you can set up an asynchronous call back to the database with dblink.
There is some discussion about formalizing this feature -- you might
want to read up on autonomous transactions and how they might be used
to do what you are proposing.

merlin



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Robert Haas
On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
 Add an Async command for functions ( ASYNC my_func(var1,var2) ) and add an
 async optional keyword in trigger statements ( CREATE TRIGGER ... EXECUTE
 ASYNC trig_func() ). This should cause an internal session to be started
 that the function or trigger function will run in, disconnected from the
 session it started in.

We've talked about a number of features that could benefit from some
kind of worker process facility (e.g. logical replication, parallel
query).  So far no one has stepped forward to build such a facility,
and I think without that this can't even get off the ground.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 We've talked about a number of features that could benefit from some
 kind of worker process facility (e.g. logical replication, parallel
 query).  So far no one has stepped forward to build such a facility,
 and I think without that this can't even get off the ground.

Well, this specific thing could be done by just having PG close the
client connection, not care that it's gone, and have an implied 
'commit;' at the end.  I'm not saying that I like this approach, but I
don't think it'd be hard to implement.

What I don't think we saw was any information about how, exactly, the OP
was planning to implement this in the backend.

Thanks,

Stephen




Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Robert Haas
On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 We've talked about a number of features that could benefit from some
 kind of worker process facility (e.g. logical replication, parallel
 query).  So far no one has stepped forward to build such a facility,
 and I think without that this can't even get off the ground.

 Well, this specific thing could be done by just having PG close the
 client connection, not care that it's gone, and have an implied
 'commit;' at the end.  I'm not saying that I like this approach, but I
 don't think it'd be hard to implement.

Maybe, but that introduces a lot of complications with regard to
things like authentication.  We probably want some API for a backend
to say - hey, please spawn a session with the same user ID and
database association as me, and also provide some mechanism for data
transfer between the two processes.
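
The kind of API described above (spawn a session that inherits the caller's user ID and database, plus a channel for moving data between the two) might look roughly like the following. This is only a sketch under stated assumptions: `SessionIdentity`, `WorkerSession` and `spawn_like_me` are hypothetical names, and a Python thread with two queues stands in for a real backend process and its communication path.

```python
from dataclasses import dataclass
import queue
import threading


@dataclass(frozen=True)
class SessionIdentity:
    """Hypothetical description of who a session runs as, and where."""
    user: str
    database: str


class WorkerSession:
    """Stand-in for a spawned session: one thread plus two queues."""

    def __init__(self, identity, task):
        self.identity = identity     # inherited from the parent, not re-authenticated
        self.inbox = queue.Queue()   # parent -> worker
        self.outbox = queue.Queue()  # worker -> parent
        self._thread = threading.Thread(target=self._run, args=(task,))
        self._thread.start()

    def _run(self, task):
        payload = self.inbox.get()   # wait for input from the parent
        self.outbox.put(task(self.identity, payload))


def spawn_like_me(parent_identity, task):
    """Start a worker that carries the parent's user and database."""
    return WorkerSession(parent_identity, task)
```

The point of the design is that the worker never authenticates on its own: it simply carries the identity handed over by the session that spawned it, which is how the authentication complications would be sidestepped.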

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Sim Zacks

On 04/26/2011 03:15 PM, Merlin Moncure wrote:


On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:

Asynchronous functions

*Problem*
Postgresql does not have support for asynchronous function calls.

Well, there is asynchronous support from the client of course.  Thus
you can set up an asynchronous call back to the database with dblink.
There is some discussion about formalizing this feature -- you might
want to read up on autonomous transactions and how they might be used
to do what you are proposing.

merlin

I am specifically looking for server-side support, not client support. 
Part of the proposal is that if the client goes away, the function will 
still run to completion.


Sim



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:
  Well, this specific thing could be done by just having PG close the
  client connection, not care that it's gone, and have an implied
  'commit;' at the end.  I'm not saying that I like this approach, but I
  don't think it'd be hard to implement.
 
 Maybe, but that introduces a lot of complications with regards to
 things like authentication.  We probably want some API for a backend
 to say - hey, please spawn a session with the same user ID and
 database association as me, and also provide some mechanism for data
 transfer between the two processes.

The impression I got from the OP is that this function call could be the
last (and possibly only) thing done with this connection.  I wasn't
suggesting that we spawn a new backend to run it (that introduces all
kinds of complexities).  The approach I was suggesting was to just have
the backend close its client connection and then process the function
and then 'commit;' and exit.

Might be interesting as a way to prefix anything, ala:

LAST delete from big_table;

poof, client is disconnected, backend keeps running, etc.
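
A toy model of that flow, as a sketch only: the `LAST` prefix is just the idea floated above, not existing syntax, and here a socket pair and a thread stand in for the client connection and the backend. `backend`, `pretend_execute` and `run_last` are made-up names.

```python
import socket
import threading


def pretend_execute(statement):
    """Hypothetical stand-in for actually running the statement."""
    return f"executed: {statement}"


def backend(conn, committed):
    """Acknowledge the request, drop the client, then finish the work."""
    statement = conn.recv(1024).decode()
    conn.sendall(b"detached")
    conn.close()                                  # poof, client is disconnected
    result = pretend_execute(statement)           # backend keeps running
    committed.append(result)                      # stand-in for the implied commit


def run_last(statement):
    client, server = socket.socketpair()
    committed = []
    worker = threading.Thread(target=backend, args=(server, committed))
    worker.start()
    client.sendall(statement.encode())
    reply = client.recv(1024).decode()
    client.close()                                # the client is free to go away
    worker.join()                                 # only so the demo can inspect the result
    return reply, committed
```

Note that the client learns only that the work was accepted, not whether it succeeded.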

I don't know if that would really be useful to very many people or that
it's something we'd really want to do but it's an interesting idea to be
able to 'background' a process.

I'm certainly all for the bigger projects of having a cron-like
capability and/or being able to spawn off multiple backgrounded queries
from a single connection.

Thanks,

Stephen




Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Sim Zacks

On 04/26/2011 03:32 PM, Stephen Frost wrote:


What I don't think we saw was any information about how, exactly, the OP
was planning to implement this in the backend.

Thanks,

Stephen

I'm at stage 1 of this proposal, meaning I know exactly what I want. I 
am checking with the hackers list to see whether this is a desirable 
feature before going to a Postgres developer to talk about actually 
building it.




Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Sim Zacks

On 04/26/2011 04:22 PM, Stephen Frost wrote:


* Robert Haas (robertmh...@gmail.com) wrote:

On Tue, Apr 26, 2011 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote:

Well, this specific thing could be done by just having PG close the
client connection, not care that it's gone, and have an implied
'commit;' at the end.  I'm not saying that I like this approach, but I
don't think it'd be hard to implement.

Maybe, but that introduces a lot of complications with regards to
things like authentication.  We probably want some API for a backend
to say - hey, please spawn a session with the same user ID and
database association as me, and also provide some mechanism for data
transfer between the two processes.

The impression I got from the OP is that this function call could be the
last (and possibly only) thing done with this connection.  I wasn't
suggesting that we spawn a new backend to run it (that introduces all
kinds of complexities).  The approach I was suggesting was to just have
the backend close its client connection and then process the function
and then 'commit;' and exit.

My thought was that it actually would require its own process. One use 
case is a function that is called from within another function, where 
the caller does not want to wait for it to return. The original function 
would then finish processing and return. The second function would run 
with the privileges of the user who called it, but would be managed as 
a separate connection without a client (or, to be more precise, as a 
client on the server).


Sim



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread David Fetter
On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
 On 04/26/2011 03:15 PM, Merlin Moncure wrote:
 
 On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
 Asynchronous functions
 
 *Problem*
 Postgresql does not have support for asynchronous function calls.
 Well, there is asynchronous support from the client of course.  Thus
 you can set up an asynchronous call back to the database with dblink.
 There is some discussion about formalizing this feature -- you might
 want to read up on autonomous transactions and how they might be used
 to do what you are proposing.
 
 merlin
 I am looking for specifically server support and not client support.
 Part of the proposal is that if the client goes away, it will still
 continue to finish.

This is exactly autonomous transactions.  Please read this thread to
see how.

http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Robert Haas
On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:
 On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
 On 04/26/2011 03:15 PM, Merlin Moncure wrote:

 On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
 Asynchronous functions
 
 *Problem*
 Postgresql does not have support for asynchronous function calls.
 Well, there is asynchronous support from the client of course.  Thus
 you can set up an asynchronous call back to the database with dblink.
 There is some discussion about formalizing this feature -- you might
 want to read up on autonomous transactions and how they might be used
 to do what you are proposing.
 
 merlin
 I am looking for specifically server support and not client support.
 Part of the proposal is that if the client goes away, it will still
 continue to finish.

 This is exactly autonomous transactions.  Please read this thread to
 see how.

 http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

It's not the same thing at all.  An autonomous function is (or appears
to be) two simultaneous toplevel transactions within the same backend.
This is a request for an *asynchronous* function, which would run
concurrently with foreground processing.
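
The difference can be illustrated with a small sketch. It uses SQLite from Python's standard library purely as a stand-in for PostgreSQL, and `log_autonomously`, `audit_log` and `orders` are made-up names. A second connection commits independently of the caller's transaction (the autonomous part), yet the caller still blocks until that call returns, so nothing runs concurrently.

```python
import os
import sqlite3
import tempfile


def log_autonomously(db_path, message):
    """Write and commit a row on a separate connection.

    The caller waits for this to return, so it is not asynchronous, but the
    commit here does not depend on what the caller later does.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("INSERT INTO audit_log (message) VALUES (?)", (message,))
        conn.commit()
    finally:
        conn.close()


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")

        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE audit_log (message TEXT)")
        setup.execute("CREATE TABLE orders (item TEXT)")
        setup.commit()
        setup.close()

        outer = sqlite3.connect(db_path)
        # SQLite allows one writer at a time, so in this sketch the
        # independent commit happens before the outer connection writes.
        log_autonomously(db_path, "order attempted")
        outer.execute("INSERT INTO orders (item) VALUES ('widget')")
        outer.rollback()  # the outer work is abandoned

        audit_rows = outer.execute("SELECT count(*) FROM audit_log").fetchone()[0]
        order_rows = outer.execute("SELECT count(*) FROM orders").fetchone()[0]
        outer.close()
        return audit_rows, order_rows
```

The audit row survives the rollback of the outer work, which is the autonomous-transaction property. What the request in this thread adds on top is that the caller would not wait for the second piece of work at all.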

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Merlin Moncure
On Tue, Apr 26, 2011 at 9:24 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:
 On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:
 On 04/26/2011 03:15 PM, Merlin Moncure wrote:

 On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:
 Asynchronous functions
 
 *Problem*
 Postgresql does not have support for asynchronous function calls.
 Well, there is asynchronous support from the client of course.  Thus
 you can set up an asynchronous call back to the database with dblink.
 There is some discussion about formalizing this feature -- you might
 want to read up on autonomous transactions and how they might be used
 to do what you are proposing.
 
 merlin
 I am looking for specifically server support and not client support.
 Part of the proposal is that if the client goes away, it will still
 continue to finish.

 This is exactly autonomous transactions.  Please read this thread to
 see how.

 http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

 It's not the same thing at all.  An autonomous function is (or appears
 to be) two simultaneous toplevel transactions within the same backend.
  This is a request for an *asynchronous* function, which would run
 concurrently with foreground processing.

It's not exactly the same, but in the greater spirit of things I think
David is correct.  If you make an async dblink call, you get parallel
processing from a single function entry point.  The autonomous
transaction implementations I've heard of basically take this
approach and de-kludge it, and give you a lot of the same stuff,
like being able to do work in parallel.  I'm curious whether the feature
meets the OP's requirements.
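
The shape of that pattern (send the query, keep working, collect the result later) is what dblink exposes through `dblink_send_query` and `dblink_get_result`. The sketch below only mimics the shape in Python, with a thread pool standing in for the second connection; `slow_query` and `entry_point` are made-up names and no database is involved.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_query(n):
    """Hypothetical stand-in for a query running on another connection."""
    time.sleep(0.05)
    return sum(range(n))


def entry_point():
    """Single entry point that overlaps local work with the remote call."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_query, 1000)      # like sending the query
        local = sum(i * i for i in range(10))        # caller keeps working
        remote = pending.result()                    # like fetching the result
    return local, remote
```

The caller still has to stay around to collect the result, which is where this differs from work that must survive the client going away.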

merlin



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Sim Zacks

On 04/26/2011 06:32 PM, Merlin Moncure wrote:


On Tue, Apr 26, 2011 at 9:24 AM, Robert Haas robertmh...@gmail.com wrote:

On Tue, Apr 26, 2011 at 10:02 AM, David Fetter da...@fetter.org wrote:

On Tue, Apr 26, 2011 at 04:17:48PM +0300, Sim Zacks wrote:

On 04/26/2011 03:15 PM, Merlin Moncure wrote:


On Tue, Apr 26, 2011 at 3:28 AM, Sim Zacks s...@compulab.co.il wrote:

Asynchronous functions

*Problem*
Postgresql does not have support for asynchronous function calls.

Well, there is asynchronous support from the client of course.  Thus
you can set up an asynchronous call back to the database with dblink.
There is some discussion about formalizing this feature -- you might
want to read up on autonomous transactions and how they might be used
to do what you are proposing.

merlin

I am looking for specifically server support and not client support.
Part of the proposal is that if the client goes away, it will still
continue to finish.

This is exactly autonomous transactions.  Please read this thread to
see how.

http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

It's not the same thing at all.  An autonomous function is (or appears
to be) two simultaneous toplevel transactions within the same backend.
  This is a request for an *asynchronous* function, which would run
concurrently with foreground processing.

It's not exactly the same, but in the greater spirit of things I think
David is correct.  If you make async dblink call, you get parallel
processing from a single function entry point.   Autonomous
transaction implementations I've heard are basically taking this
approach and de-kludging it, and give you a lot of the same stuff,
like being able to do work in parallel.  I'm curious if the feature
meets the OP's requirements.

We have tried a similar approach, using plpythonu, by calling import pg 
and then creating a new connection to the database. This does give you 
an autonomous transaction, but not an asynchronous function.

My use cases are mostly ones where the function takes longer than the 
user wants to wait and the result is not as important to the user as it 
is to the system.

One example is building a summary table (a materialized view, if you 
will). Let's say building the table takes 10 seconds and is run by a 
trigger on every update to a specific table. When the user updates the 
table, he doesn't want to wait 10 seconds before control returns.

Another example is a plpythonu function that FTPs a file. The file can 
take X amount of time to send, and the user just needs to know that it 
has been sent. If there is a problem, the user will not be informed 
about it directly. There are ways of having the function tell the system 
(email, an error table, a boolean flag, etc.), and by using this type of 
function the user declares that he understands that something might go 
wrong and he won't get a message about it. The user may also turn off 
his computer before the file has finished sending.
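
The fire-and-forget contract described above can be sketched as follows. This is an illustration under assumptions, not a proposed implementation: `run_detached`, `error_table` and `send_file` are made-up names, and a thread inside the same Python process stands in for a server-side session, so unlike the real proposal it would not outlive its parent process.

```python
import threading

error_table = []                 # stand-in for an error table in the database
_error_lock = threading.Lock()


def run_detached(func, *args):
    """Start func in the background and return without waiting for it.

    Failures are never raised to the caller. They are only recorded, which
    is the trade-off the caller accepts by using this kind of call.
    """
    def runner():
        try:
            func(*args)
        except Exception as exc:
            with _error_lock:
                error_table.append((func.__name__, repr(exc)))

    worker = threading.Thread(target=runner)
    worker.start()
    return worker                # returned only so a demo can wait for it


def send_file(path):
    """Hypothetical long-running transfer that fails."""
    raise ConnectionError(f"could not send {path}")
```

Calling `run_detached(send_file, "report.csv")` returns immediately. The failure shows up only as a row in `error_table`, never as an error in the calling session.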


Sim



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Christopher Browne
On Tue, Apr 26, 2011 at 1:15 PM, Sim Zacks s...@compulab.co.il wrote:
 We have tried a similar approach, using plpythonu, by calling import pg and
 then creating a new connection to the database. This does give you an
 autonomous transaction, but not an asynchronous function.
 My use cases are mostly where the function takes longer than the user wants
 to wait and the result is not as important to the user as it is to the
 system.
 One example is building a summary table (materialized view if you will).
 Let's say building the table takes 10 seconds and is run on a trigger for
 every update to a specific table. When the user updates the table he doesn't
 want to wait 10 seconds before the control returns.
 Another example is a plpythonu function that FTPs a file. The file can take
 X amount of time to send and the user just needs to know that it has been
 sent. If there is a problem the user will not be informed about it directly.
 There are ways of having the function tell the system (either email or error
 table or marking a bool flag, etc) and by using this type of function the
 user declares that he understands that something might go wrong and he won't
 get a message about it. The user may also turn off his computer before the
 file is finished sending.

There's a pretty big foot gun there in that there's the potential
for each connection coming in from a client to spawn a further
connection that *doesn't* go away when the client does.  There's a
not-inconsiderable risk of having a ballooning set of
post-processing connections lurking around.

That doesn't have to be problematic, within the context of a
reasonable design.  For such cases, the thing that lurks afterwards
shouldn't be the process that does the postprocessing for MY
connection, but rather a singleton process (e.g. - it does something
to ensure that There Can Only Be One) that does postprocessing of that
kind of activity.

The asynchronous bit would consist of something like:
- queueing up My Connection's Object IDs for processing
- trying to start the singleton asynchronous process, and failing
gracefully (e.g. - without terminating any of the client's work) if
it cannot be started.
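
The two steps above can be sketched as follows. This is a minimal illustration, assuming an in-process queue and a thread in place of a real server-side worker; `SingletonPostProcessor` and its methods are made-up names.

```python
import queue
import threading


class SingletonPostProcessor:
    """Queue of object IDs drained by at most one worker thread at a time."""

    def __init__(self, handler):
        self._handler = handler          # hypothetical per-ID post-processing
        self._pending = queue.Queue()
        self._lock = threading.Lock()
        self._running = False
        self.errors = []                 # stand-in for an error table

    def submit(self, object_id):
        """Queue an ID, then try to start the worker without ever raising."""
        self._pending.put(object_id)
        with self._lock:
            if self._running:            # There Can Only Be One
                return
            self._running = True
        try:
            threading.Thread(target=self._drain).start()
        except RuntimeError:
            # Fail gracefully: the ID stays queued for a later attempt.
            with self._lock:
                self._running = False

    def _drain(self):
        while True:
            with self._lock:
                try:
                    object_id = self._pending.get_nowait()
                except queue.Empty:
                    self._running = False  # exit is decided under the lock
                    return
            try:
                self._handler(object_id)
            except Exception as exc:
                self.errors.append((object_id, repr(exc)))
            finally:
                self._pending.task_done()

    def wait(self):
        """Block until every queued ID has been handled (for demos only)."""
        self._pending.join()
```

Because the worker decides to exit while holding the same lock that `submit` takes before checking `_running`, an ID queued at the last moment is either picked up by the current worker or triggers a new one. However many clients submit work, there is never more than one post-processing worker lurking around.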

An extra use case for this leaps out at me immediately.

It would be a plenty fine idea for a NOTIFY request to cause
asynchronous invocation of a specified stored procedure.  That would
definitely spiff up the usefulness of NOTIFY/LISTEN, by adding a way
of having a listener process already available on the server.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Markus Wanner
Robert,

On 04/26/2011 02:25 PM, Robert Haas wrote:
 We've talked about a number of features that could benefit from some
 kind of worker process facility (e.g. logical replication, parallel
 query).  So far no one has stepped forward to build such a facility,
 and I think without that this can't even get off the ground.

Remember the bgworker patches extracted from Postgres-R?

[ Interestingly enough, one of the complaints I heard back then (not
necessarily from you) was that there's no user for bgworkers, yet.
Smells a lot like a chicken and egg problem to me. ]

Regards

Markus



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Kevin Grittner
Markus Wanner mar...@bluegap.ch wrote:
 On 04/26/2011 02:25 PM, Robert Haas wrote:
 We've talked about a number of features that could benefit from
 some kind of worker process facility (e.g. logical replication,
 parallel query).  So far no one has stepped forward to build such
 a facility, and I think without that this can't even get off the
 ground.
 
 Remember the bgworker patches extracted from Postgres-R?
 
Yeah, that crossed my mind.
 
 [ Interestingly enough, one of the complaints I heard back then
 (not necessarily from you) was that there's no user for bgworkers,
 yet. Smells a lot like a chicken and egg problem to me. ]
 
My recollection is that people wanted two or three solid use cases
so that what was implemented could be shown to be generalized. 
Perhaps this brings us to critical mass to re-introduce the idea.
 
-Kevin



Re: [HACKERS] Proposal - asynchronous functions

2011-04-26 Thread Robert Haas
On Apr 26, 2011, at 3:32 PM, Markus Wanner mar...@bluegap.ch wrote:
 Remember the bgworker patches extracted from Postgres-R?

Oh, right.  I should have remembered that.

 [ Interestingly enough, one of the complaints I heard back then (not
 necessarily from you) was that there's no user for bgworkers, yet.
 Smells a lot like a chicken and egg problem to me. ]

IIRC, we kind of got stuck on the prerequisite wamalloc patch, and that sank 
the whole thing.  :-(

...Robert