I've implemented asynchronous operations. I'm not ready to submit a
patch yet because I've been working against 4.1.1 rather than the head
revision, and Python dumps core after finishing the unit test, so there's
a bad pointer or something somewhere that I need to find.
In the meantime, I thought I'd share the syntax I'm proposing.
There are three aspects to this: asynchronous connection, asynchronous
queries, and non-blocking copy and large object operations.
In libpq, asynchronous connections work by calling PQconnectStart(),
then calling PQconnectPoll() repeatedly until it returns either success
or failure. You can call select() on the connection socket to find out
whether it's safe to call PQconnectPoll().
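To make that flow concrete, here's a rough sketch of how it might be
driven from Python using the interface proposed below; the nowait
argument, connectpoll(), the POLLING_* constants, and waiting on the
connection with select() via fileno() are all part of the proposal (or
assumptions about its naming), not the current PyGreSQL API:

import select
import pg

# nowait=True would call PQconnectStart() instead of connecting fully
cnx = pg.connect(dbname='mydb', nowait=True)   # made-up database name

# names assumed for the constants exported from the libpq headers
state = pg.POLLING_WRITING
while state not in (pg.POLLING_OK, pg.POLLING_FAILED):
    # wait until the socket is ready (write-ready to start with, then
    # whichever direction connectpoll() asks for); this assumes the
    # connection object exposes fileno()
    if state == pg.POLLING_READING:
        select.select([cnx], [], [])
    else:
        select.select([], [cnx], [])
    state = cnx.connectpoll()                  # wraps PQconnectPoll()

if state == pg.POLLING_FAILED:
    raise RuntimeError("connection failed")

Several connections could be started this way and polled from a single
select() loop, which is where the parallel-connection case mentioned
below comes in.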
Asynchronous queries work by calling PQsendQuery() and then retrieving
results with PQgetResult() until it returns NULL.
One thing about both of these cases is that the programmer really needs
to keep calling the follow-up function until all the results have been
retrieved; otherwise the connection is left unusable.
For copy and large objects, there are PQsetnonblocking() and
PQisnonblocking() calls; setting non-blocking mode makes the existing
calls return an error if there's nothing to do.
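A minimal sketch of how that might look through the wrapper methods
proposed below, assuming they simply mirror the libpq calls (I haven't
exercised this path yet):

import pg

cnx = pg.connect(dbname='mydb')   # made-up database name

# proposed wrappers for PQsetnonblocking()/PQisnonblocking()
cnx.setnonblocking(True)
if cnx.isnonblocking():
    # existing copy calls such as putline() would now return an error
    # if there's nothing to do
    pass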
My proposed changes are:
- add a new argument "nowait" to pgconnect(). If it's True, the
function calls PQconnectStart() and the programmer has to call
cnx.connectpoll() to complete the connection. The advantage of this is
that you can make multiple connections which complete in parallel, or
you can multiplex the database connection with other socket operations
- add a new method connectpoll() to the connection object and some
associated manifest constants from the libpq header files
- add a new function sendquery() which behaves like query() but runs
the query in the background (a short sketch follows this list). It
always either raises an exception or returns a query object, and
otherwise behaves much like query() except:
-- it returns sooner
-- listfields() doesn't work until after one of the getresult()
functions has been called
-- getresult() et al only return data once
-- You must call one of the getresult functions, and call it until it
returns None. In practice, I think it will always return all the rows
in the first call and None in the second call, but I'm not sure I would
trust that
-- if you have a query call with multiple statements, the results of
all the statements will be returned
-- I believe you can call select() on the connection socket to
determine whether it's safe to call getresult()
- change query.getresult(), dictresult(), and namedresult() to work
with asynchronous queries. These will have to handle different kinds of
output, e.g. for DML statements, where dictresult() might return '1'
after an update statement that updated one row
- add methods setnonblocking() and isnonblocking() to the connection
object. I haven't played with these at all so I'm not sure how well they
work
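To illustrate the sendquery() contract described above, here's a
minimal sketch. It assumes sendquery() lands on the connection object,
as in the example further down; the database name, the table, and the
statements are made up, and the shape of the results (particularly for
the DML statement) is still as loosely specified as the bullets above
suggest:

import pg

cnx = pg.connect(dbname='mydb')                # made-up database name
q = cnx.sendquery("update a_table set flag = 't' where id = 1;"
                  " select id, flag from a_table")   # two statements

# listfields() only works once getresult() has been called, and the
# connection can't be used for anything else until getresult() has
# returned None
results = []
chunk = q.getresult()
while chunk is not None:
    # the results of both statements come back here; I wouldn't yet
    # rely on how they are split across calls
    results.append(chunk)
    chunk = q.getresult()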
One thing I chose not to do was implement PQsetSingleRowMode(), which
causes PQgetResult() to return one row at a time.
This is probably useful for people writing totally event-driven
applications, but for me, the use case is running some related queries
in parallel:
db1 = pg.DB()
db2 = pg.DB()
db1.begin()
db2.begin()
db1.query("set transaction read only")
# run the second transaction against the same snapshot as the first
db2.query("set transaction read only, isolation level repeatable read;"
          " set transaction snapshot '%s'" % (
              db1.query("select pg_export_snapshot()").getresult()[0][0],))
q1 = db1.sendquery("select expensive,stuff from a_big_table"
                   " where the_conditions_are_not_adequate")
q2 = db2.sendquery("select expensive,stuff from another_big_table"
                   " where the_conditions_are_also_not_adequate")
qr1 = []
qr2 = []
# drain each query until getresult() returns None
qr = q1.getresult()
while qr is not None:
    qr1.extend(qr)
    qr = q1.getresult()
qr = q2.getresult()
while qr is not None:
    qr2.extend(qr)
    qr = q2.getresult()
db1.commit()
db2.commit()
That has the same effect as qr1 = db.query(query1).getresult(); qr2 =
db.query(query2).getresult(), except that the total running time is the
time of the longest query plus the time needed to pull both sets of
results across the wire.
As I say, I have this implemented modulo what's probably a memory
management bug, but I'm interested in comments on the interface and also
suggestions on how this might be applied to dbapi.
--
Patrick TJ McPhee <[email protected]>