Re: [HACKERS] [patch] libpq one-row-at-a-time API
Hey, this thread was pointed out to me just a few days ago, but I'll start by saying that I think this thread is on exactly the right track. I don't like the callback API, and think that PQsetSingleRowMode should be offered in place of it. But I do have one question.

On Sat, Jun 16, 2012 at 10:22 AM, Marko Kreen <mark...@gmail.com> wrote:
> The function can be called only after PQsend* and before any rows have arrived. This guarantees there will be no surprises to PQexec* users who expect the full resultset at once.

OK, I'm guessing you mean before you call PQgetResult or PQgetRowData, or maybe before you call PQgetResult or PQgetRowData and it returns a result or partial result, because it would be a race condition if you meant exactly what you said. (Though I don't understand how this could possibly be implemented without some source of concurrency, which libpq doesn't do.) Maybe this is a little overly pedantic, but I do want to confirm the intention here.

One other possibility: Tom Lane fretted ever so slightly about the use of malloc/free per row. What about, instead of PQsetSingleRowMode, a PQsetChunkedRowMode that takes a chunkSize parameter? A chunkSize of 0 would be equivalent to what we have today, a chunkSize of 1 means you get what you have from PQsetSingleRowMode, and larger chunkSizes would wait until n rows have been received before returning them all in a single result. I don't know that this suggestion is all that important, but it seems like an obvious generalization that might possibly be useful.

Best,
Leon
Re: [HACKERS] [patch] libpq one-row-at-a-time API
On Mon, Jul 30, 2012 at 9:59 PM, Jan Wieck <janwi...@yahoo.com> wrote:
> On 7/30/2012 8:11 PM, Leon Smith wrote:
>> One other possibility: Tom Lane fretted ever so slightly about the use of malloc/free per row. What about, instead of PQsetSingleRowMode, a PQsetChunkedRowMode that takes a chunkSize parameter? [...]
>
> It is questionable if that actually adds any useful functionality.

This is true; I'm not sure my suggestion is necessarily useful. I'm just throwing it out there.

> Any collecting of multiple rows will only run the risk to stall receiving the following rows while processing this batch. Processing each row as soon as it is available will ensure making most use of network buffers.

This is not necessarily true, on multiple levels. I mean, some of the programs I write are highly concurrent, and this form of batching would have almost no risk of stalling the network buffer. And the possible use case would be when you are dealing with very small rows, when there would typically be several rows inside a single network packet or network buffer.

> Collecting multiple rows, like the FETCH command for cursors does, makes sense when each batch introduces a network round trip, like for the FETCH command. But does it make any sense for a true streaming mode, like what is discussed here?

Maybe? I mean, I anticipate that there are (probably) still use cases for FETCH, even when the row-at-a-time interface is a viable option and the transport between postgres and the client has reasonable flow control.

Leon
[HACKERS] Transactions over pathological TCP connections
Out of (mostly idle) curiosity, when exactly does a transaction commit, especially with respect to a TCP connection that a pathological demon will cut off at the worst possible moment?

The thing is, I'm using PostgreSQL as a queue, using asynchronous notifications and following the advice of Marko Tiikkaja in this post: http://johtopg.blogspot.com/2010/12/queues-in-sql.html

I'm using a stored procedure to reduce the round trips between the database and the client, and then running it in a bare transaction, that is, as SELECT dequeue_element(); with an implicit BEGIN/COMMIT, to mark a row in the queue as taken and return it.

My question is: would it be theoretically possible for an element of the queue to become marked but not delivered, or delivered but not marked, if the TCP connection between the backend and the client was interrupted at the worst possible moment? Will the backend wait for the delivery of the row to be acknowledged before the transaction is committed? Or should the truly paranoid use an explicit transaction block, and not consider the row taken until confirmation that the transaction has committed has been received?

Best,
Leon
Re: [HACKERS] Transactions over pathological TCP connections
On Tue, Jun 19, 2012 at 11:59 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Jun 19, 2012 at 1:56 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> The transaction would be committed before a command success report is delivered to the client, so I don't think delivered-and-not-marked is possible.
>
> ...unless you have configured synchronous_commit=off, or fsync=off. Or unless your disk melts into a heap of slag and you have to restore from backup. You can protect against that last case using synchronous replication.

But hard disk failure isn't in the failure model I was concerned about. =) To be perfectly honest, I'm not too concerned with either hard drive failure or network failure, as we are deploying on a RAID 1+0 database server talking to the client over a redundant LAN, and using asynchronous (Slony) replication to an identical database server just in case. No single point of failure is a key philosophy of this app from top to bottom.

Like I said, this is mostly idle curiosity. But I'm also accustomed to trying to get work done on shockingly unreliable internet connections. As a result, network failure is something I think about quite a lot when writing networked applications. So this is not entirely idle curiosity, either.

And thinking about this a bit more, it's clear that the database has to commit before the result is sent, on the off chance that the transaction fails and needs to be retried. And an explicit transaction block isn't really a solution either, because a BEGIN; SELECT dequeue_row(); would get the row to the client without marking it as taken, but the pathological TCP disconnect could then attack the following COMMIT;, leaving the client to think that the row has not actually been taken when in fact it has.

It's not clear to me that this is even a solvable problem without modifying the schema to include both a "taken" and a "finished processing" state, and then letting elements be re-delivered after a period of time. But this would then allow a pathological demon with the power to cause TCP disconnects to have a single element delivered and processed multiple times.

In any case, thanks for the responses...

Best,
Leon
Re: [HACKERS] Transactions over pathological TCP connections
I just realized this is essentially an instance of the Two Generals' Problem, which is something I feel should have been more obvious to me.

On Tue, Jun 19, 2012 at 5:50 PM, Leon Smith <leon.p.sm...@gmail.com> wrote:
> [...]
> It's not clear to me that this is even a solvable problem without modifying the schema to include both a "taken" and a "finished processing" state, and then letting elements be re-delivered after a period of time. But this would then allow a pathological demon with the power to cause TCP disconnects to have a single element delivered and processed multiple times.
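(To make the trade-off concrete, here is a toy in-memory sketch of the taken/finished scheme with lease-based re-delivery. Nothing here is PostgreSQL code; the state names, the `LEASE` constant, and the `take`/`finish` helpers are all made up for illustration. It shows how the same element can be delivered twice when a confirmation is lost, i.e. at-least-once delivery.)

```c
#include <stdbool.h>

/* Toy model of the two-state queue scheme: an element is first marked
 * "taken" with a timestamp; if the consumer never confirms "finished"
 * (e.g. the TCP connection died), the element becomes eligible for
 * re-delivery once its lease expires. */

enum state { READY, TAKEN, FINISHED };

struct element {
    enum state st;
    long taken_at;       /* when it was last handed out */
};

#define LEASE 30         /* seconds before a taken element is re-offered */

/* Try to take the element at time `now`; returns true if delivered. */
static bool take(struct element *e, long now)
{
    if (e->st == READY || (e->st == TAKEN && now - e->taken_at >= LEASE)) {
        e->st = TAKEN;
        e->taken_at = now;
        return true;
    }
    return false;
}

/* Consumer confirms it finished processing the element. */
static void finish(struct element *e)
{
    e->st = FINISHED;
}
```

With this model: the element is delivered at t=0, the confirmation is lost, and at t=40 (past the lease) a second delivery occurs, which is exactly the duplicate-processing risk described above.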
[HACKERS] Exporting closePGconn from libpq
A minor issue has come up in creating low-level bindings to libpq for safe garbage-collected languages, namely that PQfinish is the only way (AFAICT) to close a connection, but it also de-allocates the memory used to represent the database connection. It would be preferable to call PQfinish to free the memory in a finalizer, but applications need a way to disconnect from the database at a predictable and deterministic point in time, whereas leaving a bit of memory around until the GC finally gets to it is relatively harmless.

The low-level binding has a couple of options:

1. Ignore the issue and allow for the possibility of a segfault if the library is used incorrectly, which is not a good situation for safe languages.

2. Create a wrapper that tracks whether or not PQfinish has ever been called, so that attempts to use a connection afterwards can be turned into native exceptions or other forms of error signaling. This kind of solution can introduce its own minor issues.

3. Hack libpq to export closePGconn, so that libpq can safely signal the low-level bindings of the error when a connection is used after it is disconnected, reserving PQfinish to run in a GC-triggered finalizer. While this is technically the preferable solution, without getting the change into upstream sources it is also a deployment nightmare.

Is there any particular reason why closePGconn should not be exported from libpq?

Best,
Leon

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
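(Option 2 can be sketched roughly as follows. This is illustrative only: `fake_close` and `void *raw` stand in for PQfinish and PGconn* so the sketch stays self-contained; a real binding would hold a PGconn* and call PQfinish.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Wrapper that tracks whether the underlying connection has already
 * been finished, so later uses become an error instead of a
 * use-after-free.  `raw` stands in for a PGconn*, and fake_close
 * stands in for PQfinish. */

struct conn_wrapper {
    void *raw;        /* would be a PGconn* in a real binding */
    bool  finished;   /* has PQfinish already been called? */
};

static int closed_count = 0;
static void fake_close(void *raw) { (void)raw; closed_count++; }

/* Deterministic disconnect; safe to call more than once. */
static void wrapper_finish(struct conn_wrapper *w)
{
    if (!w->finished) {
        fake_close(w->raw);   /* PQfinish(w->raw) in real code */
        w->raw = NULL;
        w->finished = true;
    }
}

/* Guarded access: returns NULL if the handle was already finished,
 * which the language binding can turn into a native exception. */
static void *wrapper_use(struct conn_wrapper *w)
{
    return w->finished ? NULL : w->raw;
}
```

The GC finalizer then only frees the wrapper (and calls wrapper_finish if the application never did), so the deterministic-close and memory-reclamation concerns are decoupled.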
Re: [HACKERS] Exporting closePGconn from libpq
On Sat, May 14, 2011 at 11:37 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Yes: it'd introduce a new externally-visible state that libpq now has to worry about supporting in all its operations, ie connection closed but not gone. This state is guaranteed to be poorly tested and hence buggy.

If you connect to a database over an unreliable network, you can lose the connection without warning at any time. Thus libpq must already support a connection "closed but not gone" state, and I'm fine with making the explicitly disconnected state indistinguishable from the connection-lost state.

> I think you need a wrapper object. Given the context you're describing, I'd be willing to lay a side bet that you'll end up needing a wrapper anyway, even if it seems like you could avoid it right now. Language embeddings of libpq tend to accrete features...

The intention of the low-level bindings I'm working on is to keep features to an absolute minimum: to bind calls to C in a 1-1 fashion, and to handle the memory management and error signaling associated with foreign calls. Of course, such a library is not intended to be particularly attractive for application development, but rather to be a library that can be wrapped up into a higher-level database access library that's free to accrete features. :)

Best,
Leon
[HACKERS] Simplified VALUES parameters
Hi, I'm the maintainer and a primary author of a PostgreSQL client library for Haskell, called postgresql-simple, and I recently investigated improving support for VALUES expressions in this library. As a result, I'd like to suggest two changes to postgresql:

1. Allow type specifications inside AS clauses, for example (VALUES (1,'hello'),(2,'world')) AS update(x int, y text)

2. Have an explicit syntax for representing VALUES expressions which contain no rows, such as VALUES (). (Although the precise syntax isn't important to me.)

My claim is that these changes would make it simpler for client libraries to properly support parameterized VALUES expressions. If you care, I've included a postscript with a brief background and a link to my analysis and motivations.

Best,
Leon

P.S. https://github.com/lpsmith/postgresql-simple/issues/61

Not entirely unlike many other client libraries, such as psycopg2, postgresql-simple generates queries by expanding values of particular Haskell types into fragments of SQL syntax. So for example, you can currently write:

    executeMany conn [sql|
        UPDATE tbl SET tbl.y = upd.y
        FROM (VALUES (?,?)) AS upd(x,y)
        WHERE tbl.x = upd.x
    |] [(1,"hello"),(2,"world")]

which will issue the query:

    UPDATE tbl SET tbl.y = upd.y
    FROM (VALUES (1,'hello'),(2,'world')) AS upd(x,y)
    WHERE tbl.x = upd.x

The issue, however, is that postgresql-simple cannot currently parameterize more complex queries that have multiple VALUES expressions, or a VALUES expression alongside other parameters, as might occur with a writable CTE or other complex query. Also, when presented with an empty list of arguments, executeMany does not issue a query at all and simply returns 0, which is (usually?) the right thing to do given its intended use cases, but is not the right thing to do in more general settings.

So, what I'd like to do is to be able to write something like:

    execute conn [sql|
        UPDATE tbl SET tbl.y = upd.y
        FROM ? AS upd(x,y)
        WHERE tbl.x = upd.x AND tbl.z = ?
    |] (Values [(1,"hello"),(2,"world")], False)

and issue a similar query. However, the problems with this approach are specifying the postgresql types and handling the zero-row case properly.
Re: [HACKERS] Simplified VALUES parameters
On Wed, Feb 26, 2014 at 1:54 PM, Josh Berkus <j...@agliodbs.com> wrote:
> And thank you for writing that driver!

You are welcome!

> I have no opinion about your request for VALUES() stuff, though. It looks fairly complex as far as grammar and libpq is concerned.

Actually, my suggestions wouldn't necessarily impact libpq at all. For better and worse, postgresql-simple does not currently support protocol-level parameters at all. While it's clear to me that I do eventually need to work on supporting protocol-level parameters and the binary formats, it's also become clear to me since I first wrote it that protocol-level parameters are not a total replacement either, and that postgresql-simple will still need to support direct parameter expansion in some cases (e.g. for VALUES parameters, for identifier parameters (which aren't yet supported, due to the need to drop support for libpq 8.4), etc.).

Best,
Leon
Re: [HACKERS] Adding unsigned 256 bit integers
pgmp is also worth mentioning here, and it's likely to be more efficient than the numeric type or something you hack up yourself: http://pgmp.projects.pgfoundry.org/

Best,
Leon

On Thu, Apr 10, 2014 at 10:11 AM, k...@rice.edu <k...@rice.edu> wrote:
> On Thu, Apr 10, 2014 at 09:13:47PM +0800, Olivier Lalonde wrote:
>> I was wondering if there would be any way to do the following in PostgreSQL:
>>
>>     UPDATE cryptotable SET work = work + 'some big hexadecimal number'
>>
>> where work is an unsigned 256 bit integer. Right now my column is a character varying(64) column (hexadecimal representation of the number) but I would be happy to switch to another data type if it lets me do the operation above. If it's not possible with vanilla PostgreSQL, are there extensions that could help me?
>>
>> --
>> - Oli
>>
>> Olivier Lalonde
>> http://www.syskall.com -- connect with me!
>
> Hi Olivier,
>
> Here are some sample pl/pgsql helper functions that I have written for other purposes. They use integers but can be adapted to use numeric.
>
> Regards,
> Ken
>
> ---
> CREATE OR REPLACE FUNCTION hex2dec(t text) RETURNS integer AS $$
> DECLARE
>     r RECORD;
> BEGIN
>     FOR r IN EXECUTE 'SELECT x'''||t||'''::integer AS hex' LOOP
>         RETURN r.hex;
>     END LOOP;
> END
> $$ LANGUAGE plpgsql IMMUTABLE STRICT;
> ---
>
> ---
> CREATE OR REPLACE FUNCTION bytea2int(in_string BYTEA) RETURNS INTEGER AS $$
> DECLARE
>     b1 INTEGER := 0;
>     b2 INTEGER := 0;
>     b3 INTEGER := 0;
>     b4 INTEGER := 0;
>     out_int INTEGER := 0;
> BEGIN
>     CASE OCTET_LENGTH(in_string)
>     WHEN 1 THEN
>         b4 := get_byte(in_string, 0);
>     WHEN 2 THEN
>         b3 := get_byte(in_string, 0);
>         b4 := get_byte(in_string, 1);
>     WHEN 3 THEN
>         b2 := get_byte(in_string, 0);
>         b3 := get_byte(in_string, 1);
>         b4 := get_byte(in_string, 2);
>     WHEN 4 THEN
>         b1 := get_byte(in_string, 0);
>         b2 := get_byte(in_string, 1);
>         b3 := get_byte(in_string, 2);
>         b4 := get_byte(in_string, 3);
>     END CASE;
>     out_int := (b1 << 24) + (b2 << 16) + (b3 << 8) + b4;
>     RETURN(out_int);
> END;
> $$ LANGUAGE plpgsql IMMUTABLE;
> ---
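(The shift-and-add in bytea2int is the standard big-endian byte decode. A minimal C equivalent of that arithmetic, with a hypothetical `bytes2int` name just for illustration, would be:)

```c
#include <stdint.h>
#include <stddef.h>

/* C equivalent of the bytea2int logic above: interpret up to four
 * bytes as a big-endian integer, treating missing high bytes as
 * zero, exactly as the plpgsql CASE does. */
static uint32_t bytes2int(const uint8_t *buf, size_t len)
{
    uint32_t out = 0;
    for (size_t i = 0; i < len && i < 4; i++)
        out = (out << 8) + buf[i];  /* same as (b1<<24)+(b2<<16)+(b3<<8)+b4 */
    return out;
}
```

The rolling `(out << 8) + buf[i]` form just folds the four explicit shifts into a loop, which also handles the shorter 1-3 byte cases without a CASE statement.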