Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-05 Thread Shay Rojansky
Awesome, thanks!!

On Fri, Aug 4, 2017 at 11:54 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Shay Rojansky <r...@roji.org> writes:
> > Great. Do you think it's possible to backport to the other maintained
> > branches as well, seeing as how this is quite trivial and low-impact?
>
> Already done, will be in next week's minor releases.  (You timed this
> bug report well.)
>
> regards, tom lane
>


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-04 Thread Shay Rojansky
>
> > Doing SSL_CTX_set_session_cache_mode(context, SSL_SESS_CACHE_OFF) doesn't
> > have any effect whatsoever - I still have the same issue (session id
> > context uninitialized). I suspect session caching is an entirely different
> > feature from session tickets/RFC5077 (although it might still be a good
> > idea to disable).
>
> Right, we expected that that would have no visible effect, because there
> is no way to cache sessions in Postgres anyway.  The main point, if I
> understand Heikki's concern correctly, is that this might save some
> amount of low-level overhead from clients trying to cache connections.
>

OK, sounds right (i.e. this is a defensive measure that isn't directly
connected to my problem but makes sense).

> > Doing SSL_CTX_set_options(context, SSL_OP_NO_TICKET) indeed resolves the
> > issue, as expected.
>
> Excellent.  I'll push this patch tomorrow sometime (too late/tired
> right now).
>

Great. Do you think it's possible to backport to the other maintained
branches as well, seeing as how this is quite trivial and low-impact?


> > As I wrote above, I'd remove the #ifdef and execute it always.
>
> The reason I put the #ifdef in is that according to my research the
> SSL_OP_NO_TICKET symbol was introduced in openssl 0.9.8f, while we
> claim to support back to 0.9.8.  I'd be the first to say that you're
> nuts if you're running openssl versions that old; but this patch is not
> something to move the compatibility goalposts for when it only takes
> an #ifdef to avoid breaking older versions.
>
> (I need to check how far back SSL_SESS_CACHE_OFF goes ... we might
> need an #ifdef for that too.)
>

Ah OK, thanks for the explanation - makes perfect sense.


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-04 Thread Shay Rojansky
>
> On 2017-08-04 07:22:42 +0300, Shay Rojansky wrote:
> > I'm still not convinced of the risk/problem of simply setting the session
> > id context as I explained above (rather than disabling the optimization),
> > but of course either solution resolves my problem.
>
> How would that do anything? Each backend has its own local
> memory. I.e. any cache state that openssl would maintain wouldn't be
> useful. If you want to take advantage of features around this you really
> need to cache tickets in shared memory...
>

Guys, there's no data being cached at the backend - RFC5077 is about
packaging information into a client-side opaque session ticket that allows
skipping a roundtrip on the next connection. As I said, simply setting the
session id context (*not* the session id or anything else) makes this
feature work, even though a completely new backend process is launched.


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-03 Thread Shay Rojansky
I tested the patch.

Doing SSL_CTX_set_session_cache_mode(context, SSL_SESS_CACHE_OFF) doesn't
have any effect whatsoever - I still have the same issue (session id
context uninitialized). I suspect session caching is an entirely different
feature from session tickets/RFC5077 (although it might still be a good
idea to disable).

Doing SSL_CTX_set_options(context, SSL_OP_NO_TICKET) indeed resolves the
issue, as expected. As I wrote above, I'd remove the #ifdef and execute it
always.

I'm still not convinced of the risk/problem of simply setting the session
id context as I explained above (rather than disabling the optimization),
but of course either solution resolves my problem.


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-02 Thread Shay Rojansky
One more note: https://github.com/netty/netty/pull/5321/files is an
equivalent PR setting the session ID context to a constant value in netty
(which is also a server using OpenSSL). This is in line with the
documentation on SSL_CTX_set_session_id_context (
https://wiki.openssl.org/index.php/Manual:SSL_CTX_set_session_id_context(3)
):

> Sessions are generated within a certain context. When exporting/importing
> sessions with *i2d_SSL_SESSION*/*d2i_SSL_SESSION* it would be possible, to
> re-import a session generated from another context (e.g. another
> application), which might lead to malfunctions. Therefore each application
> must set its own session id context *sid_ctx* which is used to distinguish
> the contexts and is stored in exported sessions. The *sid_ctx* can be any
> kind of binary data with a given length, it is therefore possible to use
> e.g. the name of the application and/or the hostname and/or service name ...


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-02 Thread Shay Rojansky
>
> Shay Rojansky <r...@roji.org> writes:
> > Once again, I managed to make the error go away simply by setting the
> > session id context, which seems to be a mandatory server-side step for
> > properly supporting session tickets.
>
> The fact that you made the error go away doesn't make this a good
> solution.  In particular, using a simple constant session ID is completely
> insecure according to the TLS spec.  RFC 5246, F.1.4, doesn't even care
> for the idea of ever writing session IDs to stable storage; although
> Apache seems to be content with a session ID that is unique per-server
> (it looks like they just use a hash of the server's host name).
>

I think there may be some confusion here - I'm not doing anything with session
IDs, merely setting the session ID *context*. This seems to be an
OpenSSL-specific feature (nothing to do with TLS) that simply allows
distinguishing between different "applications" (or contexts) running in
the same process. See
https://wiki.openssl.org/index.php/Manual:SSL_CTX_set_session_id_context(3)
for the docs.

This feature does not involve writing anything (and definitely not session
IDs) to stable storage. The idea is to provide the client with an opaque
"session ticket", which is passed by the client back to the server on
subsequent connections, and which allows the skipping of a roundtrip in the
SSL handshake.
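
Schematically (TLS 1.2, simplified from RFC 5077; the exact message mix
depends on the negotiated cipher suite):

    Full handshake (two round trips):
      C->S  ClientHello
      S->C  ServerHello, Certificate, ServerHelloDone
      C->S  ClientKeyExchange, ChangeCipherSpec, Finished
      S->C  NewSessionTicket, ChangeCipherSpec, Finished

    Abbreviated handshake (one round trip):
      C->S  ClientHello (+ session ticket)
      S->C  ServerHello, ChangeCipherSpec, Finished
      C->S  ChangeCipherSpec, Finished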


>
> More generally, PG as currently configured can't do anything with a
> session cache since each new backend would start with an empty cache.
>

Again, there's no backend cache - RFC5077 is about having all state at the
client side.


> So the question here is whether it's safe or worthwhile to allow use
> of session tickets.  I agree with Heikki's opinion that it's unlikely
> to provide any meaningful performance gain for database sessions that
> are of reasonable length.  I'm also pretty concerned about the possibility
> for security problems, eg a client being able to force a server into some
> low-security SSL mode.  Both RFC 5077 and the Apache people say that if
> you use session tickets you'd better rotate the keys for them regularly,
> eg in Apache's changelog we find
>
>  Session ticket creation uses a random key created during web
>  server startup and recreated during restarts. No other key
>  recreation mechanism is available currently. Therefore using session
>  tickets without restarting the web server with an appropriate
> frequency
>  (e.g. daily) compromises perfect forward secrecy. [Rainer Jung]
>
> Since we have no mechanism for that, I think that we need to err on
> the side of security.
>

I may definitely be wrong about this, but I'm under the impression that
management of the session ticket (as with the entire resumption mechanism) is
OpenSSL's responsibility and does not require anything from PostgreSQL
itself. However, if you're suspicious of OpenSSL itself that's another
story (and I'd definitely understand).


>
> Accordingly, what I think we should do is something more like the
> attached.  Could you see whether it fixes your problem?
>

I will be able to test this later tonight and confirm. I'm not sure why
the SSL_OP_NO_TICKET is in an #ifdef - I would simply do it in all cases.
I've seen people reporting that this issue is solved via setting
SSL_OP_NO_TICKET (e.g.
https://forums.aws.amazon.com/message.jspa?messageID=505895) so I'm not
sure what SSL_CTX_set_session_cache_mode is supposed to add, but if the
idea is to defensively disable other forms of caching, then it makes sense.

Just to be clear, I don't necessarily have a problem with disabling RFC5077
session resumption as the benefits in the PostgreSQL scenario aren't big
(although I don't think a handshake roundtrip is completely negligible
either). I just think it's advisable we understand exactly what it is we're
disabling - there seems to be some confusion between session IDs, session ID
contexts, server/client-side state etc.


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-07-31 Thread Shay Rojansky
Hi Tom and Heikki.

As Tom says, session caching and session tickets seem to be two separate
things. However, I think you may be reading more into the session ticket
feature than there is - AFAICT there is no expectation or mechanism for
restoring *application* state of any kind - the mechanism is only supposed
to abbreviate the SSL handshake itself, i.e. save a roundtrip in the full
handshake process for agreeing on a cipher etc. Here's an article I found
useful about this:
https://vincent.bernat.im/en/blog/2011-ssl-session-reuse-rfc5077 (in
addition to the RFC itself, of course).

Once again, I managed to make the error go away simply by setting the
session id context, which seems to be a mandatory server-side step for
properly supporting session tickets. So to summarize: at the moment PostgreSQL
indeed provides a session ticket in the first connection, which is cached
on the client side. On the second connection attempt, the (opaque) session
ticket is included in the first SSL packet sent to the server
(ClientHello), but the lack of a session id context causes OpenSSL to
error. In effect, this seems to be a trivial server-side "misconfiguration".

I do understand the reluctance to deal with any SSL "optimizations", having
experienced some of the headaches created by renegotiations. However,
session tickets do seem like a simple and well-defined optimization that
takes effect at connection only. Also, there is no risk of breaking any
*current* clients, since at the moment session tickets simply aren't
supported (because of the lack of session id context). So this seems to me
like a rather low-risk thing to enable. On the other hand, I also
understand that saving a connection-time handshake roundtrip is somewhat
less relevant to PostgreSQL.

I'm a little busy at the moment but if you'd like I can whip up a trivial
client implementation in .NET that demonstrates the issue.

On Tue, Aug 1, 2017 at 12:26 AM, Tom Lane  wrote:

> Heikki Linnakangas  writes:
> > I agree with Tom that we don't really want abbreviated SSL handshakes,
> > or other similar optimizations, to take place. PostgreSQL connections
> > are quite long-lived, so we have little to gain. But it makes the attack
> > surface larger. There have been vulnerabilities related to SSL
> > renegotiation, resumption, abbreviated handshakes, and all that.
>
> > I think we should actually call SSL_CTX_set_session_cache_mode(ctx,
> > SSL_SESS_CACHE_OFF), to disable session caching altogether. I'm not sure
> > if we still need to call SSL_CTX_set_session_id_context() if we do that.
>
> AIUI (and I just learned about this stuff yesterday, so I might be wrong)
> session caching and session tickets are two independent mechanisms for
> SSL session reuse.
>
> I have no objection to explicitly disabling session caching, but I think
> it won't have any real effect, because no backend process could ever have
> any entries in its session cache anyway.  Maybe it'd result in a more
> apropos error message, don't know.
>
> But we need to disable session tickets separately from that.  What's
> happening right now in Shay's case, I believe, is that the client is
> asking for a session ticket and getting one.  The ticket contains enough
> data to re-establish the same SSL context with a successor backend;
> but it does not contain any data that would allow restoration of
> relevant backend state.  We could imagine "resuming" the session with
> virgin backend state, but I think that violates the spirit if not the
> letter of RFC 5077.  In any case, implementing it with those semantics
> would tie our hands if anyone ever wanted to provide something closer
> to true session restoration.
>
> regards, tom lane
>


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-07-30 Thread Shay Rojansky
Just to continue the above, I can confirm that adding a simple call
to SSL_CTX_set_session_id_context() to be_tls_init() with some arbitrary
const value fixes the error for me. Attached is a patch (ideally a test
should be done for this, but that's beyond what I can invest at the moment,
let me know if it's absolutely necessary).
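
For clarity, the call in question is essentially the following sketch (the
constant here is an arbitrary illustration; the attached patch may use a
different value):

    /* any stable, application-specific constant will do */
    static const unsigned char sid_ctx[] = "PostgreSQL";

    SSL_CTX_set_session_id_context(context, sid_ctx, sizeof(sid_ctx) - 1);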

On Mon, Jul 31, 2017 at 1:15 AM, Shay Rojansky <r...@roji.org> wrote:

> Hi Tom.
>
> Again, I know little about this, but from what I understand PostgreSQL
> wouldn't actually need to do/implement anything here - the session ticket
> might be used only to abbreviate the SSL handshake (this would explain why
> it's on by default without any application support). In other words, simply
> setting the session id context may make the problem go away and at the same
> time unlock the abbreviated SSL handshake optimization. I could be wrong
> about this though.
>
> Whether the above is correct or not, SSL resumption - which removes a
> network roundtrip from the connection process - may be a worthy
> optimization even for long-lived connections such as PostgreSQL, although
> obviously much less valuable than, say, short-lived HTTP connections.
>
> But regardless, it seems that as you say: if you *don't* want to support
> resumption, you're required to explicitly disable it with
> SSL_OP_NO_TICKET.
>
> Just to give some context, Npgsql has its own, internal TLS implementation
> which does not implement session tickets at the client side - this is the
> workaround currently used. However, it would be much better if the standard
> .NET SSL implementation could be used instead (i.e. I'm hoping a backport
> would be possible here).
>
> On Sun, Jul 30, 2017 at 10:59 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>> I wrote:
>> > I think what you need to do is tell SslStream not to expect that PG
>> > servers will do session resumption.  (I'm a bit astonished that that
>> > would be its default assumption in the first place, but whatever.)
>>
>> Actually, after a bit of further googling, it seems that the brain
>> damage here may be on the server side.  It seems that OpenSSL will
>> send a session ticket if requested, even though the surrounding
>> application has given it no means to identify the session (!?).
>> Apparently we need to pass SSL_OP_NO_TICKET to SSL_CTX_set_options
>> to prevent that from happening.
>>
>> regards, tom lane
>>
>
>


openssl-set-session-id-context.patch
Description: Binary data



Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-07-30 Thread Shay Rojansky
Hi Tom.

Again, I know little about this, but from what I understand PostgreSQL
wouldn't actually need to do/implement anything here - the session ticket
might be used only to abbreviate the SSL handshake (this would explain why
it's on by default without any application support). In other words, simply
setting the session id context may make the problem go away and at the same
time unlock the abbreviated SSL handshake optimization. I could be wrong
about this though.

Whether the above is correct or not, SSL resumption - which removes a
network roundtrip from the connection process - may be a worthy
optimization even for long-lived connections such as PostgreSQL, although
obviously much less valuable than, say, short-lived HTTP connections.

But regardless, it seems that as you say: if you *don't* want to support
resumption, you're required to explicitly disable it with SSL_OP_NO_TICKET.

Just to give some context, Npgsql has its own, internal TLS implementation
which does not implement session tickets at the client side - this is the
workaround currently used. However, it would be much better if the standard
.NET SSL implementation could be used instead (i.e. I'm hoping a backport
would be possible here).

On Sun, Jul 30, 2017 at 10:59 PM, Tom Lane  wrote:

> I wrote:
> > I think what you need to do is tell SslStream not to expect that PG
> > servers will do session resumption.  (I'm a bit astonished that that
> > would be its default assumption in the first place, but whatever.)
>
> Actually, after a bit of further googling, it seems that the brain
> damage here may be on the server side.  It seems that OpenSSL will
> send a session ticket if requested, even though the surrounding
> application has given it no means to identify the session (!?).
> Apparently we need to pass SSL_OP_NO_TICKET to SSL_CTX_set_options
> to prevent that from happening.
>
> regards, tom lane
>


[HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-07-30 Thread Shay Rojansky
Dear hackers, a long-standing issue reported by users of the Npgsql .NET
driver for PostgreSQL may have its roots on the PostgreSQL side. I'm far
from being an SSL/OpenSSL expert so please be patient if the terms/analysis
are incorrect.

When trying to connect with Npgsql to PostgreSQL with client authentication
(PG has ssl_ca_file set), the first connection works just fine. The second
connection, however, fails and the PostgreSQL logs contain the message
"session id context uninitialized". This occurs when using .NET's default
SSL implementation, SslStream, which supports session resumption - the
second connection's ClientHello message contains a session ticket from the
first session, triggering the issue.

From some research, it seems that for session resumption/reuse to work, the
SSL/TLS server must call SSL_CTX_set_session_id_context and/or
SSL_set_session_id_context with some arbitrary binary data, to distinguish
between contexts/applications. A grep in the PostgreSQL source for
"set_session_id_context" doesn't yield anything.

Can someone with more knowledge confirm whether an issue exists on the
PostgreSQL side? If so, it seems completely trivial to fix this.

Thanks,

Shay


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-11-07 Thread Shay Rojansky
>
>>> > As I said before, Npgsql for one loads data types by name, not by OID.
>>> > So this would definitely cause breakage.
>>>
>>> Why would that cause breakage?
>>
>>
>> Well, the first thing Npgsql does when it connects to a new database is
>> to query pg_type. The type names are used to associate entries with type
>> handlers, avoiding the hard-coding of OIDs in code. So if the type name
>> "macaddr" suddenly has a new meaning and its wire representation is
>> different, breakage will occur. It is possible to release new versions of
>> Npgsql which will look at the PostgreSQL version and say "we know that in
>> PostgreSQL < 10 macaddr means this, but in >= 10 it means that". But that
>> doesn't seem like a good solution, plus old versions of Npgsql from before
>> this change won't work.
>>
>
> The new datatype that is going to replace the existing one works with both
> 6 and 8 byte MAC addresses and stores them in a variable-length format.
> This doesn't change the wire format. I don't see any problem with the
> existing applications. The new datatype can recv and send 8 byte MAC
> addresses also.
>

Apologies, I may have misunderstood. If the new type is 100%
wire-compatible (recv/send) with the old fixed-length 6-byte type, then
there's no issue whatsoever.


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-11-06 Thread Shay Rojansky
>
> > > 1. Does everyone agree with renaming the existing datatype without
> > > changing the OID?
> >
> > As I said before, Npgsql for one loads data types by name, not by OID.
> > So this would definitely cause breakage.
>
> Why would that cause breakage?


Well, the first thing Npgsql does when it connects to a new database, is to
query pg_type. The type names are used to associate entries with type
handlers, avoiding the hard-coding of OIDs in code. So if the type name
"macaddr" suddenly has a new meaning and its wire representation is
different breakage will occur. It is possible to release new versions of
Npgsql which will look at the PostgreSQL version and say "we know that in
PostgreSQL < 10 macaddr means this, but in >= 10 it means that". But that
doesn't seem like a good solution, plus old versions of Npgsql from before
this change won't work.


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-11-04 Thread Shay Rojansky
>
> Yes. Before doing this change, it is better to confirm the approach and
> then do all the changes.
>
> 1. Does everyone agree with renaming the existing datatype without
> changing the OID?
>

As I said before, Npgsql for one loads data types by name, not by OID. So
this would definitely cause breakage.

For users who actually need the new variable-length type, it seems
perfectly reasonable to ask them to switch to a new type - after all,
they're making a change in their system. It would really be preferable to
leave the current type alone and create a new one.


Re: [HACKERS] Batches, error handling and transaction in the protocol

2016-10-18 Thread Shay Rojansky
> More generally speaking, the protocol appears to couple two different
> things which may be unrelated. On the one hand, we have a protocol
> sync mechanism for error recovery (skip until Sync). On the other hand,
> we have an implicit transaction for extended query messages until
> that same Sync. It seems valid to want to have error recovery without an
> implicit transaction, but this doesn't seem supported by the current
> protocol (I could add a note for v4).

In the absence of any response to my message from September 28th, I've
added a todo item for wire protocol v4 (separate transaction delineation
from protocol error recovery).

Note that the same issue was discussed with Craig Ringer in
https://www.postgresql.org/message-id/CAMsr%2BYEgnJ8ZAWPLx5%3DBCbYYq9SNTdwbwvUcb7V-vYm5d5uhbQ%40mail.gmail.com

On Wed, Sep 28, 2016 at 6:04 PM, Shay Rojansky <r...@roji.org> wrote:

> Hi everyone, I'd appreciate some guidance on an issue that's been raised
> with Npgsql, input from other driver writers would be especially helpful.
>
> Npgsql currently supports batching (or pipelining) to avoid roundtrips,
> and sends a Sync message only at the end of the batch (so
> Parse1/Bind1/Describe1/Execute1/Parse2/Bind2/Describe2/Execute2/Sync).
> The reasoning is that if the first statement in the batch fails, the others
> shouldn't be processed. This seems to be the standard approach (the
> proposed patch for libpq seems to do the same).
>
> At the same time, if the batch doesn't occur within an explicit
> transaction (i.e. after BEGIN), it is automatically wrapped in an implicit
> transaction, with Sync committing it. This can, for example, provoke
> deadlocks if two batches try to update the same rows in reverse order. The
> problem is that the user didn't request a transaction in any way - they're
> just using batching to avoid roundtrips and their intention is to be in
> autocommit mode.
>
> One possible solution for this would be to insert a Sync after every
> execute in the batch, rather than a single Sync at the very end. This would
> make batches work the same as unbatched statements, and would resolve the
> deadlocks. However, behavior in case of error would be problematic:
> PostgreSQL would continue executing later messages if earlier ones failed,
> Npgsql would have to deal with multiple errors, etc.
>
> More generally speaking, the protocol appears to couple two different
> things which may be unrelated. On the one hand, we have a protocol sync
> mechanism for error recovery (skip until Sync). On the other hand, we have
> an implicit transaction for extended query messages until that same Sync.
> It seems valid to want to have error recovery without an implicit
> transaction, but this doesn't seem supported by the current protocol (I
> could add a note for v4).
>
> Finally, to give more context, a Microsoft developer ran into this while
> running ASP.NET benchmarks over Npgsql and its Entity Framework Core ORM
> provider. One of EFCore's great new features is that it batches database
> updates into a single roundtrip, but this triggered deadlocks. Whereas in
> many cases it's OK to tell users to solve the deadlocks by properly
> ordering their statements, when an ORM is creating the batch it's a more
> difficult proposition.
>
> Thanks for any thoughts or guidance!
>
> Shay
>


Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2016-10-17 Thread Shay Rojansky
>
> The current macaddr datatype needs to be kept for some time by renaming
> it without changing OID and use the newer one for further usage.
>

From the point of view of a driver maintainer... Npgsql looks up data types
by their name - upon first connection to a database it queries pg_type and
maps its internal data type handlers based on name. This allows it to avoid
hardcoding data type OIDs in source code, and easily handle user-defined
data types as well (enums, composites...). So a sudden rename of a datatype
would definitely cause a problem. Of course it's possible to first check
the server version and act accordingly but it seems to complicate things
needlessly.
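
As an illustration only - this is not Npgsql's actual startup query, which
also has to deal with composites, enums and so on - the lookup is
conceptually just:

    SELECT oid, typname FROM pg_type;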


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-10-14 Thread Shay Rojansky
>
> > Of course, this is a
> > relatively minor performance issue (especially when compared to the overall
> > performance benefits provided by batching), and providing an API distinction
> > between adding a Sync and flushing the buffer may over-complicate the API. I
> > just thought I'd mention it.
>
> Unnecessary IMO. If we really want to add it later we'd probably do so
> by setting a "no flush on endbatch" mode and adding an explicit flush
> call. But I expect TCP_CORK will satisfy all realistic needs.
>

Unless I'm mistaken, TCP_CORK is not necessarily going to work across all
platforms (e.g. Windows), although SO_LINGER (which is more standard) also
helps here.

> > When it's about to send a batch, Npgsql knows whether it's in an (explicit)
> > transaction or not (by examining the transaction indicator on the last
> > ReadyForQuery message it received). If it's not in an (explicit)
> > transaction, it automatically inserts a Sync message after every Execute. If
> > some statement happens to be a BEGIN, it will be executed as a normal
> > statement and so on. The only issue is that if an error occurs after a
> > sneaked-in BEGIN, all subsequent statements will fail, and all have the Sync
> > messages Npgsql inserted. The end result will be a series of errors that
> > will be raised up to the user, but this isn't fundamentally different from
> > the case of a simple auto-commit batch containing multiple failures (because
> > of unique constraint violation or whatever) - multiple errors is something
> > that will have to be handled in any case.
>
> I'm a bit hesitant about how this will interact with multi-statements
> containing embedded BEGIN or COMMIT, where a single protocol message
> contains multiple ; delimited queries. But at least at this time of
> night I can't give a concrete problematic example.


Unless I'm mistaken, in the extended protocol you can't combine multiple ;
delimited queries in a single Parse - that's a feature supported only by
the Query message of the simple protocol. But then, if you're in the simple
protocol Sync doesn't play any role, does it? I mean, a Query message
behaves as though it's implicitly followed by a Sync message - an error in
a previous Query message doesn't cause later messages to be skipped...

Note that Npgsql only rarely uses the simple protocol for user messages.
Npgsql is a binary-only driver, and the simple protocol doesn't support
requesting binary results. So only the extended protocol is used, except
for some edge cases where it's possible to use the simple protocol for
efficiency - statements with no parameters and where the ExecuteNonQuery
API is used (i.e. the user won't access any results).


Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-10-14 Thread Shay Rojansky
>
> > It has recently come to my attention that this implementation is problematic
> > because it forces the batch to occur within a transaction; in other words,
> > there's no option for a non-transactional batch.
>
> That's not strictly the case. If you explicitly BEGIN and COMMIT,
> those operations are honoured within the batch.
>

I wasn't precise in my formulation (although I think we understand each
other)... The problem I'm trying to address here is the fact that in the
"usual" batching implementation (i.e. where a single Sync message is sent
at the end of the batch), there's no support for batches which have no
transactions whatsoever (i.e. where each statement is auto-committed and
errors in earlier statements don't trigger skipping of later statements).

> What's missing is implicit transactions. Usually if you send a series
> of messages they will each get their own implicit transaction. If you
> batch them, the whole lot gets an implicit transaction. This is
> because the PostgreSQL v3 protocol conflates transaction delineation
> with protocol error recovery into a single Sync message type.
>

That's right. I sent a message complaining about this conflation a while
ago:
https://www.postgresql.org/message-id/CADT4RqDdo9EcFbxwB_YO2H3BVZ0t-1qqZ%3D%2B%2BdVMnYaN6BpyUGQ%40mail.gmail.com.
There weren't any responses, although I'll add a note on the wiki on this
as a requested feature for the v4 protocol.

If the mythical v4 protocol ever happens I'd want to split Sync into
> two messages, one which is a protocol synchronisation message and
> another that is a transaction delimiter. Or give it flags or whatever.
>

Totally agree.


> In the mean time:
>
> > This can be a problem for
> > several reasons: users may want to send off a batch of inserts, not caring
> > whether one of them fails (e.g. because of a unique constraint violation).
> > In other words, in some scenarios it may be appropriate for later batched
> > statements to be executed when an earlier batched statement raised an error.
> > If Sync is only sent at the very end, this isn't possible.
>
> Right, and that remains the case even with explicit transaction
> delineation, because the first ERROR causes processing of all
> subsequent messages to be skipped.
>
> The design I have in libpq allows for this by allowing the client to
> delimit batches without ending batch mode, concurrently consuming a
> stream of multiple batches. Each endbatch is a Sync. So a client that
> wants autocommit-like behavour can send a series of 1-query batches.


> I think I'll need to document this a bit better since it's more subtle
> than I properly explained.
>

Ah, I see. libpq's API is considerably more low-level than what Npgsql
needs to provide. If I understand correctly, you allow users to specify
exactly where to insert Sync messages (if at all), so that any number of
statements arbitrarily interleaved with Sync messages can be sent without
starting to read any results. If so, then the user indeed has everything
they need to control the exact transactional behavior they want (including
full auto-commit) without compromising on performance in any way (i.e. by
increasing roundtrips).

The only minor problem I can see is that PQsendEndBatch not only adds a
Sync message to the buffer, but also flushes it. This means that you may be
forcing users to needlessly flush the buffer just because they wanted to
insert a Sync. In other words, users can't send the following messages in a
single buffer/packet:
Prepare1/Bind1/Describe1/Execute1/Sync1/Prepare2/Bind2/Describe2/Execute2/Sync2
- they have to be split into different packets. Of course, this is a
relatively minor performance issue (especially when compared to the overall
performance benefits provided by batching), and providing an API
distinction between adding a Sync and flushing the buffer may
over-complicate the API. I just thought I'd mention it.
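
To illustrate with a sketch - PQsendEndBatch is the proposed patch's API,
while the two functions in the second variant are purely hypothetical
names for the split I'm describing:

    /* proposed API: queues a Sync AND flushes the buffer in one call */
    PQsendEndBatch(conn);

    /* hypothetical split: several Syncs could share one packet */
    PQbatchQueueSync(conn);   /* append a Sync to the output buffer */
    PQbatchQueueSync(conn);   /* ...possibly several times... */
    PQbatchFlush(conn);       /* flush the buffer once at the end */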


> Yes, that's what I suggest, and basically what the libpq batch
> interface does, though it expects the client to deal with the
> transaction boundaries.
>
> You will need to think hard about transaction boundaries as they
> relate to multi-statements unless nPgSQL parses out each statement
> from multi-statement strings like PgJDBC does. Otherwise a user can
> sneak in:
>
> somestatement; BEGIN; someotherstatement;
>
> or
>
> somestatement; CoMMiT; otherstatement;
>

That's a good point. I definitely don't want to depend on client-side
parsing of SQL in any way (in fact a planned feature is to allow using
Npgsql without any sort of client-side parsing/manipulation of SQL).
However, the fact that BEGIN/COMMIT can be sent in batches doesn't appear
too problematic to me.

When it's about to send a batch, Npgsql knows whether it's in an (explicit)
transaction or not (by examining the transaction indicator on the last
ReadyForQuery message it received). If it's not in an (explicit)
transaction, it automatically inserts a Sync message after every 

Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2016-10-12 Thread Shay Rojansky
Hi all. I thought I'd share some experience from Npgsql regarding
batching/pipelining - hope this isn't off-topic.

Npgsql has supported batching for quite a while, similar to what this patch
proposes - with a single Sync message is sent at the end.

It has recently come to my attention that this implementation is
problematic because it forces the batch to occur within a transaction; in
other words, there's no option for a non-transactional batch. This can be a
problem for several reasons: users may want to send off a batch of inserts,
not caring whether one of them fails (e.g. because of a unique constraint
violation). In other words, in some scenarios it may be appropriate for
later batched statements to be executed when an earlier batched statement
raised an error. If Sync is only sent at the very end, this isn't possible.
Another example of a problem (which actually happened) is that transactions
acquire row-level locks, and so may trigger deadlocks if two different
batches update the same rows in reverse order. Both of these issues
wouldn't occur if the batch weren't implicitly wrapped in a transaction.

My current plan is to modify the batch implementation based on whether
we're in an (explicit) transaction or not. If we're in a transaction, then
it makes perfect sense to send a single Sync at the end as is being
proposed here - any failure would cause the transaction to fail anyway, so
skipping all subsequent statements until the batch's end makes sense.
However, if we're not in an explicit transaction, I plan to insert a Sync
message after each individual Execute, making non-transactional batched
statements more or less identical in behavior to non-transactional
unbatched statements. Note that this means that a batch can generate
multiple errors, not just one.

I'm sharing this since it may be relevant to the libpq batching
implementation as well, and also to get any feedback regarding how Npgsql
should act.


Re: [HACKERS] Binary I/O for isn extension

2016-09-28 Thread Shay Rojansky
Sorry about this, I just haven't had a free moment (and it's definitely not
very high priority...)

On Wed, Sep 28, 2016 at 5:04 PM, Robert Haas  wrote:

> On Mon, Aug 22, 2016 at 8:14 AM, Fabien COELHO 
> wrote:
> > Hello Shay,
> >> Attached is a new version of the patch, adding an upgrade script and the
> >> rest of it. Note that because, as Fabien noted, there's doesn't seem to
> be
> >> a way to add send/receive functions with ALTER TYPE, I did that by
> >> updating
> >> pg_type directly - hope that's OK.
> >
> > This patch does not apply anymore, because there as been an update in
> > between to mark relevant contrib functions as "parallel".
> >
> > Could you update the patch?
>
> So, it's been over a month since this request, and there doesn't seem
> to be an update to this patch.  The CommitFest is over in 2 days, so
> I've marked this "Returned with Feedback".  Shay, please feel free to
> resubmit for the next CommitFest.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


[HACKERS] Batches, error handling and transaction in the protocol

2016-09-28 Thread Shay Rojansky
Hi everyone, I'd appreciate some guidance on an issue that's been raised
with Npgsql, input from other driver writers would be especially helpful.

Npgsql currently supports batching (or pipelining) to avoid roundtrips, and
sends a Sync message only at the end of the batch (so
Parse1/Bind1/Describe1/Execute1/Parse2/Bind2/Describe2/Execute2/Sync). The
reasoning is that if the first statement in the batch fails, the others
shouldn't be processed. This seems to be the standard approach (the
proposed patch for libpq seems to do the same).

At the same time, if the batch doesn't occur within an explicit transaction
(i.e. after BEGIN), it is automatically wrapped in an implicit transaction,
with Sync committing it. This can, for example, provoke deadlocks if two
batches try to update the same rows in reverse order. The problem is that
the user didn't request a transaction in any way - they're just using
batching to avoid roundtrips and their intention is to be in autocommit
mode.

One possible solution for this would be to insert a Sync after every
execute in the batch, rather than a single Sync at the very end. This would
make batches work the same as unbatched statements, and would resolve the
deadlocks. However, behavior in case of error would be problematic:
PostgreSQL would continue executing later messages if earlier ones failed,
Npgsql would have to deal with multiple errors, etc.
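
To illustrate the two wire layouts:

    Single implicit transaction (Sync only at the end, current approach):
      Parse1/Bind1/Describe1/Execute1 ... ParseN/BindN/DescribeN/ExecuteN/Sync

    Autocommit-style batch (Sync after every Execute, as proposed above):
      Parse1/Bind1/Describe1/Execute1/Sync ... ParseN/BindN/DescribeN/ExecuteN/Sync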

More generally speaking, the protocol appears to couple two different
things which may be unrelated. On the one hand, we have a protocol sync
mechanism for error recovery (skip until Sync). On the other hand, we have
an implicit transaction for extended query messages until that same Sync.
It seems valid to want to have error recovery without an implicit
transaction, but this doesn't seem supported by the current protocol (I
could add a note for v4).

Finally, to give more context, a Microsoft developer ran into this while
running ASP.NET benchmarks over Npgsql and its Entity Framework Core ORM
provider. One of EFCore's great new features is that it batches database
updates into a single roundtrip, but this triggered deadlocks. Whereas in
many cases it's OK to tell users to solve the deadlocks by properly
ordering their statements, when an ORM is creating the batch it's a more
difficult proposition.

Thanks for any thoughts or guidance!

Shay


Re: [HACKERS] Slowness of extended protocol

2016-08-23 Thread Shay Rojansky
Just a note from me - I also agree this thread evolved (or rather devolved)
in a rather unproductive and strange way.

One important note that came out, though, is that adding a new client
message does have a backwards compatibility issue - intelligent proxies
such as pgbouncer/pgpool will probably break once they see an unknown
client message. Even if they don't, they may miss potentially important
information being transmitted to the server. Whether this is a deal-breaker
for introducing new messages is another matter (I personally don't think
so).

On Tue, Aug 23, 2016 at 4:54 PM, Andres Freund  wrote:

> On 2016-08-23 11:42:53 -0400, Robert Haas wrote:
> > I think this could possibly be done, but it seems a lot better to me
> > to just bite the bullet and add a new protocol message.  That was
> > proposed by Tom Lane on July 31st and I think it's still by far the
> > best and easiest idea proposed, except I think we could introduce it
> > without waiting for a bigger rework of the protocol if we design the
> > libpq APIs carefully.  Most of the rest of this thread seems to have
> > devolved into an argument about whether this is really necessary,
> > which IMHO is a pretty silly argument, instead of focusing on how it
> > might be done, which I think would be a much more productive
> > conversation.
>
> I agree about the odd course of the further discussion, especially the
> tone was rather odd.  But I do think it's valuable to think about a path
> that fixes the issue without requiring version-dependant adaptions in
> all client drivers.
>
> Greetings,
>
> Andres Freund
>


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Shay Rojansky
Halfway through this mail I suddenly understood something - please read all
the way down before responding...

On Tue, Aug 16, 2016 at 2:16 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Shay> your analogy breaks down. Of course L2 was invented to improve
> Shay> performance, but that doesn't mean that all caches are the same. More
> Shay> precisely, what I find objectionable about your approach is not any
> Shay> caching - it's the implicit or automatic preparation of statements.
> Shay> This practice isn't invisible in that a) it may cause errors that
> Shay> wouldn't have been there otherwise (e.g. because of DDL),
>
> Long-lived named server-prepared statements cause problems even if
> server-prepared statements are created manually by developers.
>
> Could you please stop saying "automatic preparation causes ~DDL issues"?
>

I never said that... As I've said many times, the problem is errors caused
by something the user never asked for. If I server-prepare a statement and
then get an error, it's a result of my own action.

> Shay> As I said above, I think this is a critical point of misunderstanding
> Shay> between us. The developer tells the driver which statements should be
> Shay> server-prepared by calling .prepareStatement(). I'm guessing you have a
> Shay> totally different understanding here.
>
> Please, quote the document you got that "developers tell the driver which
> statements should be server-prepared by calling ..." from. It never
> works like that.
> Neither in Java, nor in C#. I would admit I've no C# experience, but I did
> find documentation on IDbCommand.Prepare() and examined it.
>

> The proper way to say it is "by calling .Prepare() the developer passes the
> intention that
> he might be using the same query multiple times".
> That is it. It never means "driver must absolutely use server-prepare
> in the response
> to .Prepare() call".
>
> The same goes for Java's PreparedStatement.
> It never means "the driver must use server-prepared features".
>
> As Microsoft lists in the .Prepare() documentation, modern versions of
> MSSQL just ignore .Prepare() and cache statements automatically.
>
> It is not a developer's business which statements should be in the
> database cache.
> Neither developer should care which statements reside in the driver cache.
>

I'm really baffled here.

First, I never said prepared statements *must* be server-prepared. You're
completely correct that database APIs don't *require* this, because they
by definition cover many databases and drivers. In Sqlite there's no
server, so there can be no server-prepared statement.

However, where there *is* a server which supports prepared statements as an
optimization, it's completely unthinkable to me that a driver wouldn't
implement prepare as server-prepare. Nobody forces you to do it - it just
seems unthinkable to do otherwise. The reason for this is that if
server-prepared statements are supported by your database, we expect them
to be a significant optimization (otherwise why would they exist), and
therefore not using them when the user calls "prepare" seems like...
foolishness. In other words, whatever client-side "precompilation" or other
optimization is possible is probably going to be negligible when compared
to server-preparation (this seems to be the case with PostgreSQL at the
very least), so why *not* map the database API's prepare method to
server-prepared statements?

I'm going to requote the API note which you quoted above on
Connection.prepareStatement (
https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html):

> This method is optimized for handling parametric SQL statements that
> benefit from precompilation. If the driver supports precompilation, the
> method prepareStatement will send the statement to the database for
> precompilation. Some drivers may not support precompilation.

Again, my understanding of English may be flawed, or maybe my logic
circuits are malfunctioning. But I read this the following way:
1. preparedStatement is about precompilation.
2. If a driver supports precompilation (i.e. preparation),
"prepareStatement will send the statement to the *database* for
precompilation". Note that the API explicitly mentioned sending *to the
database* - server preparation...
3. A driver may not support precompilation (i.e. preparation). This could
be because it simply hasn't implemented it yet, or because the backend
doesn't support it, or for any other reason. In this case it's a noop,
which doesn't really change anything in this discussion.

> A compliant implementation (that is, a driver) could just assemble full SQL
> by concatenating the parameters on each execution and send it via 'Q'
> simple
> execute message.
>

I think I may have understood the problem here - there's definitely a Java
vs. C# difference in this conversation.

From reading the Java docs, I now realize that JDBC only seems to support
parameters in prepared statements. In other words, the 

Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Shay Rojansky
>
> I'm not going to respond to the part about dealing with prepared
>> statements errors, since I think we've already covered that and there's
>> nothing new being said. I don't find automatic savepointing acceptable, and
>> a significant change of the PostgreSQL protocol to support this doesn't
>> seem reasonable (but you can try proposing).
>>
>
> As mentioned before. JDBC is not the only postgres driver to do this the
> ODBC driver does this as well. This is a requested feature by users. We
> didn't just decide to do it on our own.
>
> One thing to keep in mind is that both JDBC and ODBC are not exclusively
> PostgreSQL drivers and as such we sometimes have to jump through hoops to
> provide the semantics requested by the API.
>

I don't have anything in particular against automatic savepointing when
requested directly by users (especially if it's opt-in). My problem is more
with automatic savepointing as a solution to a problem created by doing
automatic prepared statements... Way too much automatic stuff going on
there for my taste...


Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Shay Rojansky
On Mon, Aug 15, 2016 at 3:16 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Vladimir>> Yes, that is what happens.
> Vladimir>> The idea is not to mess with gucs.
>
> Shay:> Wow... That is... insane...
>
> Someone might say that "programming languages that enable side-effects
> are insane".

> Lots of connection pools work by sharing the connections and it is up
> to developer
> if he can behave well (see "It is not" below)
>

The insane part is that there's really almost no reason to allow these
side-effects... It's possible to efficiently reset connection state with a
truly negligible impact on performance.


> Shay> it's entirely reasonable for
> Shay> more than one app to use the same pool
>
> Sharing connections between different applications might be not that good
> idea.
>

Not sure why... It seems like a bad idea mostly if your pool is leaking
state.


> However, I would not agree that "having out-of-process connection
> pool" is the only sane
> way to go.
> I do admit there might be valid cases for out of process pooling,
> however I do not agree
> it is the only way to go. Not only "inprocess" is one of the ways,
> "in-process" way is wildly used
> in at least one of the top enterprise languages.
>
> If you agree with that, you might agree that "in-process connection
> pool that serves
> exactly one application" might work in a reasonable fashion even
> without DISCARD ALL.
>

I never said out-of-process pooling is the way to go - in the .NET world
in-process pooling is very valid. But as I carefully said in my last email,
the same problems you have with multiple applications can occur with
multiple components within the same application. The process boundary isn't
very important here - in some scenarios programmers choose process
separation, in others they choose threads. The same basic problems that
occur in one model can occur in the other, including usage strategies which
make LRU caching very bad.


> Shay> Even with in-process pools it's standard practice (and a good idea)
> Shay> to reset state.
>
> It is not. Can you quote where did you get that "standard practice is
> to reset state" from?
>

I guess I take that back, I haven't actually made a thorough comparison
here.

> Shay> If some part of a big complex
> Shay> application modified state, and then some other part happens to get
> Shay> that physical connection, it's extremely hard to make sense of things.
>
> Let's not go into "functional programming" vs "imperative programming"
> discussion?
> Of course you might argue that "programming in Haskell or OCaml or F#"
> makes
> "extremely easy to make sense of things", but that's completely
> another discussion.
>

I'm not sure what this is referring to... If you're talking about my
comment that "isolation/separation of layers is a good thing in
programming", then I don't think it's the function vs. imperative kind of
argument.

> Shay> One note - in Npgsql's implementation of persistent prepared
> Shay> statements, instead of sending DISCARD ALL Npgsql sends the commands
> Shay> listed in https://www.postgresql.org/docs/current/static/sql-discard.html,
> Shay> except for DEALLOCATE ALL. This cleans up state changes but leaves
> Shay> prepared statements in place.
>
> Ok. At least you consider that "always discarding all the state" might be
> bad.
>

Yes I do. I actually implemented persistent prepared statements before this
conversation started - I think it's a great performance booster. I'm still
not sure if it should be opt-in or default, although I admit I'm leaning
towards default. But that feature has very little to do with *implicit*
preparation.

> Shay> This is somewhat similar to the CPU reordering you
> Shay> keep coming back to - it's totally invisible
>
> I would disagree. CPU reordering is easily visible if you are dealing
> with multithreaded case.
> It can easily result in application bugs if application misses some
> synchronization.
>
> CPU reordering is very visible to regular programmers, and it is a
> compromise:
> 1) Developers enable compiler and CPU do certain "reorderings"
> 2) Developers agree to follow the rules like "missing synchronization
> might screw things up"
> 3) In the result, the code gets executed faster.
>

The point is that AFAIK the same bugs that can result from reordering can
also result from other basic conditions as well. If you're writing
multithreaded code then you must handle synchronization - this is not a
reordering-specific problem. Therefore if your program is multithreaded but
doesn't do proper synchronization you have a bug - regardless of whether
its manifestation is triggered by CPU reordering or not. I admit I'm not an
expert on this and may be wrong (it would be interesting to know).


> Vladimir> Just in case: PostgreSQL does not execute "discard all" on its
> Vladimir> own.
>
> Shay> Of course it doesn't - it doesn't know anything about connection
> Shay> pooling, it only knows about physical connections. When would it execute

Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Shay Rojansky
On Sat, Aug 13, 2016 at 11:20 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Tatsuo>Interesting. What would happen if a user changes some of GUC
> parameters? Subsequent session accidentally inherits the changed GUC
> parameter?
>
> Yes, that is what happens.
> The idea is not to mess with gucs.
>

Wow... That is... insane...


> Tatsuo>There's nothing wrong with DICARD ALL
> Tatsuo>"DISCARD ALL" is perfect for this goal.
>
> It looks like you mean: "let's reset the connection state just in case".
> I see where it might help: e.g. when providing database access to random
> people who might screw a connection in every possible way.
>

It's not about random people screwing up connections, it's about
maintaining isolation between different connections... When using an
out-of-process pool (e.g. pgbouncer/pgpool), it's entirely reasonable for
more than one app to use the same pool. It's entirely possible for one
app to change a GUC and then the other app goes boom. It's especially hard
to debug too, because it depends on who got the physical connection before
you.

Even with in-process pools it's standard practice (and a good idea) to
reset state. The idea is that pooling, much like a cache (see below!), is
supposed to speed up your application without having any other visible
effects. In other words, running your app with or without a pool should be
identical except for performance aspects. If some part of a big complex
application modified state, and then some other part happens to get that
physical connection, it's extremely hard to make sense of things.

One note - in Npgsql's implementation of persistent prepared statements,
instead of sending DISCARD ALL Npgsql sends the commands listed in
https://www.postgresql.org/docs/current/static/sql-discard.html, except for
DEALLOCATE ALL. This cleans up state changes but leaves prepared statements
in place.

Just in case: do you think one should reset CPU caches, OS filesystem
> caches, DNS caches, bounce the application, and bounce the OS in addition
> to "discard all"?
> Why reset only "connection properties" when we can reset everything to its
> pristine state?
>

Um, because all the caches you mention are, well, caches - by definition
they're not supposed to have any visible impact on any application, except
for making it faster. This is somewhat similar to the CPU reordering you
keep coming back to - it's totally invisible. GUCs are the exact opposite -
you use them to modify how PostgreSQL behaves. So it makes perfect sense to
reset them.

Just in case: PostgreSQL does not execute "discard all" on its own.
>

Of course it doesn't - it doesn't know anything about connection pooling,
it only knows about physical connections. When would it execute "discard
all" on its own?


> If you mean "pgpool is exactly like reconnect to postgresql but faster
> since connection is already established", then OK, that might work in some
> cases (e.g. when response times/throughput are not important), however why
> forcing "you must always start from scratch" execution model?
>

To enforce isolation, which is maybe the most important way for programs to
be reliable - but this is a general theme which you don't seem to agree
with. Regardless, resetting state doesn't have to have a necessary effect
on response times/throughput.


> Developers are not that dumb, and they can understand that "changing gucs
> at random is bad".
>

This has nothing to do with random, developers may have a legitimate reason
to modify a GUC. In fact, GUCs are there precisely so that developers can
change them...

Also, session state is not only about GUCs. DISCARD ALL also releases
locks, resets authorization, resets sequence state (currval/nextval), etc.
etc. All these things should not leak across connections.

> Please read again. PreparedStatement is the only way to execute statements
> in the JDBC API. There's no API that allows the user to specify "use
> server-prepared here".
> Well, there's a non-prepared API in JDBC, however it misses "bind
> variables" support,
> so if bind variables are required, the developer would use PreparedStatement.

> Java's PreparedStatement does not have an option to distinguish which
> statements should be server-prepared and which should not.

This is something new - maybe it's part of the misunderstanding here. To
me, the term "prepared statements" always means "server-prepared
statements"; this seems to be supported by the documentation you quote: "If
the driver supports precompilation, the method prepareStatement will send
the statement to the database for precompilation". I don't understand the
concept of a prepared statement which isn't server-prepared - do you have
some sort of driver-only prepared statement, one that is never actually
prepared on the server?

Regardless of any optimizations you may be doing, in every database driver
I've ever seen in my life, "prepared" simply means "server-prepared". And
in every driver I've ever seen, there's an explicit API for that. Therefore
server-preparing is an explicit decision the developer makes.
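
In JDBC terms, this is the explicit API I have in mind - the developer opts
into preparation per query. A minimal sketch, assuming an open
java.sql.Connection conn and a made-up "users" table:

    // Prepare once, then reuse; in a driver that maps this 1:1 to the
    // wire protocol that's one Parse followed by Bind/Execute per lookup:
    try (PreparedStatement ps = conn.prepareStatement(
            "SELECT name FROM users WHERE id = ?")) {
        for (int id = 1; id <= 100; id++) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }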

Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Shay Rojansky
Apologies, I accidentally replied off-list, here's the response I sent.
Vladimir, I suggest you reply to this message with your own response...

On Sat, Aug 13, 2016 at 6:32 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Shay>To be honest, the mere idea of having an SQL parser inside my driver
> makes me shiver.
>
> Same for me.
> However I cannot wait for PostgreSQL 18 that does not need client-side
> parsing.
>

I'm glad we agree on something. For me the problem has nothing to do with
PostgreSQL or performance - it has to do with database APIs that impose
uniform parameter placeholder formats, and therefore force drivers to
rewrite user SQL queries. AFAIK rewriting a user's SQL for optimization is
totally out of the scope of the driver's work.

> Shay>There's nothing your driver is doing that the application developer
> can't do themselves -
> Shay>so your driver isn't faster than other drivers. It's faster only when
> used by lazy programmers.
>
> I'm afraid you do not get the point.
> ORMs like Hibernate, EclipseLink, etc send regular "insert ... values" via
> batch API.
> For the developer the only way to make use of "multivalues" is to
> implement either "ORM fix" or "the driver fix" or "postgresql fix".
>
> So the feature has very little to do with laziness of the programmers.
> Application developer just does not have full control of each SQL when
> working though ORM.
> Do you suggest "stop using ORMs"? Do you suggest fixing all the ORMs so it
> uses optimal for each DB insert statement?
> Do you suggest fixing postgresql?
>
> Once again "multivalues rewrite at pgjdbc level" enables the feature
> transparently for all the users. If PostgreSQL 10/11 would improve
> bind/exec performance, we could even drop that rewrite at pgjdbc level and
> revert to the regular flow. That would again be transparent to the
> application.
>
>
I do get the point, and in fact I myself mentioned the ORM case above as an
advantage of implicit query preparation.

First, there's nothing stopping an ORM from optimizing multiple inserts
into a single multivalue insert. I do admit I'm not aware of any who do
this, but it's a good idea for an optimization - I happen to maintain the
Entity Framework Core provider for Npgsql, I might take a look at this
optimization (so again thanks for the idea).
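
For Java readers, my understanding is that the pgjdbc-level rewrite Vladimir
describes is driven by an opt-in connection property; a rough sketch (the
reWriteBatchedInserts name is taken from the pgjdbc docs - verify it against
the version you use - and the table is made up):

    import java.sql.*;
    import java.util.Properties;

    public class BatchRewriteDemo {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "postgres");
            props.setProperty("reWriteBatchedInserts", "true");
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/test", props);
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO items (a, b, c) VALUES (?, ?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setInt(2, i * 2);
                    ps.setInt(3, i * 3);
                    ps.addBatch();
                }
                // With the rewrite enabled, the driver can fold the batch
                // into INSERT ... VALUES (?,?,?), (?,?,?), ... on the wire.
                ps.executeBatch();
            }
        }
    }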

Second, it's well-known that using an ORM almost always implies a
performance sacrifice - it's a tradeoff that's chosen when going with an
ORM. It's great that you optimize multiple inserts, but there are a myriad
of other cases where an ORM generates less efficient SQL than what would be
possible - but I don't think it makes sense for the driver to actually
contain an SQL optimizer. Slightly worse performance isn't in itself a
reason to drop ORMs: it's frequent practice to drop down to raw SQL for
performance-critical operations, etc.

But all that isn't really important - I'm going to repeat what I said
before and it would be good to get some reaction to this. Every software
component in the stack has a role, and maintaining those separations is
what keeps things simple and sane. Just for fun, we could imagine a kernel
network-stack feature which analyzes outgoing messages and optimizes them;
we could even implement your multiple insert -> multivalue insert
optimization there. This would have the advantage of working out-of-the-box
for every driver and every language out there (just like your driver
provides it transparently for all ORMs). But nobody in their right mind
would think of doing something like this, and for good reason.

The programmer's task is to write SQL, the driver's task is to communicate
that SQL via the database-specific protocol, the kernel's networking
stack's task is to transmit that protocol via TCP, etc. etc. If an ORM is
used, the programmer effectively outsources the task of writing SQL to
another component, which is supposed to do a good job about it. Once you go
about blurring all the lines here, everything becomes more complicated,
brittle and hard to debug.

For what it's worth, I can definitely imagine your kind of optimizations
occurring at some additional layer which the user would choose to use - an
intermediate SQL optimizer between application code (or ORM) and the
driver. This "SQL optimizer" layer would keep the driver itself lean and
simple (for users who *don't* want all the magic), while allowing for
transparent optimizations for ORMs. Or if the magic is implemented at the
driver level, it should be opt-in, or at least easy to disable entirely.

> Shay>are you sure there aren't "hidden" costs on the PostgreSQL side for
> generating so many implicit savepoints?
>
> Technically speaking I use the same savepoint name through bind/exec
> message.
>

Out of curiosity, I whipped up a quick benchmark (voltmeter) of the impact
of adding a savepoint before every command. Each iteration creates a
transaction and sends one Bind/Execute for "SELECT 1" (which was prepared
during setup).

Re: [HACKERS] Slowness of extended protocol

2016-08-13 Thread Shay Rojansky
Vladimir wrote:

> Shay>I don't know much about the Java world, but both pgbouncer and pgpool
> (the major pools?)
>
> In Java world, https://github.com/brettwooldridge/HikariCP is a very good
> connection pool.
> Neither pgbouncer nor pgpool is required.
> The architecture is:  application <=> HikariCP <=> pgjdbc <=> PostgreSQL
>
> The idea is HikariCP pools pgjdbc connections, and server-prepared
> statements are per-pgjdbc connection, so there is no reason to "discard
> all" or "deallocate all" or whatever.
>
> Shay>send DISCARD ALL by default. That is a fact, and it has nothing to do
> with any bugs or issues pgbouncer may have.
>
> That is configurable. If pgbouncer/pgpool defaults to "wrong
> configuration", why should we (driver developers, backend developers) try
> to accommodate that?
>

In a nutshell, that sentence represents much of the problem I have with
this conversation: you simply consider that anything that's different from
your approach is simply "wrong".

> Shay>If you do a savepoint on every single command, that surely would impact
> performance even without extra roundtrips...?
>
> My voltmeter says me that the overhead is just 3us for a simple "SELECT"
> statement (see https://github.com/pgjdbc/pgjdbc/pull/477#
> issuecomment-239579547 ).
> I think it is acceptable to have it enabled by default, however it would
> be nice if the database did not require a savepoint dance to heal its "not
> implemented" "cache query cannot change result type" error.
>

That's interesting (I don't mean that ironically). Aside from simple
client-observed speed which you're measuring, are you sure there aren't
"hidden" costs on the PostgreSQL side for generating so many implicit
savepoints? I don't know anything about PostgreSQL transaction/savepoint
internals, but adding savepoints to each and every statement in a
transaction seems like it would have some impact. It would be great to have
the confirmation of someone who really knows the internals.


> Shay>I'm not aware of other drivers that implicitly prepare statements,
> Shay> and definitely of no drivers that internally create savepoints and
> Shay> roll them back without explicit user APIs
>
> Glad to hear that.
> Are you aware of other drivers that translate "insert into table(a,b,c)
> values(?,?,?)" =into=> "insert into table(a,b,c) values(?,?,?),
> (?,?,?),...,(?,?,?)" statement on the fly?
>
> That gives 2-3x performance boost (that includes client result processing
> as well!) for batched inserts since "bind/exec" message processing is not
> that fast. That is why I say that investing into "bind/exec performance"
> makes more sense than investing into "improved execution of non-prepared
> statements".
>

I'm not aware of any, but I also don't consider that part of the driver's
job. There's nothing your driver is doing that the application developer
can't do themselves - so your driver isn't faster than other drivers. It's
faster only when used by lazy programmers.

What you're doing is optimizing developer code, with the assumption that
developers can't be trusted to code efficiently - they're going to write
bad SQL and forget to prepare their statements. That's your approach, and
that's fine. The other approach, which is also fine, is that each software
component has its own role, and that clearly-defined boundaries of
responsibility should exist - that writing SQL is the developer's job, and
that delivering it to the database is the driver's job. To me this
separation of layers/roles is key part of software engineering: it results
in simpler, leaner components which interact in predictable ways, and when
there's a problem it's easier to know exactly where it originated. To be
honest, the mere idea of having an SQL parser inside my driver makes me
shiver.

The HikariCP page you sent (thanks, didn't know it) contains a great
Dijkstra quote that summarizes what I think about this - "Simplicity is
prerequisite for reliability.". IMHO you're going the opposite way by
implicitly rewriting SQL, preparing statements and creating savepoints.

But at the end of the day it's two different philosophies. All I ask is
that you respect the one that isn't yours, which means accepting the
optimization of non-prepared statements. There's no mutual exclusion.


> 2) I don't say "it is the only true way". I would change my mind if
> someone would suggest better solution. Everybody makes mistakes, and I have
> no problem with admitting that "I made a mistake" and moving forward.
> They say: "Don't cling to a mistake just because you spent a lot of time
> making it"
>

:)

So your way isn't the only true way, but others are just clinging to their
mistakes... :)


> However, no-one did suggest a case when application issues lots of unique
> SQL statements.
>

We did, you just dismissed or ignored them.

3) For "performance related" issues, a business case and a voltmeter is
> required to prove there's an issue.

Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
On Thu, Aug 11, 2016 at 1:22 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> 2) The driver can use savepoints and autorollback to the good "right before
> failure" state in case of a known failure. Here's the implementation:
> https://github.com/pgjdbc/pgjdbc/pull/477
>
> As far as I can remember, performance overheads are close to zero (no extra
> roundtrips to create a savepoint)
>

What? Do you mean you do implicit savepoints and autorollback too? How does
the driver decide when to do a savepoint? Is it on every single command? If
not, commands can get lost when an error is raised and you automatically
roll back? If you do a savepoint on every single command, that surely would
impact performance even without extra roundtrips...?

You seem to have a very "unique" idea of what a database driver should do
under-the-hood for its users. At the very least I can say that your concept
is very far from almost any database driver I've seen up to now (PostgreSQL
JDBC, psycopg, Npgsql, libpq...). I'm not aware of other drivers that
implicitly prepare statements, and definitely of no drivers that internally
create savepoints and roll them back without explicit user APIs. At the very
least you should be aware (and also clearly admit!) that you're doing
something very different - not necessarily wrong - and not attempt to
impose your ideas on everyone as if it's the only true way to write a db
driver.

> 3) Backend could somehow invalidate prepared statements, and notify clients
> accordingly. Note: the problem is hard in a generic case, however it might
> be not that hard if we fix well-known often-used cases like "a column is
> added". That however, would add memory overheads to store additional maps
> like "table_oid -> statement_names[]"
>

Assuming your driver supports batching/pipelining (and I hope it does),
that doesn't make sense. Say I send a 2-statement batch, with the first one
a DDL and with the second one some prepared query invalidated by the first.
When your DDL is executed by PostgreSQL your hypothetical notification is
sent. But the second query, which should be invalidated, has already been
sent to the server (because of batching), and boom.
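
To spell the failure mode out as a message trace (simplified; "S_1" is an
invented statement name, and the bracketed line is the hypothetical
notification):

    Client -> Server, one pipelined write:
        Parse/Bind/Execute   ALTER TABLE t ADD COLUMN extra int
        Bind/Execute         "S_1" (a SELECT * FROM t prepared earlier)
        Sync
    Server -> Client:
        CommandComplete      (the ALTER TABLE succeeded)
        [invalidation notice for "S_1" would arrive here - too late, the
         Bind for "S_1" is already on the wire]
        ErrorResponse        "cached plan cannot change result type"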

> 4) Other. For instance, new message flow so frontend and backend could
> re-negotiate "binary vs text formats for the new resulting type". Or
> "ValidatePreparedStatement" message that would just validate the statement
> and avoid killing the transaction if the statement is invalid. Or whatever
> else there can be invented.
>

When would you send this ValidatePreparedStatement? Before each execution
as a separate roundtrip? That would kill performance. Would you send it in
the same packet before Execute? In that case you still get the error when
Execute is evaluated...

There really is no solution to this problem within the current PostgreSQL
way of doing things - although of course we could reinvent the entire
PostgreSQL protocol here to accommodate your special driver...

The basic truth is this... In every db driver I'm familiar with programmers
are expected to manage query preparation on their own. They're supposed to
do it based on knowledge only they have, weighing pros and cons. They have
responsibility over their own code and they don't outsource major decisions
like this to their driver. When they get an error from PostgreSQL, it's
triggered by something they did in a very explicit and clear way - and they
therefore have a good chance to understand what's going on. They generally
don't get errors triggered by some under-the-hood magic their driver
decided to do for them, and which are hard to diagnose and understand.
Their drivers are simple, predictable and lean, and they don't have to
include complex healing logic to deal with errors they themselves triggered
with under-the-hood logic (e.g. implicit savepoints).


> Shay>So the general point is that the existence of pools is problematic
> for the argument "always prepare for recurring statements".
>
> So what?
> Don't use pools that issue "discard all" or configure them accordingly.
> That's it.
> In Java world, no wildly used pool defaults to "discard everything"
> strategy.
>

I don't know much about the Java world, but both pgbouncer and pgpool (the
major pools?) send DISCARD ALL by default. That is a fact, and it has
nothing to do with any bugs or issues pgbouncer may have. I'm tempted to go
look at other pools in other languages but frankly I don't think that would
have any effect in this conversation...


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
On Thu, Aug 11, 2016 at 8:39 AM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

>  Shay:
>
>> Prepared statements can have very visible effects apart from the speedup
>> they provide (e.g. failure because of schema changes). It's not that
>> these effects can't be worked around - they can be - but programmers can be
>> surprised by these effects, which can cause difficult-to-diagnose issues.
>>
>
> The specific effect of "cached plan cannot change return type" can be
> solved by cooperation of backend and frontend (~driver) developers.
>

I really don't see how. As I said, an error is going to kill the ongoing
transaction, how can this be solved without application logic? Unless you
propose keeping track of all statements executed in the transaction and
resending them?

I'm asking this out of genuine interest (I might be missing something),
since I'm seriously considering implementing implicit query preparation in
Npgsql.

> I find that solving that kind of issues is more important than investing
> into "ParseBindExecDeallocateInOneGo" message.
>

I don't think it's solvable. But regardless, it's totally fine to invest in
two things - I think it's totally legitimate to invest in implicit query
preparation.


> I hope you would forgive me if I just stop the discussion here.
>

Of course, no problem.


> I find I'd better spent that time on just fixing pgbouncer issue rather
> than discussing if it is pgbouncer's or postgresql's issue.
>

The point is that it's not just a pgbouncer "bug". Connection pools like
pgbouncer (there are others) usually send DISCARD ALL on connection close,
which makes prepared statements useless for short-lived connection
scenarios (since you have to reprepare). I don't know if that's changeable
across pool implementations in general, and even if so, requires
significant user understanding for tweaking the pool. So the general point
is that the existence of pools is problematic for the argument "always
prepare for recurring statements".

I'm sorry for being impolite if I was ever.
>

I don't think you were impolite, I personally learned a few things from the
conversation and will probably implement opt-in implicit query preparation
thanks to it.


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
Vladimir wrote:

> Shay> As Tom said, if an application can benefit from preparing, the
> developer has the responsibility (and also the best knowledge) to manage
> preparation, not the driver. Magical behavior under the hood causes
> surprises, hard-to-diagnose bugs etc.
>
> Why do you do C# then?
> Aren't you supposed to love machine codes as the least magical thing?
> Even x86 does not describe "the exact way the code should be executed".
> All the CPUs shuffle the instructions to make it faster.
>

These are valid questions. I have several answers to that.

First, it depends on the extent to which the "under the hood" activity may
have visible side-effects. CPU instruction reordering is 100% invisible,
having no effect whatsoever on the application apart from speeding it up.
Prepared statements can have very visible effects apart from the speedup
they provide (e.g. failure because of schema changes). It's not that these
effects can't be worked around - they can be - but programmers can be
surprised by these effects, which can cause difficult-to-diagnose issues.
Nobody needs to be really aware of CPU instruction reordering because it
will never cause an application issue (barring a CPU bug).

Another important visible effect of preparing a statement is server-side
cost (i.e. memory). I'm not sure what the overhead is, but it is there and
an irresponsible "prepare everything everywhere" can create a significant
resource drain on your server. I know pgjdbc has knobs for controlling how
many statements are prepared, but in a large-scale performance-sensitive
app a programmer may want to manually decide, on a per-query basis, whether
they want to prepare or not.
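
For example, pgjdbc's knobs are exposed as connection properties - the names
below are from its documentation, but check the semantics and defaults
against the driver version you use:

    prepareThreshold=5                  # executions before server-preparing
    preparedStatementCacheQueries=256   # max queries cached per connection
    preparedStatementCacheSizeMiB=5     # memory cap on the statement cache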

But the most important question is that this is a choice that should be
left to the developer, rather than imposed. In some cases developers *do*
drop down to assembly, or prefer to call system calls directly to be in
control of exactly what's happening. They should be able to do this. Now,
of course you can provide an option which disables implicit preparing, but
if you refuse to optimize non-prepared paths you're effectively forcing the
programmer to go down the prepared statement path.

> Shay>As Tom said, if an application can benefit from preparing, the
> developer has the responsibility
>
> Does developer have the responsibility to choose between "index scan" and
> "table seq scan"? So your "developer has the responsibility" is like
> building on sand.
>

I'm very glad you raised that point. The programmer indeed outsources that
decision to the database, because the database has the most *knowledge* on
optimal execution (e.g. which indices are defined). Implicit preparing, on
the other hand, has nothing to do with knowledge: as Tom said, the driver
knows absolutely nothing about the application, and doesn't have any
advantage over the programmer in making that decision.

But note that the DBA *does* have to decide which indices exist and which
don't. Would you accept a database that automatically creates indices based
on use, i.e. "x queries caused full table scans, so I'll silently create an
index here"? I wouldn't, it would be utter nonsense and create lots of
unpredictability in terms of performance and resource drain.

> My experience shows that people are very bad at predicting where the
> performance problem would be.
> For 80% (or even 99%) of the cases, they just do not care thinking if a
> particular statement should be server-prepared or not.
> They have absolutely no idea how much resources it would take and so on.
>

And for those people it's great to have an implicit preparation feature,
most probably opt-in. But that shouldn't mean we shouldn't optimize things
for the rest.


> ORMs have no that notion of "this query must be server-prepared, while
> that one must not be".
> And so on.
>
> It is somewhat insane to assume people would use naked SQL. Of course they
> would use ORMs and alike, so they just would not be able to pass that
> additional parameter telling if a particular query out of thousands should
> be server-prepared or not.
>

Uh, I really wouldn't make statements like that if you want to be taken
seriously. Tons of applications use naked SQL and avoid ORMs for many
reasons (e.g. performance benefits of hand-crafted SQL), such general
statements on what applications do in the world are, well, silly. Again,
ORMs are an argument for why implicit preparation should exist (I made that
argument myself above), but they're not an argument for why optimizing
other paths should be excluded.


> Vladimir> "cached plan cannot change result type" -- PostgreSQL just fails
> to execute the server-prepared statement if a table was altered.
>
> Shay>How exactly do users cope with this in pgjdbc? Do they have some
> special API to flush (i.e. deallocate) prepared statements which they're
> supposed to use after a DDL?
>
> First of all, pgjdbc does report those problems to hackers.
> Unfortunately, it is still "not implemented".

Re: [HACKERS] Slowness of extended protocol

2016-08-10 Thread Shay Rojansky
Some comments...

For the record, I do find implicit/transparent driver-level query
preparation interesting and potentially useful, and have opened
https://github.com/npgsql/npgsql/issues/1237 to think about it - mostly
based on arguments on this thread. One big question mark I have is whether
this behavior should be opt-out as in pgjdbc, rather than opt-in. Like
others in this thread (I think), I prefer my libraries and drivers very
simple, following my exact API calls to the letter and doing minimal magic
under the hood. As Tom said, if an application can benefit from preparing,
the developer has the responsibility (and also the best knowledge) to
manage preparation, not the driver. Magical behavior under the hood causes
surprises, hard-to-diagnose bugs etc.

One interesting case where implicit preparation is unbeatable, though, is
when users aren't coding against the driver directly but are using some
layer, e.g. an ORM. For example, the Entity Framework ORM simply does not
prepare statements (see
https://github.com/aspnet/EntityFramework/issues/5459 which I opened). This
is one reason I find it a valuable feature to have, although probably as
opt-in and not opt-out.

Regarding driver-level SQL parsing, Vladimir wrote:

> Unfortunately, client-side SQL parsing is inevitable evil (see below on
> '?'), so any sane driver would cache queries any way. The ones that do not
> have query cache will perform badly anyway.

First, it's a very different thing to have a query cache in the driver, and
to implicitly prepare statements.

Second, I personally hate the idea that drivers parse SQL and rewrite
queries. It's true that in many languages database APIs have "standardized"
parameter placeholders, and therefore drivers have no choice but to
rewrite. ADO.NET (the .NET database API) actually does not do this -
parameter placeholders are completely database-dependent (see
https://msdn.microsoft.com/en-us/library/yy6y35y8%28v=vs.110%29.aspx?f=255=-2147217396).
Npgsql does rewrite queries to provide named parameters, but the next
version is going to have a "raw query" mode in which no parsing/rewriting
happens whatsoever. It simply takes your string, packs it into a Parse (or
Query), and sends it off. This is not necessarily going to be the main way
users use Npgsql, but it's important to note that query parsing and
rewriting isn't an "inevitable evil".

Vladimir wrote:

> This new message would be slower than proper query cache, so why should
> we all spend time on a half-baked solution?

The cases where the prepared statement path doesn't work have already been
listed many times - pgbouncer (and other pools), drivers which don't
support persisting prepared statement across pooled connection open/close,
and people (like many in this conversation) who don't appreciate their
drivers doing magic under the hood. This is why I think both proposals are
great - there's nothing wrong with query caching (or more precisely,
implicit statement preparation by the driver), especially if it's opt-in,
but that doesn't mean we shouldn't optimize things for people who don't,
can't or won't go for that.

> To be fair, implementing a cache is a trivial thing when compared with
> hard-coding binary/text formats for all the datatypes in each and every
> language.
> Remember, each driver has to implement its own set of procedures to
> input/output values in text/binary format, and that is a way harder than
> implementing the cache we are talking about.

I agree with that, but it's not a question of how easy it is to implement
implicit preparation. It's a question of whether driver developers choose
to do this, based on other considerations as well - many people here think
it's not the job of the driver to decide for you whether to prepare or not,
for example. It has nothing to do with implementation complexity.

> "cached plan cannot change result type" -- PostgreSQL just fails to
execute the server-prepared statement if a table was altered.

That alone seems to be a good reason to make implicit preparation opt-in at
best. How exactly do users cope with this in pgjdbc? Do they have some
special API to flush (i.e. deallocate) prepared statements which they're
supposed to use after a DDL? Do you look at the command tag in the
CommandComplete to know whether a command was DDL or not?
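
For concreteness, a minimal way to trigger that error - a sketch assuming an
open java.sql.Connection conn, a table created with "CREATE TABLE t (id
int)", and pgjdbc's default behavior of server-preparing after a few
executions:

    PreparedStatement ps = conn.prepareStatement("SELECT * FROM t");
    for (int i = 0; i < 10; i++) {
        ps.executeQuery().close(); // enough to cross pgjdbc's prepareThreshold
    }
    try (Statement ddl = conn.createStatement()) {
        ddl.execute("ALTER TABLE t ADD COLUMN extra int");
    }
    ps.executeQuery(); // fails with the "cached plan" error quoted above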


Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Shay Rojansky
On Tue, Aug 9, 2016 at 3:42 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Shay>But here's the more important general point. We're driver
> developers, not application developers. I don't really know what
> performance is "just fine" for each of my users, and what is not worth
> optimizing further. Users may follow best practices, or they may not for
> various reasons.
>
> Of course we cannot benchmark all the existing applications, however we
> should at lest try to use "close to production" benchmarks.
>
> Let me recap: even "select 1" shows clear advantage of reusing
> server-prepared statements.
> My machine shows the following results for "select 1 pgbench":
> simple: 38K ops/sec (~26us/op)
> extended: 32K ops/sec (~31us/op)
> prepared: 47K ops/sec (~21us/op)
>
> Note: reusing server-prepared statements shaves 10us (out of 31us), while
> "brand new ParseBindExecDeallocate" message would not able to perform
> better than 26us/op (that is 5 us worse than the prepared one). So it makes
> much more sense investing time in "server-prepared statement reuse" at the
> client side and "improving Bind/Exec performance" at the backend side.
>
> For more complex queries the gap (prepared vs simple) would be much bigger
> since parse/validate/plan for a complex query is much harder operation than
> the one for "select 1"
>

You seem to be misunderstanding the fundamental point here. Nobody is
saying that prepared statements aren't a huge performance booster - they
are. I recommend them to anyone who asks. But you're saying "let's not
optimize anything else", whereas there are many programs out there *not*
using prepared statements for various reasons (e.g. pgbouncer, or simply an
existing codebase). If your opinion is that nothing should be done for
these users, fine - nobody's forcing you to do anything. I simply don't see
why *not* optimize the very widely-used single-statement execution path.


> Note: I do not mean "look, prepared always win". I mean: "if your
> environment does not allow reuse of prepared statements for some reason,
> you lose huge amount of time on re-parsing the queries, and it is worth
> fixing that obvious issue first".
>

Maybe it's worth fixing it, maybe not - that's going to depend on the
application. Some applications may be huge/legacy and hard to change,
others may depend on something like pgbouncer which doesn't allow it. Other
drivers out there probably don't persist prepared statements across
close/open, making prepared statements useless for short-lived scenarios.
Does the Python driver persist prepared statements? Does the Go driver do
so? If not the single-execution flow is very relevant for optimization.


> Shay>I don't see how reusing SQL text affects security in any way.
>
> Reusing SQL text makes application more secure as "build SQL on the fly"
> is prone to SQL injection security issues.
> So reusing SQL text makes application more secure and it enables
> server-prepared statements that improve performance considerably. It is a
> win-win.
>

We've all understood that server-prepared statements are good for
performance. But they aren't more or less vulnerable to SQL injection -
developers can just as well concatenate user-provided strings into a
prepared SQL text.
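
A short illustration that the two properties are orthogonal - assuming an
open java.sql.Connection conn and an untrusted String userInput:

    // Still injectable, even though it goes through the "prepared" API:
    PreparedStatement bad = conn.prepareStatement(
        "SELECT * FROM users WHERE name = '" + userInput + "'");
    // Safe, whether or not the driver ever server-prepares it:
    PreparedStatement good = conn.prepareStatement(
        "SELECT * FROM users WHERE name = ?");
    good.setString(1, userInput);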

[deleting more comments trying to convince that prepared statements are
great for performance, which they are]

> Shay>Again, in a world where prepared statements aren't persisted across
> connections (e.g. pgbouncer)
>
> pgbouncer does not properly support named statements, and that is
> pbgouncer's issue.
>
> Here's the issue for pgbouncer project: https://github.com/
> pgbouncer/pgbouncer/issues/126#issuecomment-200900171
> The response from pgbouncer team is "all the protocol bits are there, it
> is just implementation from pgbouncer that is missing".
>
> By the way: I do not use pgbouncer, thus there's no much interest for me
> to invest time in fixing pgbouncer's issues.
>

Um, OK... So you're not at all bothered by the fact that the (probably)
most popular PostgreSQL connection pool is incompatible with your argument?
I'm trying to think about actual users and the actual software they use, so
pgbouncer is very relevant.

> Shay>Any scenario where you open a relatively short-lived connection and
> execute something once is problematic - imagine a simple web service which
> needs to insert a single record into a table.
>
> I would assume the application does not use random string for a table name
> (and columns/aliases), thus it would result in typical SQL text reuse, thus
> it should trigger "server-side statement prepare" logic. In other way, that
> kind of application does not need the "new ParseBindExecDeallocate message
> we are talking about".
>

I wouldn't assume anything. Maybe the application does want to change
column names (which can't be parameterized). Maybe it needs the extended
protocol simply because it wants to do binary encoding, which isn't
possible with the simple protocol.

Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Shay Rojansky
On Tue, Aug 9, 2016 at 8:50 AM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Shay>There are many scenarios where connections are very short-lived
> (think about webapps where a pooled connection is allocated per-request and
> reset in between)
>
> Why the connection is reset in between in the first place?
> In pgjdbc we do not reset per-connection statement cache, thus we easily
> reuse named statements for pooled connections.
>

A DISCARD ALL is sent when the connection is returned to the pool, to
prevent state leakage etc. You make a valid comment though - there's
already a new feature in the Npgsql dev branch which allows prepared
statements to be persisted across open/close on pooled connections, it's
interesting to learn that it is standard behavior in pgjdbc. At least up to
now, the logic was not to implicitly keep holding server-side resources
across a pooled connection close, because these may become a drain on the
server etc.

More important, unless I'm mistaken pgbouncer also sends DISCARD ALL to
clean the connection state, as will other pooling solutions. That
unfortunately means that you can't always depend on prepared statements to
persist after closing the connection.


> Shay>There are also many scenarios where you're not necessarily going to
> send the same query multiple times in a single connection lifespan, so
> preparing is again out of the question.
>

> Can you list at least one scenario of that kind, so we can code it into
> pgbench (or alike) and validate "simple vs prepared" performance?
>

Again, in a world where prepared statements aren't persisted across
connections (e.g. pgbouncer), this scenario is extremely common. Any
scenario where you open a relatively short-lived connection and execute
something once is problematic - imagine a simple web service which needs to
insert a single record into a table.

> Shay>And more generally, there's no reason for a basic, non-prepared
> execution to be slower than it can be.
>
> That's too generic. If the performance for "end-to-end cases" is just
> fine, then it is not worth optimizing further. Typical application best
> practice is to reuse SQL text (for both security and performance point of
> views), so in typical applications I've seen, query text was reused, thus
> it naturally was handled by server-prepared logic.
>

I don't see how reusing SQL text affects security in any way.

But here's the more important general point. We're driver developers, not
application developers. I don't really know what performance is "just fine"
for each of my users, and what is not worth optimizing further. Users may
follow best practices, or they may not for various reasons. They may be
porting code over from SqlServer, where prepare is almost never used
(because SqlServer caches plans implicitly), or they may simply not be very
good programmers. The API for executing non-prepared statements is there,
we support it and PostgreSQL supports it - it just happens to not be very
fast. Of course we can say "screw everyone not preparing statements", but
that doesn't seem like a very good way to treat users. Especially since the
fix isn't that hard.

> Let me highlight another direction: current execution of server-prepared
> statement requires some copying of "parse tree" (or whatever). I bet it
> would be much better investing in removal of that copying rather than
> investing into "make one-time queries faster" thing. If we could make
> "Exec" processing faster, it would immediately improve tons of applications.
>

I don't understand what exactly you're proposing here, but if you have some
unrelated optimization that would speed up prepared statements, by all
means that's great. It's just unrelated to this thread.


> Shay>Of course we can choose a different query to benchmark instead of
> SELECT 1 - feel free to propose one (or several).
>
> I've tried pgbench -M prepared, and it is way faster than pgbench -M
> simple.
>
> Once again: all cases I have in mind would benefit from reusing
> server-prepared statements. In other words, after some warmup the
> appication would use just Bind-Execute-Sync kind of messages, and it would
> completely avoid Parse/Describe ones.
>

Of course that's the ideal scenario. It's just not the *only* scenario for
all users - they may either not have prepared statements persisting across
open/close as detailed above, or their code may simply not be preparing
statements at the moment. Why not help them out for free?

> Shay>FYI in Npgsql specifically describe isn't used to get any knowledge
> about parameters - users must populate the correct parameters or query
> execution fails.
>
> I think the main reason to describe for pgjdbc is to get result oids.
> pgjdbc is not "full binary", thus it has to be careful which fields it
> requests in binary format.
> That indeed slows down "unknown queries", but as the query gets reused,
> pgjdbc switches to server-prepared execution, and eliminates parse-describe
> overheads.

Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Shay Rojansky
On Mon, Aug 8, 2016 at 6:44 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:
>
> The problem with "empty statement name" is statements with empty name can
>> be reused (for instance, for batch insert executions), so the server side
>> has to do a defensive copy (it cannot predict how many times this unnamed
>> statement will be used).
>>
>
That seems right.

>> Also, part of the point here is to reduce the number of protocol messages
>> needed in order to send a parameterized query - not to have to do
>> Parse/Describe/Bind/Execute/Sync - since part of the slowdown comes from
>> that (although I'm not sure how much). Your proposal keeps the 5 messages.
>>
>
> As my benchmarks show, notable overhead is due to "defensive copying of
> the execution plan". So I would measure first, and only then would claim
> where the overhead is.
>

> Some more profiling is required to tell which part is a main time consumer.
>

Tom also pointed to the caching as the main culprit, although there seems
to be some message-related overhead as well. It seems that things like
process title changes may be easily fixable - do we really need a process
title change on every message (as opposed to, say, on Execute messages only)?
This profiling and optimization effort can happen in parallel to the
discussion on what to do with the execution plan caching.


> Technically speaking, I would prefer to have a more real-life looking
> example instead of SELECT 1.
> Do you have something in mind?
> For instance, for more complex queries, "Parse/Plan" could cost much more
> than we shave by adding "a special non-cached statement name" or by
> reducing "5 messages into 1".
>
> There's a side problem: describe message requires full roundtrip since
> there are cases when client needs to know how to encode parameters.
> Additional roundtrip hurts much worse than just an additional message that
> is pipelined (e.g. sent in the same TCP packet).
>

This is true, but there doesn't seem to be anything we can do about it - if
your usage relies on describe to get information on parameters (as opposed
to results), you're stuck with an extra roundtrip no matter what. So it
seems you have to use the extended protocol anyway.

FYI in Npgsql specifically describe isn't used to get any knowledge about
parameters - users must populate the correct parameters or query execution
fails.


>>> Note: it is quite easy to invent a name that is not yet used in the wild,
>>> so it is safe.
>>>
>>
>> That's problematic, how do you know what's being used in the wild and
>> what isn't? The protocol has a specification, it's very problematic to get
>> up one day and to change it retroactively. But again, the empty statement
>> seems to already be there for that.
>>
>
> Empty statement has different semantics, and it is wildly used.
> For instance, pgjdbc uses unnamed statements a lot.
> On the other hand, statement name of "!pq@#!@#42" is rather safe to use
> as a special case.
> Note: statement names are not typically created by humans (statement name
> is not a SQL), and very little PG clients do support named statements.
>

IMHO this simply isn't the kind of thing one does in a serious protocol of
a widely-used product, others seem to agree on this.

> Sir, any new SQL keyword is what you call a "retroactively defining
> special semantics".
> It's obvious that very little current clients do use named
> server-prepared statements.
> Statement names are not something that is provided by the end-user in a
> web page, so it is not a rocket science to come up with a
> statement name that is both short and "never ever used in the wild".

The difference is that before the new SQL keyword is added, trying to use
it results in an error. What you're proposing is taking something that
already works in one way and changing its behavior.

> Shay, can you come up with a real-life use case when those "I claim the
> statement will be used only once" would indeed improve performance?
> Or, to put it in another way: "do you have a real-life case when simple
> protocol is faster than extended protocol with statement reuse"?

> I do have a couple of Java applications and it turns out there's a huge
> win of reusing server-prepared statements.
> There's a problem that "generic plan after 5th execution might be much
> worse than a specific one", however those statements are not frequent
> and I just put hints to the SQL (limit 0, +0, CTE, those kind of things).

I maintain Npgsql, which is a general-purpose database driver and not a
specific application. The general .NET database API (ADO.NET), like most
(all?) DB APIs, allows users to send a simple statement to the database
(ExecuteReader, ExecuteScalar, ExecuteNonQuery). Every time a user uses
these APIs without preparing, they pay a performance penalty because the
extended protocol has more overhead than the simple one.

Obviously smart use of prepared statements is a great idea, but it doesn't
work everywhere. There are many scenarios where it simply isn't an option.

Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Shay Rojansky
Vladimir wrote:


>>> On the other hand, usage of some well-defined statement name to trigger
>>> the special case would be fine: all pgbouncer versions would pass those
>>> parse/bind/exec message as if it were regular messages.
>>>
>>
>> Can you elaborate on what that means exactly? Are you proposing to
>> somehow piggyback on an existing message (e.g. Parse) but use some special
>> statement name that would make PostgreSQL interpret it as a different
>> message type?
>>
>
> Exactly.
> For instance: if client sends Parse(statementName=I_swear_
> the_statement_will_be_used_only_once), then the subsequent message must
> be BindMessage, and the subsequent must be ExecMessage for exactly the same
> statement id.
>

Ah, I understand the proposal better now - you're not proposing encoding a
new message type in an old one, but rather a magic statement name in Parse
which triggers an optimized processing path in PostgreSQL, that wouldn't go
through the query cache etc.

If so, isn't that what the empty statement is already supposed to do? I
know there's already some optimizations in place around the scenario of
empty statement name (and empty portal).

Also, part of the point here is to reduce the number of protocol messages
needed in order to send a parameterized query - not to have to do
Parse/Describe/Bind/Execute/Sync - since part of the slowdown comes from
that (although I'm not sure how much). Your proposal keeps the 5 messages.
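
For reference, the frontend messages per query in the two modes:

    Simple protocol:    Query
    Extended protocol:  Parse, Describe, Bind, Execute, Sync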

Again, if it's possible to make "Parse/Describe/Bind/Execute/Sync" perform
close to Query, e.g. when specifying empty statement/portal, that's
obviously the best thing here. But people seem to be suggesting that a
significant part of the overhead comes from the fact that there are 5
messages, meaning there's no way to optimize this without a new message
type.

Note: it is quite easy to invent a name that is not yet used in the wild,
> so it is safe.
>

That's problematic, how do you know what's being used in the wild and what
isn't? The protocol has a specification, it's very problematic to get up
one day and to change it retroactively. But again, the empty statement
seems to already be there for that.


Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Shay Rojansky
>
>> We could call this "protocol 3.1" since it doesn't break backwards
>> compatibility (no incompatible server-initiated message changes, but it
>> does include a feature that won't be supported by servers which only
>> support 3.0). This could be a sort of "semantic versioning" for the protocol
>> - optional new client-initiated features are a minor version bump, others
>> are a major version bump...
>>
>
> Adding a new message is not backward compatible since it will fail in
> pgbouncer kind of deployments.
> Suppose there's "new backend", "old pgbouncer", "new client" deployment.
> If the client tries to send the new message, it will fail since pgbouncer
> would have no idea what to do with that new message.
>

That's definitely a valid point. But do you think it's a strong enough
argument to avoid ever adding new messages?


> On the other hand, usage of some well-defined statement name to trigger
> the special case would be fine: all pgbouncer versions would pass those
> parse/bind/exec message as if it were regular messages.
>

Can you elaborate on what that means exactly? Are you proposing to somehow
piggyback on an existing message (e.g. Parse) but use some special
statement name that would make PostgreSQL interpret it as a different
message type? Apart from being a pretty horrible hack, it would still break
pgbouncer, which has to actually inspect and understand SQL being sent to
the database (e.g. to know when transactions start and stop).


Re: [HACKERS] Slowness of extended protocol

2016-08-07 Thread Shay Rojansky
On Sun, Aug 7, 2016 at 6:11 PM, Robert Haas  wrote:

> > I'm glad reducing the overhead of out-of-line parameters seems like an
> > important goal. FWIW, if as Vladimir seems to suggest, it's possible to
> > bring down the overhead of the v3 extended protocol to somewhere near the
> > simple protocol, that would obviously be a better solution - it would mean
> > things work faster here and now, rather than waiting for the v4 protocol. I
> > have no idea if that's possible though, I'll see if I can spend some time on
> > understanding where the slowdown comes from.
>
> That having been said, I don't really see a reason why we couldn't
> introduce a new protocol message for this without bumping the protocol
> version.  Clients who don't know about the new message type just won't
> use it; nothing breaks.  Clients who do know about the new message
> need to be careful not to send it to older servers, but since the
> server reports its version to the client before the first opportunity
> to send queries, that shouldn't be too hard.  We could add a new
> interface to libpq that uses the new protocol message on new servers
> and falls back to the existing extended protocol on older servers.  In
> general, it seems to me that we only need to bump the protocol version
> if there will be server-initiated communication that is incompatible
> with existing clients.  Anything that the client can choose to
> initiate (or not) based on the server version should be OK.


That sounds right to me. As you say, the server version is sent early in
the startup phase, before any queries are sent to the backend, so frontends
know which server they're communicating with.

We could call this "protocol 3.1" since it doesn't break backwards
compatibility (no incompatible server-initiated message changes, but it
does include a feature that won't be supported by servers which only
support 3.0). This could be a sort of "semantic versioning" for the protocol
- optional new client-initiated features are a minor version bump, others
are a major version bump...

This new client-initiated message would be similar to query, except that it
would include the parameter and result-related fields from Bind. The
responses would be identical to the responses for the Query message.

Does this make sense?
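
Purely to make that concrete - every name and field below is invented, this
is not a worked-out proposal:

    QueryWithParams (hypothetical frontend message)
        query string                       as in Query
        parameter format codes and values  as in Bind
        result-column format codes         as in Bind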


Re: [HACKERS] Slowness of extended protocol

2016-08-05 Thread Shay Rojansky
>
> > I really don't get what's problematic with posting a message on a mailing
> > list about a potential performance issue, to try to get people's reactions,
> > without diving into profiling right away. I'm not a PostgreSQL developer,
> > have other urgent things to do and don't even spend most of my programming
> > time in C.
>
> There's absolutely nothing wrong with that.  I find your questions
> helpful and interesting and I hope you will keep asking them.  I think
> that they are a valuable contribution to the discussion on this list.
>

Thanks for the positive feedback Robert.

I'm glad reducing the overhead of out-of-line parameters seems like an
important goal. FWIW, if as Vladimir seems to suggest, it's possible to
bring down the overhead of the v3 extended protocol to somewhere near the
simple protocol, that would obviously be a better solution - it would mean
things work faster here and now, rather than waiting for the v4 protocol. I
have no idea if that's possible though, I'll see if I can spend some time
on understanding where the slowdown comes from.


Re: [HACKERS] Slowness of extended protocol

2016-08-02 Thread Shay Rojansky
On Mon, Aug 1, 2016 at 12:12 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> The attached patch passes `make check` and it gains 31221 -> 33547
improvement for "extended pgbench of SELECT 1".
>
> The same version gains 35682 in "simple" mode, and "prepared" mode
> achieves 46367 (just in case).

That's great, thanks for looking into it! I hope your patch gets merged.

> Shay, why don't you use a profiler? Seriously.
> I'm afraid "iterate the per-message loop in PostgresMain five times not
once" /"just discussing what may or may not be a problem..."  is just
hand-waving.
>
> Come on, it is not that hard.

I really don't get what's problematic with posting a message on a mailing
list about a potential performance issue, to try to get people's reactions,
without diving into profiling right away. I'm not a PostgreSQL developer,
have other urgent things to do and don't even spend most of my programming
time in C.


Re: [HACKERS] Slowness of extended protocol

2016-08-01 Thread Shay Rojansky
Greg wrote:
> I think you're looking at this the wrong way around. 30% of what?
> You're doing these simple read-only selects on a database that
> obviously is entirely in RAM. If you do the math on the numbers you
> gave above the simple protocol took 678 microseconds per transaction
> and the extended protocol took 876 microseconds. The difference is 198
> microseconds. I'm not sure exactly where those 200us are going and
> perhaps it could be lower but in what real-world query is it going to
> have a measurable impact on the total time?

That's a valid question, but as Andres said, it may not matter for most
workloads but I think such a significant difference would matter to some...
Keep in mind that I'm writing from the point of view of a driver developer,
and not of a specific app - I know there are apps out there executing
against a local database and trying to get extremely high throughput, for
RAM-reading queries or otherwise.

> The other danger in unrealistic test cases is that you're probably
> measuring work that doesn't scale and in fact optimizing based on it
> could impose a cost that *does* scale. For example if 150us of that
> time is being spent in the prepare and we cut that down by a factor of
> 10 to 15us then it would be only a 10% penalty over the simple
> protocol in your test. But if that optimization added any overhead to
> the execute stage then when it's executed thousands of times that
> could add milliseconds to the total runtime.

I think it's a bit too early to say that... We're not discussing any
proposed optimizations yet, just discussing what may or may not be a
problem... Of course any proposed optimization would have to be carefully
studied to make sure it doesn't cause performance degradation elsewhere.

Andres wrote:
> Shay, are you using unnamed or named portals? There's already a shortcut
> path for the former in some places.

The benchmarks I posted are simply pgbench doing SELECT 1 with extended vs.
simple, so I'm assuming unnamed portals. Npgsql itself only uses the
unnamed portal, I think it's documented somewhere that this is better for
performance.

Tom wrote:
> In hindsight it seems clear that what a lot of apps want out of extended
> protocol is only the ability to send parameter values out-of-line instead
> of having to quote/escape them into SQL literals.  Maybe an idea for the
> fabled V4 protocol update is some compromise query type that corresponds
> precisely to PQexecParams's feature set: you can send parameter values
> out-of-line, and you can specify text or binary results, but there's no
> notion of any persistent state being created and no feedback about
> parameter data types.

That seems like a good way forward. It may be possible to generalize this
into a more "pay-per-play" protocol. You currently have a binary choice
between a simple but fast protocol supporting very little, and an extended
but slow protocol supporting everything. Making things more "pick and
choose" could help here: if you want to actually use plan reuse, you pay
for that. If you actually send parameters, you pay for that. It would be a
pretty significant protocol change but it would make things more modular
that way.

>  I think the tie-in to the plan cache is a
> significant part of the added overhead, and so is the fact that we have to
> iterate the per-message loop in PostgresMain five times not once, with
> overheads like updating the process title incurred several times in that.

I was thinking that something like that may be the cause. Is it worth
looking into the loop and trying to optimize? For example, updating the
process title doesn't seem to make sense for every single extended
message...


Re: [HACKERS] Slowness of extended protocol

2016-07-31 Thread Shay Rojansky
>
> Shay Rojansky <r...@roji.org>:
>
>> I'm well aware of how the extended protocol works, but it seems odd for a
>> 30% increase in processing time to be the result exclusively of processing
>> 5 messages instead of just 1 - it doesn't seem like that big a deal
>> (although I may be mistaken). I was imagining that there's something more
>> fundamental in how the protocol or PostgreSQL state is managed internally,
>> that would be responsible for the slowdown.
>>
>
> Hi, have you tried to use a profiler to identify the _cause_ of the
> difference in performance?
>
> Here's relevant read:
> https://shipilev.net/blog/2015/voltmeter/#_english_version
>

I'm definitely not at a stage where I'm interested in the cause of the
difference. I'm not a PostgreSQL hacker, and I'm not going into the source
code to try and optimize anything (not yet anyway). For now, I'm just
looking to get a high-level picture of the situation and to inform people
that there may be an issue.

Or in terms of your article, I'm plugging a light bulb into the wall socket
and the light is dim, so I'm trying to ask the building electricity team if
they're aware of it, if it's a fixable situation and if there are plans
to fix it - before poking my fingers into some electricity cabinet I
don't know.


Re: [HACKERS] Slowness of extended protocol

2016-07-31 Thread Shay Rojansky
>
> Without re-using prepared statements or portals, extended protocol is
> always slow because it requires more messages exchanges than simple
> protocol. In pgbench case, it always sends parse, bind, describe,
> execute and sync message in each transaction even if each transaction
> involves identical statement ("SELECT 1" in your case).
>
> See the manual for the protocol details.
>

I'm well aware of how the extended protocol works, but it seems odd for a
30% increase in processing time to be the result exclusively of processing
5 messages instead of just 1 - it doesn't seem like that big a deal
(although I may be mistaken). I was imagining that there's something more
fundamental in how the protocol or PostgreSQL state is managed internally,
that would be responsible for the slowdown.


[HACKERS] Slowness of extended protocol

2016-07-31 Thread Shay Rojansky
Hi all. I know this has been discussed before, I'd like to know what's the
current position on this.

Comparing the performance of the simple vs. extended protocols with pgbench
yields some extreme results:

$ ./pgbench -T 10 -S -M simple -f /tmp/pgbench.sql pgbench
tps = 14739.803253 (excluding connections establishing)

$ ./pgbench -T 10 -S -M extended -f /tmp/pgbench.sql pgbench
tps = 11407.012679 (excluding connections establishing)

(pgbench.sql contains a minimal SELECT 1, I'm running against localhost)

I was aware that there's some overhead associated with the extended
protocol, but not that it was a 30% difference... My question is whether
there are good reasons why this should be so, or rather that this simply
hasn't been optimized yet. If it's the latter, are there plans to do so?

To give some context, I maintain Npgsql, the open-source .NET driver for
PostgreSQL. Since its recent versions, Npgsql uses the extended protocol
almost exclusively, mainly because it uses binary data rather than text.
Even if
that weren't the case, imposing such a performance penalty on extended-only
features (parameters, prepared statements) seems problematic.

I'm aware that testing against localhost inflates the performance issue -
taking into account the latency of a remote server, the simple/extended
difference would be much less significant. But the issue still seems to be
relevant.


Re: [HACKERS] Binary I/O for isn extension

2016-05-31 Thread Shay Rojansky
>
> I added this patch to the next CF (2016-09) under "Miscellaneous".
>

Thanks!


> Out of curiosity, what is the motivation?


I'm the owner of Npgsql, the open-source .NET driver for PostgreSQL, which
is a binary-first driver. That is, working with types that have no binary
I/O is possible but awkward.

I hope the general direction is (or will be) to have binary I/O for all
supported types, both for compatibility with binary-first consumers such as
Npgsql and for general efficiency.


Re: [HACKERS] Binary I/O for isn extension

2016-05-31 Thread Shay Rojansky
>
> When adding new functions to an extension you need to bump the version of
> the extension by renaming the file, updating the .control file, creating an
> upgrade script, and updating the Makefile to include the new files.


Thanks for the guidance, I'll fix all that and resubmit a patch.


[HACKERS] Binary I/O for isn extension

2016-05-31 Thread Shay Rojansky
Hi.

Attached is a small patch which adds binary input/output for the types
added by the isn extension.

Shay


isn-binary.patch
Description: Binary data
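
(For anyone reading this in the archives without the attachment: the
functions follow the usual send/recv pattern. Here's a minimal sketch for
one of the types, assuming the int8-sized internal representation the isn
types use - the actual patch may differ in details.)

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

PG_FUNCTION_INFO_V1(ean13_recv);
PG_FUNCTION_INFO_V1(ean13_send);

/* Binary input: read the 8-byte internal form off the wire. */
Datum
ean13_recv(PG_FUNCTION_ARGS)
{
    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);

    PG_RETURN_INT64(pq_getmsgint64(buf));
}

/* Binary output: emit the 8-byte internal form. */
Datum
ean13_send(PG_FUNCTION_ARGS)
{
    int64       val = PG_GETARG_INT64(0);
    StringInfoData buf;

    pq_begintypsend(&buf);
    pq_sendint64(&buf, val);
    PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}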



Re: [HACKERS] Parameters don't work in FETCH NEXT clause?

2016-05-17 Thread Shay Rojansky
>
> Would something like this be valid?
>
> OFFSET { start_literal | ( start_expression ) } { ROW | ROWS }
> FETCH { FIRST | NEXT} [ count_literal | ( count_expression ) ] { ROW |
> ROWS } ONLY
>
> Leaving the mandatory parentheses detail to the description, while
> adequate, seems insufficient - especially when a normal LIMIT expression is
> not so restricted.
>
> And don't you think the section header would be more accurately named:
>
> Limit, Offset & Fetch Clauses
>
> The nuance regarding "different standard syntax" is unknown to the reader
> who first looks at the syntax and sees three different lines, one for each
> clause, and then scrolls down looking at headers until they find the
> section for the clause they are interested in.  That FETCH is an alias for
> LIMIT is not something that I immediately understood - though to be honest
> I don't think I comprehended the presence of FETCH on a SELECT query at all
> and thought it only pertained to cursors
>
>
All these suggestions would definitely have saved me (and therefore this
list!) some time.


Re: [HACKERS] Parameters don't work in FETCH NEXT clause?

2016-05-17 Thread Shay Rojansky
Apologies, as usual I didn't read the docs carefully enough.
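
For the record, wrapping the count in parentheses as the docs describe
indeed works:

SELECT 1 FETCH NEXT ($1) ROWS ONLY;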

On Tue, May 17, 2016 at 7:13 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Shay Rojansky <r...@roji.org> writes:
> > A user of mine just raised a strange issue... While it is possible to
> use a
> > parameter in a LIMIT clause, PostgreSQL does not seem to allow using one
> in
> > a FETCH NEXT clause. In other words, while the following works:
> > SELECT 1 LIMIT $1;
> > The following generates a syntax error:
> > SELECT 1 FETCH NEXT $1 ROWS ONLY;
> > Since LIMIT and FETCH NEXT are supposed to be equivalent this behavior is
> > odd.
>
> Per the SELECT reference page:
>
> SQL:2008 introduced a different syntax to achieve the same result,
> which PostgreSQL also supports. It is:
>
> OFFSET start { ROW | ROWS }
> FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
>
> In this syntax, to write anything except a simple integer constant for
> start or count, you must write parentheses around it.
>
> The comments about this in gram.y are informative:
>
>  * Allowing full expressions without parentheses causes various parsing
>  * problems with the trailing ROW/ROWS key words.  SQL only calls for
>  * constants, so we allow the rest only with parentheses.  If omitted,
>  * default to 1.
>
> regards, tom lane
>


[HACKERS] Parameters don't work in FETCH NEXT clause?

2016-05-17 Thread Shay Rojansky
A user of mine just raised a strange issue... While it is possible to use a
parameter in a LIMIT clause, PostgreSQL does not seem to allow using one in
a FETCH NEXT clause. In other words, while the following works:

SELECT 1 LIMIT $1;

The following generates a syntax error:

SELECT 1 FETCH NEXT $1 ROWS ONLY;

Since LIMIT and FETCH NEXT are supposed to be equivalent this behavior is
odd.

More generally, is there some documentation on where exactly PostgreSQL
allows parameters and where it doesn't? I occasionally get complaints from
users expecting parameters to work in DDL, or even in table/column names in
SELECT queries... I haven't seen a resource like this.


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Shay Rojansky
>
> We really do need "cancel up to" semantics for reliable behavior.
> Consider the case where the client has sent the query (or thinks it has)
> but the server hasn't received it yet.  If the cancel request can arrive
> at the server before the query fully arrives, and we don't have "cancel
> all messages up through N" semantics, the cancel will not do what the
> client expects it to.
>

Keep in mind that in the case of a cancellation arriving really too early,
i.e. before any messages have been received by the server, it will be
totally ignored since at the time of reception there's nothing for the
server to cancel yet. This may seem a bit exotic, although if you really
want to provide air-tight cancellation semantics you could have the server
track unfulfilled cancellation requests. In other words, if the server
receives "cancel up to X" and is now processing X-5, the cancellation
request is kept in memory until X has been duly cancelled.


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-25 Thread Shay Rojansky
>
> I definitely agree that simply tracking message sequence numbers on both
>> sides is better. It's also a powerful feature to be able to cancel all
>> messages "up to N" - I'm thinking of a scenario where, for example, many
>> simple queries are sent and the whole process needs to be cancelled.
>>
>
> For security, I think any non-matching cancellation would be ignored so
> that only someone with proper context could issue the cancel.
>

Isn't security taken care of by the secret key?


> Issuing bulk cancellations sounds like a bad plan.
>

I'm not sure why, but at the very least it's a useful thing to have when
batching several statements together and then wanting to cancel them all.


> Yes, this has been happening to some Npgsql users, it's not very frequent
>> but it does happen from time to time. I also bumped into this in some
>> automated testing scenarios. It's not the highest-value feature, but it is
>> an improvement to have if you plan on working on a new protocol version.
>>
>> Let me know if you'd like me to update the TODO.
>>
>
> If you've got an itch, expecting someone else to scratch it is less likely
> to succeed.
>

Sure. I'd consider sending in a patch, but as this is a protocol-changing
feature it seems like working on this before the team "officially" starts
working on a new protocol might be a waste of time. Once there's critical
mass for a new protocol and agreement that PostgreSQL is going for it I'd
be happy to work on it.

BTW it seems it's no longer possible for anyone to add things to the TODO
(no editor privileges).


Re: [HACKERS] Proposed change to make cancellations safe

2016-04-24 Thread Shay Rojansky
I definitely agree that simply tracking message sequence numbers on both
sides is better. It's also a powerful feature to be able to cancel all
messages "up to N" - I'm thinking of a scenario where, for example, many
simple queries are sent and the whole process needs to be cancelled.

Yes, this has been happening to some Npgsql users, it's not very frequent
but it does happen from time to time. I also bumped into this in some
automated testing scenarios. It's not the highest-value feature, but it is
an improvement to have if you plan on working on a new protocol version.

Let me know if you'd like me to update the TODO.

On Sun, Apr 24, 2016 at 6:11 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Shay Rojansky <r...@roji.org> writes:
> > The issue I'd like to tackle is the fact that it's not possible to make
> > sure a cancellation request affects a specific query.
>
> Right ...
>
> > A simple fix for this would be to have a sequence number returned in the
> > BindComplete message.
>
> First, that doesn't fix anything for simple query protocol, which doesn't
> use such a message.
>
> Second, making sure that the BindComplete gets delivered soon enough to be
> useful would require issuing a Sync between Bind and Execute.  At the
> least that doubles the number of network packets for small query results
> (so I dispute your assumption that the added network load would be
> negligible).  But a bigger issue is that it would have subtle effects
> on error handling.  An application could not blindly fire out
> Parse/Bind/Sync/Execute/Sync as one packet; it would have to wait for
> the Bind result to come back before it could know if it's safe to send
> Execute, unlike the case where it sends Parse/Bind/Execute/Sync.  (The
> server's skip-to-Sync-after-error rule is critical here.)  So actually,
> making use of this feature would double the number of network round trips,
> as well as requiring carefully thought-through changes to application
> error handling.
>
> Third, if a query is not cancellable till Execute, that prevents
> cancellation of scenarios where the parser or planner takes a long time
> for some reason.  Long planner runtimes are certainly not uncommon.
>
>
> So this is worth thinking about, but that doesn't sound like much of
> a solution to me.
>
> I wonder though why we need the server to tell the client a sequence
> number at all.  Surely both sides of the connection could easily count
> the number of messages the client has transmitted.  We could simply extend
> the cancel packet to include a sequence number field, with the semantics
> "this should only take effect if you're still working on message N".
> Or I guess maybe it should read "working on message N or before", so
> that in a case like Parse/Bind/Execute/Sync you would mention the number
> of the Execute message and still get cancellation if the query was hung
> up in Parse.
>
> Have you seen this to be a problem in practice, or is it just
> theoretical?  I do not recall many, if any, field complaints
> about the issue.
>
> regards, tom lane
>


[HACKERS] Proposed change to make cancellations safe

2016-04-24 Thread Shay Rojansky
Hi.

A while ago I discussed some reliability issues when using cancellations (
http://www.postgresql.org/message-id/CADT4RqAk0E10=9ba8v+uu0dq9tr+pn8x+ptqbxfc1fbivh3...@mail.gmail.com).
Since we were discussing some protocol wire changes recently I'd like to
propose one to help with that.

The issue I'd like to tackle is the fact that it's not possible to make
sure a cancellation request affects a specific query. Since cancellations
are totally asynchronous and occur on another socket, they affect whatever
query is currently being processed. This makes it possible to inadvertently
kill a later query: by the time the cancellation request is delivered and
"applied", the intended query has already completed and a later one gets
killed.

A simple fix for this would be to have a sequence number returned in the
BindComplete message. When submitting a cancellation request, the frontend
would have the option (but not the obligation) to pass such a sequence
number to the backend. When cancelling, the backend would make sure the
sequence number corresponds to the currently running query.

The only drawback seems to be the increased payload of the BindComplete
message (4 bytes, or possibly less if we're really worried about it).
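
To illustrate, the cancel request packet (see src/include/libpq/pqcomm.h)
would grow a single field - just a sketch, and the new field's name is
made up:

typedef struct CancelRequestPacket
{
    /* Note that each field is stored in network byte order! */
    MsgType     cancelRequestCode;  /* special code to identify a cancel */
    int32       backendPID;        /* PID of client's backend */
    int32       cancelAuthCode;    /* secret key to authorize cancel */
    int32       querySeqNo;        /* NEW: sequence number of the query
                                    * this cancellation is aimed at */
} CancelRequestPacket;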

If this makes sense, I'll add it to the TODO wiki.

Shay


[HACKERS] Wire protocol compression

2016-04-21 Thread Shay Rojansky
I know this has been discussed before (
http://postgresql.nabble.com/Compression-on-SSL-links-td2261205.html,
http://www.postgresql.org/message-id/BANLkTi=Ba1ZCmBuwwn7M1wvPFioT=6n...@mail.gmail.com),
but it seems to make sense to revisit this in 2016.

Since CRIME in 2012, AFAIK compression with encryption is considered
insecure, and the feature is removed entirely in the TLS 1.3 draft. In
addition (and because of that), many (most?) client TLS implementations
don't support compression (Java, .NET), meaning that a considerable number
of PostgreSQL users don't have access to compression.

Does it make sense to you guys to discuss compression outside of TLS? There
are potentially huge bandwidth savings which could benefit both WAN and
non-WAN scenarios, and decoupling this problem from TLS would make it both
accessible to everyone (assuming PostgreSQL clients follow). It would be a
protocol change though.

Shay


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2016-01-01 Thread Shay Rojansky
>
> On googling, it seems this is related to .Net framework compatibility. I am
> using .Net Framework 4 to build the program.cs and that is what I have
> on my m/c.  Are you using the same for Npgsql or some different version?
>

That is probably the problem. Npgsql 3.0 is only available for .NET
Framework 4.5 and above; you should be able to build and run the sample
with VS2013 or VS2015 (note that Microsoft releases totally free community
editions of these which you can use). Let me know if you run into any
issues.

Regardless, apologies that I haven't had more time to work on this recently.
I now have debug builds of 9.5rc1 (with the bug) and 9.4.5 (without the bug)
and can reliably debug through the application as the error occurs. I will
probably have some time tomorrow or Monday to dive into this. Amit, it
would be great if you could confirm you're seeing the error too.


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> > The backends seem to hang when the client closes a socket without first
> > sending a Terminate message - some of the tests make this happen. I've
> > confirmed this happens with 9.5rc1 running on Windows (versions 10 and
> 7),
> > but this does not occur on Ubuntu 15.10. The client runs on Windows as
> well
> > (although I doubt that's important).
>
> Hm. So that seems to indicate that, on windows, we're not properly
> recognizing dead sockets in the latch code. Could you check, IIRC with
> netstat or something like it, in what state the connections are?
>

netstat shows the socket is in FIN_WAIT_2.


> Any chance you could single-step through WaitLatchOrSocket() with a
> debugger? Without additional information this is rather hard to
> diagnose.
>

Uh I sure can, but I have no idea what to look for :) Anything specific?


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Hm. Is this with a self compiled postgres? If so, is it with assertions
> enabled?
>

No, it's just the EnterpriseDB 9.5rc1 installer...

Tom's probably right about the optimized code. I could try compiling a
debug version...


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
OK, I finally found some time to dive into this.

The backends seem to hang when the client closes a socket without first
sending a Terminate message - some of the tests make this happen. I've
confirmed this happens with 9.5rc1 running on Windows (versions 10 and 7),
but this does not occur on Ubuntu 15.10. The client runs on Windows as well
(although I doubt that's important).

In case it helps, here's a gist
<https://gist.github.com/roji/33df4e818c5d64a607aa> with some .NET code
that uses Npgsql 3.0.4 to reproduce this.

If there's anything else I can do please let me know.

Shay

On Wed, Dec 30, 2015 at 5:32 AM, Amit Kapila <amit.kapil...@gmail.com>
wrote:

>
>
> On Tue, Dec 29, 2015 at 7:04 PM, Shay Rojansky <r...@roji.org> wrote:
>
>>> Could you describe the workload a bit more? Is this rather concurrent? Do
>>> you use optimized or debug builds? How long did you wait for the
>>> backends to die? Is this all over localhost, external ip but local,
>>> remotely?
>>>
>>
>> The workload is a rather diverse set of integration tests executed with
>> Npgsql. There's no concurrency whatsoever - tests are executed serially.
>> The backends stay alive indefinitely, until they are killed. All this is
>> over localhost with TCP. I can try other scenarios if that'll help.
>>
>>
>
> What procedure do you use to kill backends?  Normally, if we kill
> via task manager using "End Process", it is considered as backend
> crash and the server gets restarted and all other backends got
> disconnected.
>
>
>> > Note that the number of backends that stay stuck after the tests is
>>> > constant (always 12).
>>>
>>> Can you increase the number of backends used in the test? And check
>>> whether it's still 12?
>>>
>>
>> Well, I ran the testsuite twice in parallel, and got... 23 backends stuck
>> at the end.
>>
>>
>>> How are your clients disconnecting? Possibly without properly
>>> disconnecting?
>>>
>>
>> That's possible, definitely in some of the test cases.
>>
>> What I can do is try to isolate things further by playing around with the
>> tests and trying to see if a more minimal repro can be done - I'll try
>> doing this later today or tomorrow. If anyone has any other specific tests
>> or checks I should do let me know.
>>
>
> I think first we should try to isolate whether the hanged backends
> are due to the reason that they are not disconnected properly or
> there is some other factor involved as well, so you can try to kill/
> disconnect the sessions connected via psql in the same way as
> you are doing for connections with Npgsql and see if you can
> reproduce the same behaviour.
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> > > Any chance you could single-step through WaitLatchOrSocket() with a
> > > debugger? Without additional information this is rather hard to
> > > diagnose.
> > >
> >
> > Uh I sure can, but I have no idea what to look for :) Anything
> > specific?
>
> Things that'd be interesting:
> 1) what are the arguments passed to WaitLatchOrSocket(), most
>importantly wakeEvents and sock
> 2) are we busy looping, or is WaitForMultipleObjects() blocking
>endlessly
> 3) If you kill -9 (well, terminate in the task manager) a client, while
>stepping serverside in WaitLatchOrSocket, does
>WaitForMultipleObjects() return? If so, what paths are we taking?
>

The process definitely isn't busy looping - zero CPU usage.

I'll try to set up debugging, it may take some time though (unfamiliar with
PostgreSQL internals and Windows debugging techniques).


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Things that'd be interesting:
> 1) what are the arguments passed to WaitLatchOrSocket(), most
>importantly wakeEvents and sock
>

wakeEvents is 8387808 and so is sock.

Tom, this bug doesn't occur with 9.4.4 (will try to download 9.4.5 and
test).


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Are we sure this is a 9.5-only bug?  Shay, can you try 9.4 branch tip
> and see if it misbehaves?  Can anyone else reproduce the problem?
>
>
Doesn't occur with 9.4.5 either. The first version I tested which exhibited
this was 9.5beta2.


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Is this in a backend with ssl?
>

No.

> If you go up one frame, what value does port->sock have?
>

For some reason VS is telling me "Unable to read memory" on port->sock... I
have no idea why that is...


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-29 Thread Shay Rojansky
>
> > The tests run for a couple minutes, open and close some connections. With
> > my pre-9.5 backends, the moment the test runner exits I can see that all
> > backend processes exit immediately, and pg_stat_activity has no rows
> > (except the querying one). With 9.5beta2, however, some backend processes
> > continue to stay alive beyond the test runner, and pg_stat_activity
> > contains extra rows (state idle, waiting false). This situation persists
> > until I restart PostgreSQL.
>
> No idea what's happening, but a couple of questions:
>
> * Are you using SSL connections?
>
> * Can you get stack traces from the seemingly-stuck backends?
>

Most of my tests don't use SSL but some do. Looking at the query field in
pg_stat_activity I can see queries that don't seem to originate from SSL
tests.

Note that the number of backends that stay stuck after the tests is
constant (always 12).

Here are stack dumps of the same process taken with both VS2015 Community
and Process Explorer; I went over 4 processes and saw the same thing. Let
me know what else I can provide to help.

From VS2015 Community:

Main Thread
> ntdll.dll!NtWaitForMultipleObjects() Unknown
  KernelBase.dll!WaitForMultipleObjectsEx() Unknown
  KernelBase.dll!WaitForMultipleObjects() Unknown
  postgres.exe!WaitLatchOrSocket(volatile Latch * latch, int wakeEvents,
unsigned __int64 sock, long timeout) Line 202 C
  postgres.exe!secure_read(Port * port, void * ptr, unsigned __int64 len)
Line 151 C
  postgres.exe!pq_getbyte() Line 926 C
  postgres.exe!SocketBackend(StringInfoData * inBuf) Line 345 C
  postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname,
const char * username) Line 3984 C
  postgres.exe!BackendRun(Port * port) Line 4236 C
  postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4727 C
  postgres.exe!main(int argc, char * * argv) Line 211 C
  postgres.exe!__tmainCRTStartup() Line 626 C
  kernel32.dll!BaseThreadInitThunk() Unknown
  ntdll.dll!RtlUserThreadStart() Unknown

Worker Thread
> ntdll.dll!NtWaitForWorkViaWorkerFactory() Unknown
  ntdll.dll!TppWorkerThread() Unknown
  kernel32.dll!BaseThreadInitThunk() Unknown
  ntdll.dll!RtlUserThreadStart() Unknown

Worker Thread
> ntdll.dll!NtFsControlFile() Unknown
  KernelBase.dll!ConnectNamedPipe() Unknown
  postgres.exe!pg_signal_thread(void * param) Line 279 C
  kernel32.dll!BaseThreadInitThunk() Unknown
  ntdll.dll!RtlUserThreadStart() Unknown

Worker Thread
> ntdll.dll!NtWaitForSingleObject() Unknown
  KernelBase.dll!WaitForSingleObjectEx() Unknown
  postgres.exe!pg_timer_thread(void * param) Line 49 C
  kernel32.dll!BaseThreadInitThunk() Unknown
  ntdll.dll!RtlUserThreadStart() Unknown

From Process Explorer (slightly different):

ntoskrnl.exe!KeSynchronizeExecution+0x3de6
ntoskrnl.exe!KeWaitForSingleObject+0xc7a
ntoskrnl.exe!KeWaitForSingleObject+0x709
ntoskrnl.exe!KeWaitForSingleObject+0x375
ntoskrnl.exe!IoQueueWorkItem+0x370
ntoskrnl.exe!KeRemoveQueueEx+0x16ba
ntoskrnl.exe!KeWaitForSingleObject+0xe8e
ntoskrnl.exe!KeWaitForSingleObject+0x709
ntoskrnl.exe!KeWaitForMultipleObjects+0x24e
ntoskrnl.exe!ObWaitForMultipleObjects+0x2bd
ntoskrnl.exe!IoWMIRegistrationControl+0x2402
ntoskrnl.exe!setjmpex+0x3943
ntdll.dll!NtWaitForMultipleObjects+0x14
KERNELBASE.dll!WaitForMultipleObjectsEx+0xef
KERNELBASE.dll!WaitForMultipleObjects+0xe
postgres.exe!WaitLatchOrSocket+0x243
postgres.exe!secure_read+0xb0
postgres.exe!pq_getbyte+0xec
postgres.exe!get_stats_option_name+0x392
postgres.exe!PostgresMain+0x537
postgres.exe!ShmemBackendArrayAllocation+0x2a6a
postgres.exe!SubPostmasterMain+0x273
postgres.exe!main+0x480
postgres.exe!pgwin32_popen+0x130b
KERNEL32.DLL!BaseThreadInitThunk+0x22
ntdll.dll!RtlUserThreadStart+0x34


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-29 Thread Shay Rojansky
>
> Could you describe the workload a bit more? Is this rather concurrent? Do
> you use optimized or debug builds? How long did you wait for the
> backends to die? Is this all over localhost, external ip but local,
> remotely?
>

The workload is a rather diverse set of integration tests executed with
Npgsql. There's no concurrency whatsoever - tests are executed serially.
The backends stay alive indefinitely, until they are killed. All this is
over localhost with TCP. I can try other scenarios if that'll help.


> > Note that the number of backends that stay stuck after the tests is
> > constant (always 12).
>
> Can you increase the number of backends used in the test? And check
> whether it's still 12?
>

Well, I ran the testsuite twice in parallel, and got... 23 backends stuck
at the end.


> How are your clients disconnecting? Possibly without properly
> disconnecting?
>

That's possible, definitely in some of the test cases.

What I can do is try to isolate things further by playing around with the
tests and trying to see if a more minimal repro can be done - I'll try
doing this later today or tomorrow. If anyone has any other specific tests
or checks I should do let me know.


[HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-28 Thread Shay Rojansky
After setting up 9.5beta2 on the Npgsql build server and running the Npgsql
test suite against it, I've noticed some weird behavior.

The tests run for a couple minutes, open and close some connections. With my
pre-9.5 backends, the moment the test runner exits I can see that all
backend processes exit immediately, and pg_stat_activity has no rows
(except the querying one). With 9.5beta2, however, some backend processes
continue to stay alive beyond the test runner, and pg_stat_activity
contains extra rows (state idle, waiting false). This situation persists
until I restart PostgreSQL.

This happens consistently on two machines, running Windows 7 and Windows
10. Both client and server are on the same machine and use TCP to
communicate. I can investigate further and try to produce a more isolated
repro but I thought I'd talk to you guys first.

Any thoughts or ideas on what might cause this? Any suggestions for
tracking this down?

Shay


Re: [HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-24 Thread Shay Rojansky
>
> > Here's a patch that adds back the GUC, with default/min/max 0 and
> > GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE.
> >
> > This is my first pg patch, please be gentle with any screwups :)
>
> Why, you dummy.
>
> No, actually, this looks fine.  I've committed it and back-patched it
> to 9.5.  I took the liberty of making some cosmetic changes, however.
>
> Thanks for preparing this patch.
>

Thanks for accepting it! It will definitely save some trouble.


Re: [HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-18 Thread Shay Rojansky
Here's a patch that adds back the GUC, with default/min/max 0
and GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE.

This is my first pg patch, please be gentle with any screwups :)
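
For reference, the guc.c entry looks roughly like this (a sketch of the
approach only; the committed version may differ cosmetically, and the
description string is paraphrased):

{
    {"ssl_renegotiation_limit", PGC_USERSET, CONN_AUTH_SECURITY,
        gettext_noop("SSL renegotiation is no longer supported; "
                     "this can only be 0."),
        NULL,
        GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
    },
    &ssl_renegotiation_limit,   /* dummy variable kept only for compatibility */
    0, 0, 0,                    /* default/min/max all 0 */
    NULL, NULL, NULL
},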


patch_tolerate_ssl_renegotiation_limit_zero
Description: Binary data



Re: [HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-16 Thread Shay Rojansky
>
> > > If not, the only solution I can see is for PostgreSQL to not protest
> > > if it sees the parameter in the startup packet.
> > >
> >
> > Yeah, that's the ideal solution here as far as I'm concerned.
>
> Well, it seems that's where we're ending up then. Could you prepare a
> patch?
>

Yes, will do so in the coming days, thanks!


Re: [HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-16 Thread Shay Rojansky
>
> As far as I remember, that was introduced because of renegotiation bugs
> with Mono:
> http://lists.pgfoundry.org/pipermail/npgsql-devel/2010-February/001074.html
> http://fxjr.blogspot.co.at/2010/03/ssl-renegotiation-patch.html
>
> Of course, with renegotiation disabled, nobody knows whether there would
> also have been renegotiation
> problems since Npgsql switched from Mono SSL to .NET SSL ...
>

You may be right (this was before my time on Npgsql). But since PostgreSQL
is dropping renegotiation on its side it doesn't really make sense to dive
into this and find out if it's still buggy or not...

> Maybe Npgsql could switch to sending the statement after the connection
> has been established and resending it after each RESET ALL?
> You could add documentation that people should not use RESET ALL with
> Npgsql unless they
> are ready to disable renegotiation afterwards.
>

That's certainly possible, although it seems problematic to prohibit RESET
ALL because of an issue like this. It's a useful feature that users may
want to use as they manipulate parameters, that's why PostgreSQL has it in
the first place...

I also prefer not to rely on documentation warnings which people may miss,
especially in this case where the consequences of accidentally turning on
renegotiation with a buggy stack would be extremely difficult to track and
debug...

> If not, the only solution I can see is for PostgreSQL to not protest if it
> sees the parameter in the startup packet.
>

Yeah, that's the ideal solution here as far as I'm concerned.


[HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-14 Thread Shay Rojansky
Hi hackers.

I noticed ssl_renegotiation_limit has been removed in PostgreSQL 9.5, good
riddance...

However, this creates a new situation where some versions of PG allow this
parameter while others bomb when seeing it. Specifically, Npgsql sends
ssl_renegotiation_limit=0 in the startup packet to completely disable
renegotiation. At this early stage it doesn't know yet whether the database
it's connecting to is PG 9.5 or earlier.

Is there any chance you'd consider allowing ssl_renegotiation_limit in PG
9.5, without it having any effect (I think that's the current behavior for
recent 9.4, 9.3, right)? It may be a good idea to only allow this parameter
to be set to zero, raising an error otherwise.

The alternative would be to force users to specify in advance whether the
database they're connecting to supports this parameter, or to send it after
the startup packet which complicates things etc.

Thanks,

Shay


Re: [HACKERS] Allow ssl_renegotiation_limit in PG 9.5

2015-10-14 Thread Shay Rojansky
Just to give some added reasoning...

As Andres suggested, Npgsql sends ssl_renegotiation_limit=0 because we've
seen renegotiation bugs with the standard .NET SSL implementation (which
Npgsql uses). Seems like everyone has a difficult time with renegotiation.

As Tom suggested, it gets sent in the startup packet so that DISCARD/RESET
ALL resets back to 0 and not to the default 512MB (in older versions).
Npgsql itself issues DISCARD ALL in its connection pool implementation to
reset the connection to its original opened state, and of course users may
want to do it themselves. Having SSL renegotiation accidentally turned on
because a user sent RESET ALL, when the SSL implementation is known to have
issues, is something to be avoided...

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
Thanks for the help Tom and the others, I'll modify my sequence and report
if I encounter any further issues.

On Sun, Oct 4, 2015 at 7:36 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Shay Rojansky <r...@roji.org> writes:
> >> To my mind there is not a lot of value in performing Bind until you
> >> are ready to do Execute.  The only reason the operations are separated
> >> in the protocol is so that you can do multiple Executes with a row limit
> >> on each one, to retrieve a large query result in chunks.
>
> > So you would suggest changing my message chain to send Bind right before
> > Execute, right? This would yield the following messages:
>
> > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)
>
> > This would mean that I would switch to using named statements and the
> > unnamed portal, rather than the current unnamed statement
> > and named portals. If I recall correctly, I was under the impression that
> > there are some PostgreSQL performance benefits to using the
> > unnamed statement over named statements, although I admit I can't find
> any
> > documentation backing that. Can you confirm that the two
> > are equivalent performance-wise?
>
> Hmm.  I do not recall exactly what performance optimizations apply to
> those two cases; they're probably not "equivalent", though I do not think
> the difference is major in either case.  TBH I was a bit surprised on
> reading your message to hear that the system would take that sequence at
> all; it's not obvious that it should be allowed to replace a statement,
> named or not, while there's an open portal that depends on it.
>
> I think you might have more issues with lifespans, since portals go away
> at commit whereas named statements don't.
>
> regards, tom lane
>


[HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
Hi hackers, some odd behavior has been reported with Npgsql and I wanted to
get your help.

Npgsql supports sending multiple SQL statements in a single packet via the
extended protocol. This works fine, but when the second query SELECTs a
value modified by the first's UPDATE, I'm getting a result as if the UPDATE
hasn't yet occurred.

The exact messages sent by Npgsql are:

Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ0)
Parse (SELECT * FROM data WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ1)
Execute (portal=MQ0)
Close (portal=MQ0)
Execute (portal=MQ1)
Close (portal=MQ1)
Sync

Instead of returning the expected 'foo' value set in the first command's
UPDATE, I'm getting whatever value was previously there.
Note that this happens regardless of whether a transaction is already in
progress and regardless of the isolation level.

Is this the expected behavior, or have I misunderstood the protocol specs?

Thanks for your help, and please let me know if you need any more info.

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
>
> > So you would suggest changing my message chain to send Bind right before
> > Execute, right? This would yield the following messages:
>
> > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)
>
> > This would mean that I would switch to using named statements and the
> > unnamed portal, rather than the current unnamed statement
> > and named portals. If I recall correctly, I was under the impression that
> > there are some PostgreSQL performance benefits to using the
> > unnamed statement over named statements, although I admit I can't find
> any
> > documentation backing that. Can you confirm that the two
> > are equivalent performance-wise?
>
> Hmm.  I do not recall exactly what performance optimizations apply to
> those two cases; they're probably not "equivalent", though I do not think
> the difference is major in either case.  TBH I was a bit surprised on
> reading your message to hear that the system would take that sequence at
> all; it's not obvious that it should be allowed to replace a statement,
> named or not, while there's an open portal that depends on it.
>

One more important piece of information...

The reason Npgsql currently sends P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S is to
avoid deadlocks; I've already discussed this with you in
http://www.postgresql.org/message-id/cadt4rqb+fbtqpte5ylz0hkb-2k-ngzhm2ybvj0tmc8rqbgf...@mail.gmail.com
.

Unfortunately, the alternative I proposed above, P1/P2/D1/B1/E1/D2/B2/E2/S,
suffers from the same issue: any sequence in which a Bind is sent after a
previous Execute is deadlock-prone - Execute causes PostgreSQL to start
writing a potentially large dataset, while Bind means the client may be
writing a potentially large parameter value.

In other words, unless I'm mistaken it seems there's no alternative but to
implement non-blocking I/O at the client side - write until writing would
block, switching to reading when that happens. This adds some substantial
complexity, especially with .NET's SSL/TLS implementation layer.
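
In plain C socket terms, the pattern would be something like this (a rough
sketch assuming a non-blocking socket; buffer_incoming() is a hypothetical
helper that stashes incoming data for later parsing, and EAGAIN handling is
elided - the hard part in .NET is doing all of this through the TLS layer):

#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>

static void buffer_incoming(int sock);  /* hypothetical: read and stash */

/* Send 'len' bytes without ever blocking on a full send buffer: whenever
 * the socket isn't writable, drain whatever the server has sent so far. */
static int
send_draining(int sock, const char *buf, size_t len)
{
    size_t  sent = 0;

    while (sent < len)
    {
        fd_set  rfds, wfds;

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        FD_ZERO(&wfds);
        FD_SET(sock, &wfds);

        if (select(sock + 1, &rfds, &wfds, NULL, NULL) < 0)
            return -1;

        if (FD_ISSET(sock, &wfds))
        {
            ssize_t n = send(sock, buf + sent, len - sent, 0);

            if (n < 0)
                return -1;
            sent += (size_t) n;
        }
        if (FD_ISSET(sock, &rfds))
            buffer_incoming(sock);
    }
    return 0;
}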

Or does anyone see some sort of alternative which I've missed?


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> > Npgsql supports sending multiple SQL statements in a single packet via
> > the extended protocol. This works fine, but when the second query SELECTs
> > a value modified by the first's UPDATE, I'm getting a result as if the
> > UPDATE hasn't yet occurred.
>
> Looks like the first updating statement is not committed, assuming that
> the two statements run in different transactions.
>

I did try to prefix the message chain with an explicit transaction BEGIN
(with several different isolation levels) without any difference in
behavior.

> > The exact messages sent by Npgsql are:
> >
> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ0)
> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ1)
> > Execute (portal=MQ0)
> > Close (portal=MQ0)
> > Execute (portal=MQ1)
> > Close (portal=MQ1)
> > Sync
>
> I never used Npgsql so I don't know if there is something missing there.
> Would you need an explicit commit before closing MQ0?
>

I guess this is exactly my question to PostgreSQL... But unless I'm
misunderstanding the transaction semantics I shouldn't need to commit the
first UPDATE in order to see its effect in the second SELECT...

> Also I am not in clear what "statement=unnamed" means, but it is used
> twice. Is it possible that the update is overwritten with select before it
> executes?
>

statement=unnamed means that the destination statement is the unnamed
prepared statement (as described in
http://www.postgresql.org/docs/current/static/protocol-message-formats.html).
Right after the Parse I bind the unnamed statement which I just parsed to
portal MQ0. In other words, Npgsql first parses the two queries and binds
them to portals MQ0 and MQ1, and only then executes both portals.

> BTW: Do you see the change after update in your DB if you look into it with
> another tool (e.g. psql)?
>

That's a good suggestion, I'll try to check it out, thanks!


[HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
Hi hackers, some odd behavior has been reported with Npgsql and I'm sure
you can help.

Npgsql supports sending multiple SQL statements in a single packet via the
extended protocol. This works fine, but when the second query SELECTs a
value modified by the first's UPDATE, I'm getting a result as if the UPDATE
hasn't yet occurred.

The exact messages sent by Npgsql are:

Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ0)
Parse (SELECT * FROM data WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ1)
Execute (portal=MQ0)
Close (portal=MQ0)
Execute (portal=MQ1)
Close (portal=MQ1)
Sync

Instead of returning the expected 'foo' value set in the first command's
UPDATE, I'm getting whatever value was previously there.
Note that this happens regardless of whether a transaction is already in
progress and regardless of the isolation level.

Is this the expected behavior, or have I misunderstood the protocol specs?

Thanks for your help, and please let me know if you need any more info.

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> I'm fairly sure that the query snapshot is established at Bind time,
> which means that this SELECT will run with a snapshot that indeed
> does not see the effects of the UPDATE.
>
> To my mind there is not a lot of value in performing Bind until you
> are ready to do Execute.  The only reason the operations are separated
> in the protocol is so that you can do multiple Executes with a row limit
> on each one, to retrieve a large query result in chunks.
>

So you would suggest changing my message chain to send Bind right before
Execute, right? This would yield the following messages:

P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)

This would mean that I would switch to using named statements and the
unnamed portal, rather than the current unnamed statement
and named portals. If I recall correctly, I was under the impression that
there are some PostgreSQL performance benefits to using the
unnamed statement over named statements, although I admit I can't find any
documentation backing that. Can you confirm that the two
are equivalent performance-wise?

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> Try adding a sync before the second execute.
>

I tried inserting a Sync right before the second Execute; this caused an
error with the message 'portal "MQ1" does not exist'.
This seems like problematic behavior on its own, regardless of my issues
here (Sync shouldn't be causing an implicit close of the portal, should
it?).


Re: [HACKERS] Odd/undocumented precedence of concatenation operator

2015-09-09 Thread Shay Rojansky
>
> It is expected, and documented.  (It's also different in 9.5, see
>
> http://git.postgresql.org/gitweb/?p=postgresql.git=commitdiff=c6b3c939b7e0f1d35f4ed4996e71420a993810d2
> )
>

Ah, thanks!


> > If nothing else, it seems that the concatenation operator should be
> listed
> > on the operator precedence table at
> >
> http://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html#SQL-PRECEDENCE-TABLE
>
> Both >= and || fall into the "any other operator" case, no?
>

I somehow missed that in the table, assuming that >= would be somewhere
with > and =. Thanks again.


[HACKERS] Odd/undocumented precedence of concatenation operator

2015-09-08 Thread Shay Rojansky
Hi hackers.

Trying to execute the following query on PostgreSQL 9.4.4:

select 'a' >= 'b' || 'c';

Gives the result "falsec", implying that the precedence of the string
concatenation operator is lower than that of the comparison operator.
Changing the >= into = gives the result false, which is less surprising.

Is this the expected behavior, considering that >= and = behave differently
and that + ranks much higher?

If nothing else, it seems that the concatenation operator should be listed
on the operator precedence table at
http://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html#SQL-PRECEDENCE-TABLE
?

Thanks!

Shay


Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-11 Thread Shay Rojansky
Thanks (once again!) for the valuable suggestions Robert.

The idea of chunking/buffering via cursors has been raised before for
another purpose - allowing multiple queries concurrently at the API level
(where concurrently means interleaving when reading the resultsets). This
would imply exposing the number of rows fetched to the user like you
suggested. However, I don't think there's a way we can remove the API
option to *not* buffer (as I said before, ADO.NET even provides a standard
API feature for reading column-by-column), and therefore the general
problem remains...

I think the right solution for us at the driver level would be to switch to
driver-enforced timeouts, i.e. to no longer use statement_timeout but look
at socket read times instead. I'll look into doing that for our next
version.
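
In libpq terms (just to sketch the idea - Npgsql has its own I/O layer), a
per-network-read timeout would look something like this:

#include <sys/select.h>
#include <libpq-fe.h>

/* Run a query, failing if the server ever goes 'timeout_secs' without
 * sending us anything; the timeout clock resets on every read. */
static int
exec_with_read_timeout(PGconn *conn, const char *query, int timeout_secs)
{
    if (!PQsendQuery(conn, query))
        return -1;

    while (PQisBusy(conn))
    {
        int     sock = PQsocket(conn);
        fd_set  rfds;
        struct timeval tv = {timeout_secs, 0};

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);

        /* Nothing readable within the timeout: give up (real code would
         * probably send a cancel request at this point). */
        if (select(sock + 1, &rfds, NULL, NULL, &tv) <= 0)
            return -1;

        if (!PQconsumeInput(conn))
            return -1;
    }

    PQclear(PQgetResult(conn)); /* real code would loop over all results */
    return 0;
}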

Thanks for all your thoughts!

On Mon, Aug 10, 2015 at 2:30 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 10, 2015 at 5:25 AM, Shay Rojansky r...@roji.org wrote:
  Thanks for the explanation Robert, that makes total sense. However, it
 seems
  like the utility of PG's statement_timeout is much more limited than I
  thought.
 
  In case you're interested, I dug a little further and it seems that
  Microsoft's client for SQL Server implements the following timeout
 (source):
 
  cumulative time-out (for all network packets that are read during the
  invocation of a method) for all network reads during command execution or
  processing of the results. A time-out can still occur after the first
 row is
  returned, and does not include user processing time, only network read
 time.
 
  Since it doesn't seem possible to have a clean query-processing-only
 timeout
  at the backend, we may be better off doing something similar to the above
  and enforce timeouts on the client only. Any further thoughts on this
 would
  be appreciated.

 An alternative you may want to consider is using the Execute message
 with a non-zero row count and reading all of the returned rows as they
 come back, buffering them in memory.  When those have all been
 consumed, issue another Execute message and get some more rows.

 AFAICS, the biggest problem with this is that there's no good way to
 bound the number of rows returned by size rather than by number, which
 has been complained about before by somebody else in a situation
 similar to yours.  Another problem is that I believe it will cause
 cursor_tuple_fraction to kick in, which may change query plans.  But
 it does have the advantage that the query will be suspended from the
 server's point of view, which I *think* will toll statement_timeout.

 You might also consider exposing some knobs to the user, so that they
 can set the number of rows fetched in one go, and let that be all the
 rows or only some of them.

 We really need a better way of doing this, but I think this is the way
 other drivers are handling it now.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-10 Thread Shay Rojansky
Thanks for the explanation Robert, that makes total sense. However, it
seems like the utility of PG's statement_timeout is much more limited than
I thought.

In case you're interested, I dug a little further and it seems that
Microsoft's client for SQL Server implements the following timeout (source
https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.commandtimeout%28v=vs.110%29.aspx?f=255MSPPError=-2147217396
):

cumulative time-out (for all network packets that are read during the
invocation of a method) for all network reads during command execution or
processing of the results. A time-out can still occur after the first row
is returned, and does not include user processing time, only network read
time.

Since it doesn't seem possible to have a clean query-processing-only
timeout at the backend, we may be better off doing something similar to the
above and enforce timeouts on the client only. Any further thoughts on this
would be appreciated.

On Sun, Aug 9, 2015 at 5:21 PM, Robert Haas robertmh...@gmail.com wrote:

 On Sat, Aug 8, 2015 at 11:30 AM, Shay Rojansky r...@roji.org wrote:
  the entire row in memory (imagine rows with megabyte-sized columns). This
  makes sense to me; Tom, doesn't the libpq behavior you describe of
 absorbing
  the result set as fast as possible mean that a lot of memory is wasted on
  the client side?

 It sure does.

  I can definitely appreciate the complexity of changing this behavior. I
  think that some usage cases (such as mine) would benefit from a timeout
 on
  the time until the first row is sent, this would allow to put an upper
 cap
  on stuff like query complexity, for example.

 Unfortunately, it would not do any such thing.  It's possible for the
 first row to be returned really really fast and then for an arbitrary
 amount of time to pass and computation to happen before all the rows
 are returned.  A plan can have a startup cost of zero and a run cost
 of a billion (or a trillion).  This kind of scenario isn't even
 particularly uncommon.  You just need a plan that looks like this:

 Nested Loop
 - Nested Loop
   - Nested Loop
 - Seq Scan
 - Index Scan
   - Index Scan
 - Index Scan

 You can just keep pushing more nested loop/index scans on there and
 the first row will still pop out quite fast.  But if the seq-scanned
 table is large, generating the whole result set can take a long, long
 time.

 Even worse, you could have a case like this:

 SELECT somefunc(a) FROM foo;

 Now suppose somefunc is normally very quick, but if a = 42 then it
 does pg_sleep() in a loop until the world ends.   You're going to have
 however many rows of foo have a != 42 pop out near-instantaneously,
 and then it will go into the tank and not come out until the meaning
 of life, the universe, and everything is finally revealed.

 That second case is a little extreme, and a little artificial, but the
 point is this: just as you don't want the client to have to buffer the
 results in memory, the server doesn't either.  It's not the case that
 the server computes the rows and sends them to you.  Each one is sent
 to you as it is computed, and in the normal case, at the time the
 first row is sent, only a small percentage of the total work of the
 query has been performed.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



[HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
Hi everyone, I'm seeing some strange behavior and wanted to confirm it.

When executing a query that selects a long result set, if the code
processing the results takes its time (e.g. more than statement_timeout),
a timeout occurs. My expectation was that statement_timeout only affects
query *processing*, and does not cover the frontend actually processing the
result.

First, I wanted to confirm that this is the case (and not some sort of bug
in my code).

If this is the case, is this intended? Here are some points:
* It makes statement_timeout very difficult to use: clients do very diverse
things with database results, and it may be very difficult (or impossible) to
estimate how much time they should take (e.g. random load factors on the
client machine or on some other machine receiving results).
* It makes it impossible to specifically detect problematic *queries* which
take too long to execute (as opposed to the time taken to process their
results).

If you do insist that this behavior is correct, a documentation update for
statement_timeout might make this clearer.

Thanks,

Shay


Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
Thanks for your responses.

I'm not using cursors or anything fancy. The expected behavior (as far as I
can tell) for a .NET database driver is to read one row at a time from the
database and make it available. There's even a standard API option for
fetching data on a column-by-column basis: this allows the user to not hold
the entire row in memory (imagine rows with megabyte-sized columns). This
makes sense to me; Tom, doesn't the libpq behavior you describe of
absorbing the result set as fast as possible mean that a lot of memory is
wasted on the client side? I'd be interested in your take on this.

I can definitely appreciate the complexity of changing this behavior. I
think that some usage cases (such as mine) would benefit from a timeout on
the time until the first row is sent, this would allow to put an upper cap
on stuff like query complexity, for example.

Shay

On Sat, Aug 8, 2015 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
  When executing a query that selects a long result set, if the code
  processing the results takes its time (i.e.g more than
 statement_timeout),
  a timeout occurs. My expectation was that statement_timeout only affects
  query *processing*, and does not cover the frontend actually processing
 the
  result.

 Are you using a cursor, or something like that?

 libpq ordinarily absorbs the result set as fast as possible and then hands
 it back to the application as one blob; the time subsequently spent by the
 application looking at the blob doesn't count against statement_timeout.

 As Robert says, statement_timeout *does* include time spent sending the
 result set to the client, and we're not going to change that, because it
 would be too hard to disentangle calculation from I/O.  So if the client
 isn't prompt about absorbing all the data then you have to factor that
 into your setting.  But it would be a slightly unusual usage pattern
 AFAIK.

 regards, tom lane



Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
I'd also recommend adding a sentence about this aspect of statement_timeout
in the docs to prevent confusion...

On Sat, Aug 8, 2015 at 5:30 PM, Shay Rojansky r...@roji.org wrote:

 Thanks for your responses.

 I'm not using cursors or anything fancy. The expected behavior (as far as
 I can tell) for a .NET database driver is to read one row at a time from
 the database and make it available. There's even a standard API option for
 fetching data on a column-by-column basis: this allows the user to not hold
 the entire row in memory (imagine rows with megabyte-sized columns). This
 makes sense to me; Tom, doesn't the libpq behavior you describe of
 absorbing the result set as fast as possible mean that a lot of memory is
 wasted on the client side? I'd be interested in your take on this.

 I can definitely appreciate the complexity of changing this behavior. I
 think that some usage cases (such as mine) would benefit from a timeout on
 the time until the first row is sent, this would allow to put an upper cap
 on stuff like query complexity, for example.

 Shay

 On Sat, Aug 8, 2015 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
  When executing a query that selects a long result set, if the code
  processing the results takes its time (i.e.g more than
 statement_timeout),
  a timeout occurs. My expectation was that statement_timeout only affects
  query *processing*, and does not cover the frontend actually processing
 the
  result.

 Are you using a cursor, or something like that?

 libpq ordinarily absorbs the result set as fast as possible and then hands
 it back to the application as one blob; the time subsequently spent by the
 application looking at the blob doesn't count against statement_timeout.

 As Robert says, statement_timeout *does* include time spent sending the
 result set to the client, and we're not going to change that, because it
 would be too hard to disentangle calculation from I/O.  So if the client
 isn't prompt about absorbing all the data then you have to factor that
 into your setting.  But it would be a slightly unusual usage pattern
 AFAIK.

 regards, tom lane





Re: [HACKERS] Encoding of early PG messages

2015-08-01 Thread Shay Rojansky
Oh sorry, I think I misunderstood your suggestion - setting lc_messages in
the startup packet wouldn't work any more than setting client_encoding,
would it? So any solution here would be on the database/backend side, and
so irrelevant for a general-purpose driver...

On Fri, Jul 31, 2015 at 4:28 PM, Shay Rojansky r...@roji.org wrote:

 Thanks for the suggestions Tom.

 As I'm developing a general-purpose driver I can't do anything in
 PostgreSQL config, but it's a good workaround suggestion for users who
 encounter this error.

 Sending lc_messages in the startup packet could work, but if I understand
 correctly that setting combines both encoding and language. I guess I can
 look at the user's locale preference on the client machine, try to
 translate that into a PostgreSQL language/encoding and send that in
 lc_messages - that seems like it might work.

 Shay


 On Fri, Jul 31, 2015 at 3:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Developing Npgsql I've encountered the problem described in
 
 http://www.postgresql.org/message-id/20081223212414.gd3...@merkur.hilbert.loc:
  a German installation of PostgreSQL seems to respond to an incorrect
  password with a non-UTF8 encoding of the error messages, even if the
  startup message contains client_encoding=UTF8.

 I wouldn't hold your breath waiting for that to change.

 A possible workaround is to run the postmaster with lc_messages=C and
 then switch to your desired message language per-session.  It would
 certainly work to send lc_messages along with client_encoding in the
 startup packet; or possibly you could set those settings as per-database
 or per-role settings to avoid needing to teach the application code
 about it.  This would mean that bad-password and similar errors would
 come out in English, but at least they'd be validly encoded ...

 regards, tom lane
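
A sketch of the per-role half of Tom's suggestion (role name is a
placeholder; lc_messages is a superuser-only setting, so a superuser must
run this once, and the postmaster-level lc_messages=C still covers the
early bad-password errors):

    using Npgsql;

    static void PinMessageLocale()
    {
        using (var conn = new NpgsqlConnection(
            "Host=localhost;Database=postgres;Username=postgres"))
        {
            conn.Open();
            // app_user's sessions now get C/English messages once logged in,
            // but at least they are validly encoded
            using (var cmd = new NpgsqlCommand(
                "ALTER ROLE app_user SET lc_messages = 'C'", conn))
                cmd.ExecuteNonQuery();
        }
    }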





Re: [HACKERS] Encoding of early PG messages

2015-07-31 Thread Shay Rojansky
Thanks for the suggestions Tom.

As I'm developing a general-purpose driver I can't do anything in
PostgreSQL config, but it's a good workaround suggestion for users who
encounter this error.

Sending lc_messages in the startup packet could work, but if I understand
correctly that setting combines both encoding and language. I guess I can
look at the user's locale preference on the client machine, try to
translate that into a PostgreSQL language/encoding and send that in
lc_messages - that seems like it might work.

Shay

On Fri, Jul 31, 2015 at 3:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Developing Npgsql I've encountered the problem described in
 
 http://www.postgresql.org/message-id/20081223212414.gd3...@merkur.hilbert.loc:
  a German installation of PostgreSQL seems to respond to an incorrect
  password with a non-UTF8 encoding of the error messages, even if the
  startup message contains client_encoding=UTF8.

 I wouldn't hold your breath waiting for that to change.

 A possible workaround is to run the postmaster with lc_messages=C and
 then switch to your desired message language per-session.  It would
 certainly work to send lc_messages along with client_encoding in the
 startup packet; or possibly you could set those settings as per-database
 or per-role settings to avoid needing to teach the application code
 about it.  This would mean that bad-password and similar errors would
 come out in English, but at least they'd be validly encoded ...

 regards, tom lane



[HACKERS] Encoding of early PG messages

2015-07-31 Thread Shay Rojansky
Hi hackers.

Developing Npgsql I've encountered the problem described in
http://www.postgresql.org/message-id/20081223212414.gd3...@merkur.hilbert.loc:
a German installation of PostgreSQL seems to respond to an incorrect
password with a non-UTF8 encoding of the error messages, even if the
startup message contains client_encoding=UTF8.

It seems that the aforementioned thread didn't lead to a fix (either to
respect client_encoding even for early messages, or to switch to ASCII
English errors for these messages). Am I missing something here or is this
the current situation in PG? Any plans for some sort of fix?

I can work around this by using relaxed decoding (i.e. not failing on
non-UTF8 characters) but this means that error messages are partially or
totally garbled...

Thanks,

Shay


Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-14 Thread Shay Rojansky
Tom, I agree this is entirely a client-side issue. Regardless, as Simon
says it would be useful to have some documentation for client implementors.

Sehrope, thanks for the JDBC link! I was actually thinking of going about
it another way in Npgsql:

   1. Send messages normally until the first Execute message is sent.
   2. From that point on, socket writes should simply be non-blocking. As
   long as buffers aren't full, there's no issue, we continue writing. The
   moment a non-blocking write exits because it would block, we transfer
   control to the user, who can now read data from queries (the ADO.NET API
   allows for multiple resultsets).
   3. When the user finishes processing the resultsets, control is
   transferred back to Npgsql which continues sending messages (back to step
   1).

This approach has the advantage of not caring about buffer sizes or trying
to assume how many bytes are sent by the server: we simply write as much as
we can without blocking, then switch to reading until we've exhausted
outstanding data, and back to writing. The main issue I'm concerned about
is SSL/TLS, which is a layer on top of the sockets and which might not work
well with non-blocking sockets...

Any comments?

Shay

On Sat, Jun 13, 2015 at 5:08 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 12 June 2015 at 20:06, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:
  On 11 June 2015 at 22:12, Shay Rojansky r...@roji.org wrote:
  Just in case it's interesting to you... The reason we implemented things
  this way is in order to avoid a deadlock situation - if we send two
  queries as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large
  resultset, PostgreSQL may block writing the resultset, since Npgsql isn't
  reading it at that point. Npgsql on its part may get stuck writing the
  second query (if it's big enough) since PostgreSQL isn't reading on its
  end (thanks to Emil Lenngren for pointing this out originally).

  That part does sound like a problem that we have no good answer to.
  Sounds worth starting a new thread on that.

 I do not accept that the backend needs to deal with that; it's the
 responsibility of the client side to manage buffering properly if it is
 trying to overlap sending the next query with receipt of data from a
 previous one.  See commit 2a3f6e368 for a related issue in libpq.


 Then it's our responsibility to define what manage buffering properly
 means and document it.

 People should be able to talk to us without risk of deadlock.

 --
 Simon Riggs                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
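
A sketch of step 2 of Shay's scheme above, using a raw .NET Socket (the
helper name is invented); this Send overload reports SocketError.WouldBlock
instead of throwing when a non-blocking socket cannot accept more data:

    using System.Net.Sockets;

    static int WriteUntilWouldBlock(Socket socket, byte[] buf, int offset, int count)
    {
        socket.Blocking = false;
        int total = 0;
        while (total < count)
        {
            SocketError error;
            int sent = socket.Send(buf, offset + total, count - total,
                                   SocketFlags.None, out error);
            if (error == SocketError.WouldBlock)
                break;                  // transfer control to the user so
                                        // they can drain the resultsets
            if (error != SocketError.Success)
                throw new SocketException((int)error);
            total += sent;
        }
        return total;                   // caller resumes sending from here
    }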



Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-14 Thread Shay Rojansky
On Sun, Jun 14, 2015 at 6:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  [ rely on non-blocking sockets to avoid deadlock ]

 Yeah, that's pretty much the approach libpq has taken: write (or read)
 when you can, but press on when you can't.


Good to hear.

  The main issue I'm concerned about is SSL/TLS, which is a layer on top
  of the sockets and which might not work well with non-blocking sockets...

 We have not had word of any such problem with libpq.  It's possible that
 the intersection of SSL users with non-blocking-mode users is nil, but
 I kinda doubt that.  You do need to interpret openssl's return codes
 correctly ...


I don't think there's a problem with non-blocking I/O and SSL per se; the
question is about the .NET TLS/SSL implementation Npgsql uses - so it's
really totally unrelated to PostgreSQL...

Shay


[HACKERS] Entities created in one query not available in another in extended protocol

2015-06-11 Thread Shay Rojansky
In Npgsql, the .NET driver for PostgreSQL, we've switched from simple to
extended protocol and have received a user complaint.

It appears that when we send two messages in an extended protocol (so two
Parse/Bind/Execute followed by a single Sync), where the first one creates
some entity (function, table), the second one can't query that entity
(not found). This isn't terribly important but does seem a bit odd; I
wanted to make sure you're aware of this.

Thanks,

Shay


Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-11 Thread Shay Rojansky
I just understood the same thing Tom wrote: yes, Npgsql (currently) sends
Parse for the second command before sending Execute for the first one. I
will look into that implementation decision. It might be worth looking into
Simon's comment though; I'll report if I still see the same problematic
behavior after reordering the messages (assuming we do reorder).

Thanks for your inputs...

On Thu, Jun 11, 2015 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:
  On 11 June 2015 at 11:20, Shay Rojansky r...@roji.org wrote:
  It appears that when we send two messages in an extended protocol (so
  two Parse/Bind/Execute followed by a single Sync), where the first one
  creates some entity (function, table), the second one can't query that
  entity (not found). This isn't terribly important but does seem a bit
  odd, I wanted to make sure you're aware of this.

  Sounds somewhat unlikely, but thank you for the report. Can we see a test
  case?

  Most commonly in such cases the first request failed and error messages
  weren't checked before running the second message.

 I'm wondering if it was really more like
 Parse/Parse/Bind/Bind/Execute/Execute/Sync, in which case the described
 behavior wouldn't be too surprising at all.

 I do note that if a transaction is implicitly started to execute these
 messages, it's not closed until Sync.  But that should only affect the
 visibility of the results to other sessions, not to the current one.
 There's definitely a CommandCounterIncrement in exec_execute_message ...

 regards, tom lane
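
For clarity, the ordering distinction in pseudocode (the Write* helper
names are invented). Parse analysis of the second statement resolves the
entities it references, so it must come after the first Execute:

    // problematic: q2 is parsed before q1's Execute has created the entity
    WriteParse(q1); WriteParse(q2);
    WriteBind(q1);  WriteBind(q2);
    WriteExecute(q1); WriteExecute(q2);
    WriteSync();

    // safe: each statement is completed before the next one is parsed
    WriteParse(q1); WriteBind(q1); WriteExecute(q1);
    WriteParse(q2); WriteBind(q2); WriteExecute(q2);
    WriteSync();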



Re: [HACKERS] Cancel race condition

2015-06-11 Thread Shay Rojansky
Thanks for the extra consideration Robert.

Since I'm implementing a generic driver, users can send either
single-statement transactions or actual multiple-statement transactions.
However, whether we're in a transaction or not doesn't seem to affect
Npgsql logic (unless I'm missing something) - if the cancellation does hit
a query the transaction will be cancelled and it's up to the user to roll
it back as is required in PostgreSQL...


On Thu, Jun 11, 2015 at 9:50 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Jun 9, 2015 at 4:42 AM, Shay Rojansky r...@roji.org wrote:
  Ah, OK - I wasn't aware that cancellation was actually delivered as a
  regular POSIX signal... You're right about the lack of guarantees then.
 
  In that case, I'm guessing not much could be done to guarantee sane
  cancellation behavior... I do understand the best effort idea around
  cancellations. However, it seems different to say we'll try our best and
  the cancellation may not be delivered (no bad consequences except maybe
  performance), and to say we'll try our best but the cancellation may be
  delivered randomly to any query you send from the moment you send the
  cancellation. The second makes it very difficult to design a 100% sane,
  deterministic application... Any plans to address this in protocol 4?
 
  Would you have any further recommendations or guidelines to make the
  situation as sane as possible? I guess I could block any new SQL queries
  while a cancellation on that connection is still outstanding (meaning
  that the cancellation connection hasn't yet been closed). As you
  mentioned this wouldn't be a 100% solution since it would only cover
  signal sending, but
  better than nothing?

 Blocking new queries seems like a good idea.  Note that the entire
 transaction (whether single-statement or multi-statement) will be
 aborted, or at least the currently-active subtransaction, not just the
 current query.  If you're using single-statement transactions I guess
 there is not much practical difference, but if you are using
 multi-statement transactions the application kind of needs to be aware
 of this, since it needs to know that any work it did got rolled back,
 and everything's going to fail up until the current (sub)transaction
 is rolled back.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
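
A sketch of the "block new queries" idea in C# (class and member names
invented): the driver holds back further statements until the server has
closed the cancel connection.

    using System.Threading;

    class CancellationGate
    {
        readonly object _lock = new object();
        bool _cancelInFlight;

        public void BeginCancel()              // cancel connection opened
        {
            lock (_lock) { _cancelInFlight = true; }
        }

        public void EndCancel()                // server closed the cancel connection
        {
            lock (_lock)
            {
                _cancelInFlight = false;
                Monitor.PulseAll(_lock);
            }
        }

        public void WaitBeforeSendingQuery()   // call before each new statement
        {
            lock (_lock)
            {
                while (_cancelInFlight)
                    Monitor.Wait(_lock);
            }
        }
    }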



Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-11 Thread Shay Rojansky
Thanks everyone for your time (or rather sorry for having wasted it).

Just in case it's interesting to you... The reason we implemented things
this way is in order to avoid a deadlock situation - if we send two queries
as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset,
PostgreSQL may block writing the resultset, since Npgsql isn't reading it
at that point. Npgsql on its part may get stuck writing the second query
(if it's big enough) since PostgreSQL isn't reading on its end (thanks to
Emil Lenngren for pointing this out originally).

Of course this isn't an excuse for anything, we're looking into ways of
solving this problem differently in our driver implementation.

Shay

On Thu, Jun 11, 2015 at 6:17 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 June 2015 at 16:56, Shay Rojansky r...@roji.org wrote:

 Npgsql (currently) sends Parse for the second command before sending
 Execute for the first one.


 Look no further than that.

 --
 Simon Riggs                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Cancel race condition

2015-06-09 Thread Shay Rojansky
Ah, OK - I wasn't aware that cancellation was actually delivered as a
regular POSIX signal... You're right about the lack of guarantees then.

In that case, I'm guessing not much could be done to guarantee sane
cancellation behavior... I do understand the best effort idea around
cancellations. However, it seems different to say we'll try our best and
the cancellation may not be delivered (no bad consequences except maybe
performance), and to say we'll try our best but the cancellation may be
delivered randomly to any query you send from the moment you send the
cancellation. The second makes it very difficult to design a 100% sane,
deterministic application... Any plans to address this in protocol 4?

Would you have any further recommendations or guidelines to make the
situation as sane as possible? I guess I could block any new SQL queries
while a cancellation on that connection is still outstanding (meaning that
the cancellation connection hasn't yet been closed). As you mentioned this
wouldn't be a 100% solution since it would only cover signal sending, but
better than nothing?


On Tue, Jun 9, 2015 at 1:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  My questions/comments:
 - Does PostgreSQL *guarantee* that once the connection used to send
 the
 cancellation request is closed by the server, the cancellation has
 been
 delivered (as mentioned by Tom)? In other words, should I be
 designing a
 .NET driver around this behavior?

 The signal's been *sent*.  Whether it's been *delivered* is something
 you'd have to ask your local kernel hacker.  The POSIX standard appears
 to specifically disclaim any such guarantee; in fact, it doesn't even
 entirely promise that a self-signal is synchronous.  There are also
 issues like what if the target process currently has signals blocked;
 does delivery mean that the signal handler's been entered, or something
 weaker?

 In any case, Postgres has always considered that query cancel is a best
 effort thing, so even if I thought this was 100% portably reliable,
 I would not be in favor of promising anything in the docs.

 regards, tom lane



[HACKERS] Cancel race condition

2015-06-08 Thread Shay Rojansky
Hi everyone.

I'm working on Npgsql and have run into a race condition when cancelling.
The issue is described in the following 10-year-old thread, and I'd like to
make sure things are still the same:
http://www.postgresql.org/message-id/27126.1126649...@sss.pgh.pa.us

My questions/comments:

   - Does PostgreSQL *guarantee* that once the connection used to send the
   cancellation request is closed by the server, the cancellation has been
   delivered (as mentioned by Tom)? In other words, should I be designing a
   .NET driver around this behavior?
   - If so, may I suggest updating the protocol docs to reflect this
   (http://www.postgresql.org/docs/current/static/protocol-flow.html#AEN103033)
   - I'm not sure if there's some sort of feature/request list for protocol
   4, but it may make sense to provide a simpler solution for this problem.
   One example would be for the client to send some sort of numeric ID
   identifying each command (some autoincrement), and to include that ID when
   cancelling. I'm not sure the benefits are worth the extra payload but it
   may be useful for other functionality as well (tracking/logging)? Just a
   thought.

Thanks,

Shay
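
For context, the cancel request in question is a 16-byte message sent on a
fresh connection; a sketch of building it in C# (the pid and secret key
come from the BackendKeyData message received at connection startup):

    using System.Net.Sockets;

    static void SendCancelRequest(string host, int port, int backendPid, int secretKey)
    {
        using (var client = new TcpClient(host, port))
        {
            var msg = new byte[16];
            WriteInt32BE(msg, 0, 16);          // length, including itself
            WriteInt32BE(msg, 4, 80877102);    // cancel request code
            WriteInt32BE(msg, 8, backendPid);  // from BackendKeyData
            WriteInt32BE(msg, 12, secretKey);  // from BackendKeyData
            client.GetStream().Write(msg, 0, 16);
            // no reply: the server simply closes this connection when done
        }
    }

    static void WriteInt32BE(byte[] buf, int offset, int value)
    {
        buf[offset]     = (byte)(value >> 24);
        buf[offset + 1] = (byte)(value >> 16);
        buf[offset + 2] = (byte)(value >> 8);
        buf[offset + 3] = (byte)value;
    }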


Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-19 Thread Shay Rojansky
Sorry to revive this thread, I just had one additional thought...

To those advocating the deprecation of the max_rows parameter of Execute,
there's another argument to consider. max_rows isn't just there in order to
fetch, say, a single row of the result set and discard the rest (which is
what I originally asked about). There's also the function of retrieving the
resultset in chunks: getting 5 rows, then 10, etc. etc. Deprecating
max_rows would leave the user/driver only with the option of retrieving the
entire resultset in one go (for example, no option for the interleaving of
rows from several resultsets). And the lack of the ability to execute and
retrieve 0 rows hurts this scenario as well.

Just wanted to put it out there as another argument against deprecation.

On Wed, Feb 11, 2015 at 2:05 AM, Shay Rojansky r...@roji.org wrote:

 Thanks for understanding Robert, that's more or less what I had in mind, I
 was mainly wondering if I were missing some deeper explanation for the
 absence of the possibility of requesting 0 rows.

 Regardless of all of the above, it's really no big deal. I'll go ahead and
 use max_rows=1 for now, hopefully you guys don't decide to deprecate it.

 Shay

 On Tue, Feb 10, 2015 at 3:00 PM, Robert Haas robertmh...@gmail.com
 wrote:

 On Sun, Feb 8, 2015 at 3:56 AM, Shay Rojansky r...@roji.org wrote:
  Just to be precise: what is strange to me is that the max_rows feature
  exists but has no 0 value. You and Marko are arguing that the whole
  feature should be deprecated (i.e. always return all rows).

 I think the fact that it has no zero value is probably just a
 historical accident; most likely, whoever designed it originally
 (probably twenty years ago) didn't think about queries with
 side-effects and therefore didn't consider that wanting 0 rows would
 ever be sensible.  Meanwhile, a sentinel value was needed to request
 all rows, so they used 0.  If they'd thought of it, they might have
 picked -1 and we'd not be having this discussion.

 FWIW, I'm in complete agreement that it would be good if we had this
 feature.  I believe this is not the first report we've had of
 PostgreSQL doing things in ways that don't mesh nicely with standardized
 driver interfaces.  Whether we think those interfaces are
 well-designed or not, they are standardized.  When people use $OTHERDB
 and have a really great driver, and then they move to PostgreSQL and
 get one with more warts, it does not encourage them to stick with
 PostgreSQL.

 .NET is not some fringe user community that we can dismiss as
 irrelevant.  We need users of all languages to want to use PostgreSQL,
 not just users of languages any one of us happens to personally like.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
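
On the wire, the chunked fetching discussed above is just the row-limit
field of the extended protocol's Execute message; a sketch of encoding it
(helper names invented). The server replies with PortalSuspended when the
limit is reached, and the portal can then be executed again:

    using System.Collections.Generic;
    using System.Text;

    // Execute: Byte1('E'), Int32 length, String portal name, Int32 max rows
    static byte[] BuildExecute(string portal, int maxRows)
    {
        var body = new List<byte>();
        body.AddRange(Encoding.ASCII.GetBytes(portal));
        body.Add(0);                      // null-terminate the portal name
        AddInt32BE(body, maxRows);        // 0 means "no limit"; there is no
                                          // way to ask for zero rows
        var msg = new List<byte> { (byte)'E' };
        AddInt32BE(msg, 4 + body.Count);  // length includes itself
        msg.AddRange(body);
        return msg.ToArray();
    }

    static void AddInt32BE(List<byte> buf, int value)
    {
        buf.Add((byte)(value >> 24));
        buf.Add((byte)(value >> 16));
        buf.Add((byte)(value >> 8));
        buf.Add((byte)value);
    }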





Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-10 Thread Shay Rojansky
Thanks for understanding Robert, that's more or less what I had in mind, I
was mainly wondering if I were missing some deeper explanation for the
absence of the possibility of requesting 0 rows.

Regardless of all of the above, it's really no big deal. I'll go ahead and
use max_rows=1 for now, hopefully you guys don't decide to deprecate it.

Shay

On Tue, Feb 10, 2015 at 3:00 PM, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Feb 8, 2015 at 3:56 AM, Shay Rojansky r...@roji.org wrote:
  Just to be precise: what is strange to me is that the max_rows feature
  exists but has no 0 value. You and Marko are arguing that the whole
  feature should be deprecated (i.e. always return all rows).

 I think the fact that it has no zero value is probably just a
 historical accident; most likely, whoever designed it originally
 (probably twenty years ago) didn't think about queries with
 side-effects and therefore didn't consider that wanting 0 rows would
 ever be sensible.  Meanwhile, a sentinel value was needed to request
 all rows, so they used 0.  If they'd thought of it, they might have
 picked -1 and we'd not be having this discussion.

 FWIW, I'm in complete agreement that it would be good if we had this
 feature.  I believe this is not the first report we've had of
 PostgreSQL doing things in ways that don't mesh nicely with standardized
 driver interfaces.  Whether we think those interfaces are
 well-designed or not, they are standardized.  When people use $OTHERDB
 and have a really great driver, and then they move to PostgreSQL and
 get one with more warts, it does not encourage them to stick with
 PostgreSQL.

 .NET is not some fringe user community that we can dismiss as
 irrelevant.  We need users of all languages to want to use PostgreSQL,
 not just users of languages any one of us happens to personally like.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-08 Thread Shay Rojansky
First a general comment:

 Then the driver writers that need these special API behaviors are
 reasonably expected to contribute to adding them to backend products that
 do not already have them.  The database developers are not going to take
 on responsibility for the API decisions of others; and features deemed
 (or that in reality are) of marginal usefulness are likely to be omitted -
 intentionally or otherwise - from the official (in this case libpq)
 protocol.

I absolutely agree with you there, I'm not trying to get anybody to
implement something I need (i.e. fetch 0 rows). This is more of a general
discussion as to whether that feature *makes sense* to you as a protocol
feature (which doesn't seem to be the case, as some of you guys want to
deprecate the whole max_rows thing).

 I'll clarify just a little... I am indeed talking about the PostgreSQL
 network protocol, and not about query optimization (with LIMIT or
 omitting RETURNING etc.). I am implementing ADO.NET's ExecuteNonQuery,
 through which the user indicates they're not interested in any result
 rows whether those exist or not.

 ExecuteNonQuery returns an integer while row-returning queries do not.
 I'd argue that the API states that the user is declaring that the query
 they are executing does not return any actual rows - just a count of
 affected rows - not that they do not care to see what rows are returned.

That's true. IMHO the count of affected rows isn't relevant to this
discussion so I didn't mention it.

 For the situation where a user does ExecuteNonQuery but the query returns
 result rows, the driver can save the needless network transfers. We can
 definitely say it's the user's fault for providing a query with a
 resultset to ExecuteNonQuery, but we *do* have the user's clear intention
 that no
 rows be fetched so why not act on it. I agree this isn't a terribly
 important optimization, the trigger for this question was first and
 foremost curiosity: it seems strange the protocol allows you to specify
 max_rows for any value other than 0.

 Yes, it does seem strange and, like Marko said, ideally would be
 deprecated.  The fact that it cannot handle zero rows seems like an
 unnecessary limitation and I cannot imagine that any values other than 0
 and all would be of practical usage.  In the case of zero, returning
 instead the number of rows would be more useful than simply refusing to
 return anything, so even if something like this is needed the current
 implementation is flawed.

Just to be precise: what is strange to me is that the max_rows feature
exists but has no 0 value. You and Marko are arguing that the whole feature
should be deprecated (i.e. always return all rows).

 Here's a possible believable use-case which doesn't involve user neglect:
 imagine some server-side function which has side-effects and also returns
 some rows. In some situations the user is interested in the result rows,
 but in others they only want the side-effect. The user would probably
 have no control over the function, and their only way to *not* transfer
 the
 result rows would be with a mechanism such as max_rows.

 Functions always return rows and so should not be executed using
 ExecuteNonQuery.  In most cases action-oriented functions return a
 single result-status row so ignoring that row, while likely not
 advisable, is not exactly expensive.

Your description of functions doesn't hold for all functions; this is why I
tried to provide a use case. It is possible for some function to both have a
side-effect (i.e. modify some table) *and* return a large number of rows. It
may be legitimate for a user to want to have the side-effect but not care
about the rows.  Ignoring one row isn't expensive, ignoring many could be.

 The basic question here becomes - the executor already must generate, in
 memory, all of the rows, so is there a way to properly interact with the
 server where you can request the number of rows that were generated but
 not be obligated to actually pull them down to the client.  This doesn't
 seem like an unreasonable request but assuming that it is not currently
 possible (of which I have little clue) then the question becomes who
 cares enough to design and implement such a protocol enhancement.

OK.

 More to the point, doesn't max_rows=1 have exactly the same dangers as
 LIMIT 1? The two seem to be identical, except that one is expressed in
 the SQL query and the other at the network protocol level.

 The planner does not have access to "network protocol level" options while
 it does know about LIMIT.

That's an internal PostgreSQL matter (which granted, may impact efficiency).
My comment about max_rows being equivalent to LIMIT was meant to address
Marko's argument that max_rows is dangerous because any row might come out
and tests may pass accidentally (but that holds for LIMIT 1 as well,
doesn't it).

 Expecting users to use an API without knowledge or control of the SQL that
 is being executed seems like a stretch to me.  

Re: [HACKERS] Fetch zero result rows when executing a query?

2015-02-07 Thread Shay Rojansky
Sorry everyone, I was unexpectedly very busy and couldn't respond
earlier... My apologies.

I'll clarify just a little... I am indeed talking about the PostgreSQL
network protocol, and not about query optimization (with LIMIT or omitting
RETURNING etc.). I am implementing ADO.NET's ExecuteNonQuery, through which
the user indicates they're not interested in any result rows whether those
exist or not. For the situation where a user does ExecuteNonQuery but the
query returns result rows, the driver can save the needless network
transfers. We can definitely say it's the user's fault for providing a
query with a resultset to ExecuteNonQuery, but we *do* have the user's
clear intention that no rows be fetched so why not act on it. I agree this
isn't a terribly important optimization, the trigger for this question was
first and foremost curiosity: it seems strange the protocol allows you to
specify max_rows for any value other than 0.

Here's a possible believable use-case which doesn't involve user neglect:
imagine some server-side function which has side-effects and also returns
some rows. In some situations the user is interested in the result rows,
but in others they only want the side-effect. The user would probably have
no control over the function, and their only way to *not* transfer the
result rows would be with a mechanism such as max_rows.

Marko, regarding your general criticism of max_rows:

 This seems to be a common pattern, and I think it's a *huge* mistake to
 specify maxrows=1 and/or ignore rows after the first one in the driver
 layer.  If the user says "give me the only row returned by this query",
 the interface should check that only one row is in reality returned by
 the query.  If the query returns more than one row, the user made a
 mistake in formulating the query and she probably wants to know about it.
 If she genuinely doesn't care about the rows after the first one, she can
 always specify LIMIT 1.

 For a sad example, look at PL/PgSQL's SELECT .. INTO ..; it's not
 terribly difficult to write a query which returns more than one row *by
 mistake* and have something really bad happen later on since it went
 undetected during testing because you just happened to get the expected
 row back first.  And when you do want to specifically enforce it for
 e.g. security critical code, you have to resort to really ugly hacks
 like window functions.

There are some problems with what you say... First, the ADO.NET API
provides a SingleRow API option which explicitly provides exactly this.
This API option doesn't at all mean that there *should* be only one row
(i.e. an error should be raised if otherwise), but simply that any other
rows beyond the first should be discarded. So regardless of what we think
best practices are on this, this behavior is mandated/specified by a major
API.

More to the point, doesn't max_rows=1 have exactly the same dangers as
LIMIT 1? The two seem to be identical, except that one is expressed in the
SQL query and the other at the network protocol level. The way I see it, if
the user specifies one of them without specifying ORDER BY, they are
explicitly saying they don't care which row comes out. And if their testing
code fails because this is wrong, then they've made a mistake - IMHO this
isn't a reason to kill the entire feature.

In general, in my view it's beneficial to separate between the SQL queries
and the features that the driver is supposed to provide in its API. The SQL
may be written or managed by one entity, reused in many places (some of
which want all rows and others which want only 1).
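
The two ADO.NET behaviors under discussion, side by side (query text is a
placeholder); note that neither gives the driver a way to ask the server
for zero rows, which is the gap being debated:

    using System.Data;
    using Npgsql;

    static void Demo(NpgsqlConnection conn)
    {
        // ExecuteNonQuery: the caller wants only the affected-row count,
        // yet any resultset the query produces is still sent on the wire
        using (var cmd = new NpgsqlCommand(
            "SELECT function_with_side_effects()", conn))
            cmd.ExecuteNonQuery();

        // SingleRow: the standard option mandating "first row only,
        // discard the rest" - a natural fit for Execute with max_rows=1
        using (var cmd = new NpgsqlCommand("SELECT * FROM t", conn))
        using (var reader = cmd.ExecuteReader(CommandBehavior.SingleRow))
        {
            if (reader.Read())
            {
                // consume the single row
            }
        }
    }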


  1   2   >