Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Oct 6, 2015 at 5:53 PM, Tom Lane  wrote:
>> I dunno, if you close a portal before you've gotten CommandComplete,
>> should you expect that all its actions were taken?  Arguably, that
>> should be more like a ROLLBACK.

> I dunno either, but a rollback would undo everything, and a commit
> would do everything.  Ending up in a state where we've done some of it
> but not all of it is strange.  Being able to run an unbounded number
> of commands without a CommandCounterIncrement is *really* strange.

I'm fairly sure that we *have* done all of it; the Portal code is careful
to execute DML commands to completion whether or not the client accepts
all the RETURNING rows.  It will become visible after you commit.  The
issue here is just whether it's visible to a subsequent Bind within the
same transaction.

> I'm not very sure what to do about it, though.

Possibly we need the close-portal message processing code to do the CCI
stuff if it observes that the Portal hasn't been run to completion.
(I think there is already enough state in the Portal struct to tell that,
though I'm too lazy to look right now.)

I'm concerned though whether this would amount to a protocol break.
It's certainly a behavioral change that we'd have to document.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Robert Haas
On Tue, Oct 6, 2015 at 5:53 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> From looking at the code, it appears to me that if the Execute is run
>> to completion, then its results will be seen by future statements, but
>> if the portal is closed earlier, then not.  See the end of
>> exec_execute_message.  The handler for the Close message (inside
>> PostgresMain) doesn't seem to do anything that would result in a
>> CommandCounterIncrement() or CommitTransactionCommand().
>
>> This does seem a little strange.
>
> I dunno, if you close a portal before you've gotten CommandComplete,
> should you expect that all its actions were taken?  Arguably, that
> should be more like a ROLLBACK.

I dunno either, but a rollback would undo everything, and a commit
would do everything.  Ending up in a state where we've done some of it
but not all of it is strange.  Being able to run an unbounded number
of commands without a CommandCounterIncrement is *really* strange.

I'm not very sure what to do about it, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane  wrote:
>> I'm concerned though whether this would amount to a protocol break.

> Me, too.

There are enough CCI's floating around the code that there may not
actually be any observable problem, at least not in typical cases.
That would explain the lack of complaints ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Robert Haas
On Sat, Oct 3, 2015 at 5:03 AM, Shay Rojansky  wrote:
> Hi hackers, some odd behavior has been reported with Npgsql and I wanted to
> get your help.
>
> Npgsql supports sending multiple SQL statements in a single packet via the
> extended protocol. This works fine, but when the second query SELECTs a
> value modified by the first's UPDATE, I'm getting a result as if the UPDATE
> hasn't yet occurred.
>
> The exact messages send by Npgsql are:
>
> Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ0)
> Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ1)
> Execute (portal=MQ0)
> Close (portal=MQ0)
> Execute (portal=MQ1)
> Close (portal=MQ1)
> Sync
>
> Instead of returning the expected 'foo' value set in the first command's
> UPDATE, I'm getting whatever value was previously there.
> Note that this happen regardless of whether a transaction is already set and
> of the isolation level.
>
> Is this the expected behavior, have I misunderstood the protocol specs?

>From looking at the code, it appears to me that if the Execute is run
to completion, then its results will be seen by future statements, but
if the portal is closed earlier, then not.  See the end of
exec_execute_message.  The handler for the Close message (inside
PostgresMain) doesn't seem to do anything that would result in a
CommandCounterIncrement() or CommitTransactionCommand().

This does seem a little strange.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Tom Lane
Robert Haas  writes:
> From looking at the code, it appears to me that if the Execute is run
> to completion, then its results will be seen by future statements, but
> if the portal is closed earlier, then not.  See the end of
> exec_execute_message.  The handler for the Close message (inside
> PostgresMain) doesn't seem to do anything that would result in a
> CommandCounterIncrement() or CommitTransactionCommand().

> This does seem a little strange.

I dunno, if you close a portal before you've gotten CommandComplete,
should you expect that all its actions were taken?  Arguably, that
should be more like a ROLLBACK.

Note there'd only be a difference in case of an operation with RETURNING,
else there's no way (at this level anyway) to pause a data-modifying
command mid-execution.  This logic all predates RETURNING, I think,
so maybe it does need to be revisited.  But it's not entirely clear
what should happen.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Robert Haas
On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane  wrote:
> I'm concerned though whether this would amount to a protocol break.

Me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-06 Thread Robert Haas
On Tue, Oct 6, 2015 at 6:18 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane  wrote:
>>> I'm concerned though whether this would amount to a protocol break.
>
>> Me, too.
>
> There are enough CCI's floating around the code that there may not
> actually be any observable problem, at least not in typical cases.
> That would explain the lack of complaints ...

It's pretty to think so, but I've been doing this long enough to be skeptical.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
Thanks for the help Tom and the others, I'll modify my sequence and report
if I encounter any further issues.

On Sun, Oct 4, 2015 at 7:36 PM, Tom Lane  wrote:

> Shay Rojansky  writes:
> >> To my mind there is not a lot of value in performing Bind until you
> >> are ready to do Execute.  The only reason the operations are separated
> >> in the protocol is so that you can do multiple Executes with a row limit
> >> on each one, to retrieve a large query result in chunks.
>
> > So you would suggest changing my message chain to send Bind right after
> > Execute, right? This would yield the following messages:
>
> > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)
>
> > This would mean that I would switch to using named statements and the
> > unnamed portal, rather than the current unnamed statement
> > and named portals. If I recall correctly, I was under the impression that
> > there are some PostgreSQL performance benefits to using the
> > unnamed statement over named statements, although I admit I can't find
> any
> > documentation backing that. Can you confirm that the two
> > are equivalent performance-wise?
>
> Hmm.  I do not recall exactly what performance optimizations apply to
> those two cases; they're probably not "equivalent", though I do not think
> the difference is major in either case.  TBH I was a bit surprised on
> reading your message to hear that the system would take that sequence at
> all; it's not obvious that it should be allowed to replace a statement,
> named or not, while there's an open portal that depends on it.
>
> I think you might have more issues with lifespans, since portals go away
> at commit whereas named statements don't.
>
> regards, tom lane
>


[HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
Hi hackers, some odd behavior has been reported with Npgsql and I wanted to
get your help.

Npgsql supports sending multiple SQL statements in a single packet via the
extended protocol. This works fine, but when the second query SELECTs a
value modified by the first's UPDATE, I'm getting a result as if the UPDATE
hasn't yet occurred.

The exact messages send by Npgsql are:

Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ0)
Parse (SELECT * FROM data WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ1)
Execute (portal=MQ0)
Close (portal=MQ0)
Execute (portal=MQ1)
Close (portal=MQ1)
Sync

Instead of returning the expected 'foo' value set in the first command's
UPDATE, I'm getting whatever value was previously there.
Note that this happen regardless of whether a transaction is already set
and of the isolation level.

Is this the expected behavior, have I misunderstood the protocol specs?

Thanks for your help, and please let me know if you need any more info.

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-05 Thread Shay Rojansky
>
> > So you would suggest changing my message chain to send Bind right after
> > Execute, right? This would yield the following messages:
>
> > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)
>
> > This would mean that I would switch to using named statements and the
> > unnamed portal, rather than the current unnamed statement
> > and named portals. If I recall correctly, I was under the impression that
> > there are some PostgreSQL performance benefits to using the
> > unnamed statement over named statements, although I admit I can't find
> any
> > documentation backing that. Can you confirm that the two
> > are equivalent performance-wise?
>
> Hmm.  I do not recall exactly what performance optimizations apply to
> those two cases; they're probably not "equivalent", though I do not think
> the difference is major in either case.  TBH I was a bit surprised on
> reading your message to hear that the system would take that sequence at
> all; it's not obvious that it should be allowed to replace a statement,
> named or not, while there's an open portal that depends on it.
>

One more important piece of information...

The reason Npgsql currently sends P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S is to
avoid deadlocks, I've already discussed this with you in
http://www.postgresql.org/message-id/cadt4rqb+fbtqpte5ylz0hkb-2k-ngzhm2ybvj0tmc8rqbgf...@mail.gmail.com
.

Unfortunately, the alternative I proposed above, P1/P2/D1/B1/E1/D2/B2/E2/S,
suffers from the same issue: any sequence in which a Bind is sent after a
previous Execute is deadlock-prone - Execute causes PostgreSQL to start
writing a potentially large dataset, while Bind means the client may be
writing a potentially large parameter value.

In other words, unless I'm mistaken it seems there's no alternative but to
implement non-blocking I/O at the client side - write until writing would
block, switching to reading when that happens. This adds some substantial
complexity, especially with .NET's SSL/TLS implementation layer.

Or does anyone see some sort of alternative which I've missed?


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Andres Freund
On October 4, 2015 2:50:10 PM GMT+02:00, Shay Rojansky  wrote:
>>
>> > Npgsql supports sending multiple SQL statements in a single packet
>via
>> the extended protocol. This works fine, but when the second query
>SELECTs a
>> value modified by the first's UPDATE, I'm getting a result as if the
>> > UPDATE hasn't yet occurred.
>>
>> Looks like the first updating statement is not committed, assuming
>that
>> the two statements run in different transactions.
>>
>
>I did try to prefix the message chain with an explicit transaction
>BEGIN
>(with the several different isolation levels) without a difference in
>behavior.
>
>> The exact messages send by Npgsql are:
>> >
>> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
>> > Describe (statement=unnamed)
>> > Bind (statement=unnamed, portal=MQ0)
>> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed
>> > Describe (statement=unnamed)
>> > Bind (statement=unnamed, portal=MQ1)
>> > Execute (portal=MQ0)
>> > Close (portal=MQ0)
>> > Execute (portal=MQ1)
>> > Close (portal=MQ1)
>> > Sync
>>
>> I never used Npgsql so I don't know if there is something missing
>there.
>> Would you need an explicit commit before closing MQ0?
>>
>
>I guess this is exactly my question to PostgreSQL... But unless I'm
>misunderstanding the transaction semantics I shouldn't need to commit
>the
>first UPDATE in order to see its effect in the second SELECT...

Try adding a sync before the second execute.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> > Npgsql supports sending multiple SQL statements in a single packet via
> the extended protocol. This works fine, but when the second query SELECTs a
> value modified by the first's UPDATE, I'm getting a result as if the
> > UPDATE hasn't yet occurred.
>
> Looks like the first updating statement is not committed, assuming that
> the two statements run in different transactions.
>

I did try to prefix the message chain with an explicit transaction BEGIN
(with the several different isolation levels) without a difference in
behavior.

> The exact messages send by Npgsql are:
> >
> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ0)
> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> > Describe (statement=unnamed)
> > Bind (statement=unnamed, portal=MQ1)
> > Execute (portal=MQ0)
> > Close (portal=MQ0)
> > Execute (portal=MQ1)
> > Close (portal=MQ1)
> > Sync
>
> I never used Npgsql so I don't know if there is something missing there.
> Would you need an explicit commit before closing MQ0?
>

I guess this is exactly my question to PostgreSQL... But unless I'm
misunderstanding the transaction semantics I shouldn't need to commit the
first UPDATE in order to see its effect in the second SELECT...

Also I am not in clear what "statement=unnamed" means, but it is used
> twice. Is it possible that the update is overwritten with select before it
> executes?
>

statement=unnamed means that the destination statement is the unnamed
prepared statement (as described in
http://www.postgresql.org/docs/current/static/protocol-message-formats.html).
Right after the Parse I bind the unnamed statement which I just parsed to
cursor MQ0. In other words, Npgsql first parses the two queries and binds
them to portals MQ0 and MQ1, and only then executes both portals

BTW: Do you see the change after update in your DB if you look into it with
> another tool (e.g. psql)?
>

That's a good suggestion, I'll try to check it out, thanks!


[HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
Hi hackers, some odd behavior has been reported with Npgsql and I'm sure
you can help.

Npgsql supports sending multiple SQL statements in a single packet via the
extended protocol. This works fine, but when the second query SELECTs a
value modified by the first's UPDATE, I'm getting a result as if the UPDATE
hasn't yet occurred.

The exact messages send by Npgsql are:

Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ0)
Parse (SELECT * FROM data WHERE id=1), statement=unnamed
Describe (statement=unnamed)
Bind (statement=unnamed, portal=MQ1)
Execute (portal=MQ0)
Close (portal=MQ0)
Execute (portal=MQ1)
Close (portal=MQ1)
Sync

Instead of returning the expected 'foo' value set in the first command's
UPDATE, I'm getting whatever value was previously there.
Note that this happen regardless of whether a transaction is already set
and of the isolation level.

Is this the expected behavior, have I misunderstood the protocol specs?

Thanks for your help, and please let me know if you need any more info.

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Charles Clavadetscher
Hello

> Npgsql supports sending multiple SQL statements in a single packet via the 
> extended protocol. This works fine, but when the second query SELECTs a value 
> modified by the first's UPDATE, I'm getting a result as if the 
> UPDATE hasn't yet occurred.

Looks like the first updating statement is not committed, assuming that the two 
statements run in different transactions.

> The exact messages send by Npgsql are:
> 
> Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ0)
> Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ1)
> Execute (portal=MQ0)
> Close (portal=MQ0)
> Execute (portal=MQ1)
> Close (portal=MQ1)
> Sync

I never used Npgsql so I don't know if there is something missing there. Would 
you need an explicit commit before closing MQ0?
Also I am not in clear what "statement=unnamed" means, but it is used twice. Is 
it possible that the update is overwritten with select before it executes?

Just some thoughts, as I said I know nothing of Npgsql.

BTW: Do you see the change after update in your DB if you look into it with 
another tool (e.g. psql)?

Charles




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
>> To my mind there is not a lot of value in performing Bind until you
>> are ready to do Execute.  The only reason the operations are separated
>> in the protocol is so that you can do multiple Executes with a row limit
>> on each one, to retrieve a large query result in chunks.

> So you would suggest changing my message chain to send Bind right after
> Execute, right? This would yield the following messages:

> P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
> P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)

> This would mean that I would switch to using named statements and the
> unnamed portal, rather than the current unnamed statement
> and named portals. If I recall correctly, I was under the impression that
> there are some PostgreSQL performance benefits to using the
> unnamed statement over named statements, although I admit I can't find any
> documentation backing that. Can you confirm that the two
> are equivalent performance-wise?

Hmm.  I do not recall exactly what performance optimizations apply to
those two cases; they're probably not "equivalent", though I do not think
the difference is major in either case.  TBH I was a bit surprised on
reading your message to hear that the system would take that sequence at
all; it's not obvious that it should be allowed to replace a statement,
named or not, while there's an open portal that depends on it.

I think you might have more issues with lifespans, since portals go away
at commit whereas named statements don't.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> I'm fairly sure that the query snapshot is established at Bind time,
> which means that this SELECT will run with a snapshot that indeed
> does not see the effects of the UPDATE.
>
> To my mind there is not a lot of value in performing Bind until you
> are ready to do Execute.  The only reason the operations are separated
> in the protocol is so that you can do multiple Executes with a row limit
> on each one, to retrieve a large query result in chunks.
>

So you would suggest changing my message chain to send Bind right after
Execute, right? This would yield the following messages:

P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current
P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S)

This would mean that I would switch to using named statements and the
unnamed portal, rather than the current unnamed statement
and named portals. If I recall correctly, I was under the impression that
there are some PostgreSQL performance benefits to using the
unnamed statement over named statements, although I admit I can't find any
documentation backing that. Can you confirm that the two
are equivalent performance-wise?

Shay


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Shay Rojansky
>
> Try adding a sync before the second execute.
>

I tried inserting a Sync right before the second Execute, this caused an
error with the message 'portal "MQ1" does not exist'.
This seems like problematic behavior on its own, regardless of my issues
here (Sync shouldn't be causing an implicit close of the portal, should
it?).


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
>> Try adding a sync before the second execute.

> I tried inserting a Sync right before the second Execute, this caused an
> error with the message 'portal "MQ1" does not exist'.
> This seems like problematic behavior on its own, regardless of my issues
> here (Sync shouldn't be causing an implicit close of the portal, should
> it?).

Sync results in closing the transaction, if you've not explicitly executed
a BEGIN.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Odd query execution behavior with extended protocol

2015-10-04 Thread Tom Lane
Shay Rojansky  writes:
> Npgsql supports sending multiple SQL statements in a single packet via the
> extended protocol. This works fine, but when the second query SELECTs a
> value modified by the first's UPDATE, I'm getting a result as if the UPDATE
> hasn't yet occurred.

> The exact messages send by Npgsql are:

> Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ0)
> Parse (SELECT * FROM data WHERE id=1), statement=unnamed
> Describe (statement=unnamed)
> Bind (statement=unnamed, portal=MQ1)
> Execute (portal=MQ0)
> Close (portal=MQ0)
> Execute (portal=MQ1)
> Close (portal=MQ1)
> Sync

I'm fairly sure that the query snapshot is established at Bind time,
which means that this SELECT will run with a snapshot that indeed
does not see the effects of the UPDATE.

To my mind there is not a lot of value in performing Bind until you
are ready to do Execute.  The only reason the operations are separated
in the protocol is so that you can do multiple Executes with a row limit
on each one, to retrieve a large query result in chunks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers