Re: [HACKERS] Odd query execution behavior with extended protocol
On Tue, Oct 6, 2015 at 6:18 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane wrote: >>> I'm concerned though whether this would amount to a protocol break. > >> Me, too. > > There are enough CCI's floating around the code that there may not > actually be any observable problem, at least not in typical cases. > That would explain the lack of complaints ... It's pretty to think so, but I've been doing this long enough to be skeptical. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
Robert Haas writes: > On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane wrote: >> I'm concerned though whether this would amount to a protocol break. > Me, too. There are enough CCI's floating around the code that there may not actually be any observable problem, at least not in typical cases. That would explain the lack of complaints ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
On Tue, Oct 6, 2015 at 6:10 PM, Tom Lane wrote: > I'm concerned though whether this would amount to a protocol break. Me, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
Robert Haas writes: > On Tue, Oct 6, 2015 at 5:53 PM, Tom Lane wrote: >> I dunno, if you close a portal before you've gotten CommandComplete, >> should you expect that all its actions were taken? Arguably, that >> should be more like a ROLLBACK. > I dunno either, but a rollback would undo everything, and a commit > would do everything. Ending up in a state where we've done some of it > but not all of it is strange. Being able to run an unbounded number > of commands without a CommandCounterIncrement is *really* strange. I'm fairly sure that we *have* done all of it; the Portal code is careful to execute DML commands to completion whether or not the client accepts all the RETURNING rows. It will become visible after you commit. The issue here is just whether it's visible to a subsequent Bind within the same transaction. > I'm not very sure what to do about it, though. Possibly we need the close-portal message processing code to do the CCI stuff if it observes that the Portal hasn't been run to completion. (I think there is already enough state in the Portal struct to tell that, though I'm too lazy to look right now.) I'm concerned though whether this would amount to a protocol break. It's certainly a behavioral change that we'd have to document. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
On Tue, Oct 6, 2015 at 5:53 PM, Tom Lane wrote: > Robert Haas writes: >> From looking at the code, it appears to me that if the Execute is run >> to completion, then its results will be seen by future statements, but >> if the portal is closed earlier, then not. See the end of >> exec_execute_message. The handler for the Close message (inside >> PostgresMain) doesn't seem to do anything that would result in a >> CommandCounterIncrement() or CommitTransactionCommand(). > >> This does seem a little strange. > > I dunno, if you close a portal before you've gotten CommandComplete, > should you expect that all its actions were taken? Arguably, that > should be more like a ROLLBACK. I dunno either, but a rollback would undo everything, and a commit would do everything. Ending up in a state where we've done some of it but not all of it is strange. Being able to run an unbounded number of commands without a CommandCounterIncrement is *really* strange. I'm not very sure what to do about it, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
Robert Haas writes: > From looking at the code, it appears to me that if the Execute is run > to completion, then its results will be seen by future statements, but > if the portal is closed earlier, then not. See the end of > exec_execute_message. The handler for the Close message (inside > PostgresMain) doesn't seem to do anything that would result in a > CommandCounterIncrement() or CommitTransactionCommand(). > This does seem a little strange. I dunno, if you close a portal before you've gotten CommandComplete, should you expect that all its actions were taken? Arguably, that should be more like a ROLLBACK. Note there'd only be a difference in case of an operation with RETURNING, else there's no way (at this level anyway) to pause a data-modifying command mid-execution. This logic all predates RETURNING, I think, so maybe it does need to be revisited. But it's not entirely clear what should happen. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
On Sat, Oct 3, 2015 at 5:03 AM, Shay Rojansky wrote: > Hi hackers, some odd behavior has been reported with Npgsql and I wanted to > get your help. > > Npgsql supports sending multiple SQL statements in a single packet via the > extended protocol. This works fine, but when the second query SELECTs a > value modified by the first's UPDATE, I'm getting a result as if the UPDATE > hasn't yet occurred. > > The exact messages send by Npgsql are: > > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ0) > Parse (SELECT * FROM data WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ1) > Execute (portal=MQ0) > Close (portal=MQ0) > Execute (portal=MQ1) > Close (portal=MQ1) > Sync > > Instead of returning the expected 'foo' value set in the first command's > UPDATE, I'm getting whatever value was previously there. > Note that this happen regardless of whether a transaction is already set and > of the isolation level. > > Is this the expected behavior, have I misunderstood the protocol specs? >From looking at the code, it appears to me that if the Execute is run to completion, then its results will be seen by future statements, but if the portal is closed earlier, then not. See the end of exec_execute_message. The handler for the Close message (inside PostgresMain) doesn't seem to do anything that would result in a CommandCounterIncrement() or CommitTransactionCommand(). This does seem a little strange. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
> > > So you would suggest changing my message chain to send Bind right after > > Execute, right? This would yield the following messages: > > > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current > > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S) > > > This would mean that I would switch to using named statements and the > > unnamed portal, rather than the current unnamed statement > > and named portals. If I recall correctly, I was under the impression that > > there are some PostgreSQL performance benefits to using the > > unnamed statement over named statements, although I admit I can't find > any > > documentation backing that. Can you confirm that the two > > are equivalent performance-wise? > > Hmm. I do not recall exactly what performance optimizations apply to > those two cases; they're probably not "equivalent", though I do not think > the difference is major in either case. TBH I was a bit surprised on > reading your message to hear that the system would take that sequence at > all; it's not obvious that it should be allowed to replace a statement, > named or not, while there's an open portal that depends on it. > One more important piece of information... The reason Npgsql currently sends P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S is to avoid deadlocks, I've already discussed this with you in http://www.postgresql.org/message-id/cadt4rqb+fbtqpte5ylz0hkb-2k-ngzhm2ybvj0tmc8rqbgf...@mail.gmail.com . Unfortunately, the alternative I proposed above, P1/P2/D1/B1/E1/D2/B2/E2/S, suffers from the same issue: any sequence in which a Bind is sent after a previous Execute is deadlock-prone - Execute causes PostgreSQL to start writing a potentially large dataset, while Bind means the client may be writing a potentially large parameter value. In other words, unless I'm mistaken it seems there's no alternative but to implement non-blocking I/O at the client side - write until writing would block, switching to reading when that happens. This adds some substantial complexity, especially with .NET's SSL/TLS implementation layer. Or does anyone see some sort of alternative which I've missed?
Re: [HACKERS] Odd query execution behavior with extended protocol
Thanks for the help Tom and the others, I'll modify my sequence and report if I encounter any further issues. On Sun, Oct 4, 2015 at 7:36 PM, Tom Lane wrote: > Shay Rojansky writes: > >> To my mind there is not a lot of value in performing Bind until you > >> are ready to do Execute. The only reason the operations are separated > >> in the protocol is so that you can do multiple Executes with a row limit > >> on each one, to retrieve a large query result in chunks. > > > So you would suggest changing my message chain to send Bind right after > > Execute, right? This would yield the following messages: > > > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current > > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S) > > > This would mean that I would switch to using named statements and the > > unnamed portal, rather than the current unnamed statement > > and named portals. If I recall correctly, I was under the impression that > > there are some PostgreSQL performance benefits to using the > > unnamed statement over named statements, although I admit I can't find > any > > documentation backing that. Can you confirm that the two > > are equivalent performance-wise? > > Hmm. I do not recall exactly what performance optimizations apply to > those two cases; they're probably not "equivalent", though I do not think > the difference is major in either case. TBH I was a bit surprised on > reading your message to hear that the system would take that sequence at > all; it's not obvious that it should be allowed to replace a statement, > named or not, while there's an open portal that depends on it. > > I think you might have more issues with lifespans, since portals go away > at commit whereas named statements don't. > > regards, tom lane >
Re: [HACKERS] Odd query execution behavior with extended protocol
Shay Rojansky writes: >> Try adding a sync before the second execute. > I tried inserting a Sync right before the second Execute, this caused an > error with the message 'portal "MQ1" does not exist'. > This seems like problematic behavior on its own, regardless of my issues > here (Sync shouldn't be causing an implicit close of the portal, should > it?). Sync results in closing the transaction, if you've not explicitly executed a BEGIN. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
Shay Rojansky writes: >> To my mind there is not a lot of value in performing Bind until you >> are ready to do Execute. The only reason the operations are separated >> in the protocol is so that you can do multiple Executes with a row limit >> on each one, to retrieve a large query result in chunks. > So you would suggest changing my message chain to send Bind right after > Execute, right? This would yield the following messages: > P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current > P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S) > This would mean that I would switch to using named statements and the > unnamed portal, rather than the current unnamed statement > and named portals. If I recall correctly, I was under the impression that > there are some PostgreSQL performance benefits to using the > unnamed statement over named statements, although I admit I can't find any > documentation backing that. Can you confirm that the two > are equivalent performance-wise? Hmm. I do not recall exactly what performance optimizations apply to those two cases; they're probably not "equivalent", though I do not think the difference is major in either case. TBH I was a bit surprised on reading your message to hear that the system would take that sequence at all; it's not obvious that it should be allowed to replace a statement, named or not, while there's an open portal that depends on it. I think you might have more issues with lifespans, since portals go away at commit whereas named statements don't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
> > I'm fairly sure that the query snapshot is established at Bind time, > which means that this SELECT will run with a snapshot that indeed > does not see the effects of the UPDATE. > > To my mind there is not a lot of value in performing Bind until you > are ready to do Execute. The only reason the operations are separated > in the protocol is so that you can do multiple Executes with a row limit > on each one, to retrieve a large query result in chunks. > So you would suggest changing my message chain to send Bind right after Execute, right? This would yield the following messages: P1/P2/D1/B1/E1/D2/B2/E2/S (rather than the current P1/D1/B1/P2/D2/B2/E1/C1/E2/C2/S) This would mean that I would switch to using named statements and the unnamed portal, rather than the current unnamed statement and named portals. If I recall correctly, I was under the impression that there are some PostgreSQL performance benefits to using the unnamed statement over named statements, although I admit I can't find any documentation backing that. Can you confirm that the two are equivalent performance-wise? Shay
Re: [HACKERS] Odd query execution behavior with extended protocol
> > Try adding a sync before the second execute. > I tried inserting a Sync right before the second Execute, this caused an error with the message 'portal "MQ1" does not exist'. This seems like problematic behavior on its own, regardless of my issues here (Sync shouldn't be causing an implicit close of the portal, should it?).
Re: [HACKERS] Odd query execution behavior with extended protocol
Shay Rojansky writes: > Npgsql supports sending multiple SQL statements in a single packet via the > extended protocol. This works fine, but when the second query SELECTs a > value modified by the first's UPDATE, I'm getting a result as if the UPDATE > hasn't yet occurred. > The exact messages send by Npgsql are: > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ0) > Parse (SELECT * FROM data WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ1) > Execute (portal=MQ0) > Close (portal=MQ0) > Execute (portal=MQ1) > Close (portal=MQ1) > Sync I'm fairly sure that the query snapshot is established at Bind time, which means that this SELECT will run with a snapshot that indeed does not see the effects of the UPDATE. To my mind there is not a lot of value in performing Bind until you are ready to do Execute. The only reason the operations are separated in the protocol is so that you can do multiple Executes with a row limit on each one, to retrieve a large query result in chunks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
On October 4, 2015 2:50:10 PM GMT+02:00, Shay Rojansky wrote: >> >> > Npgsql supports sending multiple SQL statements in a single packet >via >> the extended protocol. This works fine, but when the second query >SELECTs a >> value modified by the first's UPDATE, I'm getting a result as if the >> > UPDATE hasn't yet occurred. >> >> Looks like the first updating statement is not committed, assuming >that >> the two statements run in different transactions. >> > >I did try to prefix the message chain with an explicit transaction >BEGIN >(with the several different isolation levels) without a difference in >behavior. > >> The exact messages send by Npgsql are: >> > >> > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed >> > Describe (statement=unnamed) >> > Bind (statement=unnamed, portal=MQ0) >> > Parse (SELECT * FROM data WHERE id=1), statement=unnamed >> > Describe (statement=unnamed) >> > Bind (statement=unnamed, portal=MQ1) >> > Execute (portal=MQ0) >> > Close (portal=MQ0) >> > Execute (portal=MQ1) >> > Close (portal=MQ1) >> > Sync >> >> I never used Npgsql so I don't know if there is something missing >there. >> Would you need an explicit commit before closing MQ0? >> > >I guess this is exactly my question to PostgreSQL... But unless I'm >misunderstanding the transaction semantics I shouldn't need to commit >the >first UPDATE in order to see its effect in the second SELECT... Try adding a sync before the second execute. Andres --- Please excuse brevity and formatting - I am writing this on my mobile phone. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd query execution behavior with extended protocol
> > > Npgsql supports sending multiple SQL statements in a single packet via > the extended protocol. This works fine, but when the second query SELECTs a > value modified by the first's UPDATE, I'm getting a result as if the > > UPDATE hasn't yet occurred. > > Looks like the first updating statement is not committed, assuming that > the two statements run in different transactions. > I did try to prefix the message chain with an explicit transaction BEGIN (with the several different isolation levels) without a difference in behavior. > The exact messages send by Npgsql are: > > > > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed > > Describe (statement=unnamed) > > Bind (statement=unnamed, portal=MQ0) > > Parse (SELECT * FROM data WHERE id=1), statement=unnamed > > Describe (statement=unnamed) > > Bind (statement=unnamed, portal=MQ1) > > Execute (portal=MQ0) > > Close (portal=MQ0) > > Execute (portal=MQ1) > > Close (portal=MQ1) > > Sync > > I never used Npgsql so I don't know if there is something missing there. > Would you need an explicit commit before closing MQ0? > I guess this is exactly my question to PostgreSQL... But unless I'm misunderstanding the transaction semantics I shouldn't need to commit the first UPDATE in order to see its effect in the second SELECT... Also I am not in clear what "statement=unnamed" means, but it is used > twice. Is it possible that the update is overwritten with select before it > executes? > statement=unnamed means that the destination statement is the unnamed prepared statement (as described in http://www.postgresql.org/docs/current/static/protocol-message-formats.html). Right after the Parse I bind the unnamed statement which I just parsed to cursor MQ0. In other words, Npgsql first parses the two queries and binds them to portals MQ0 and MQ1, and only then executes both portals BTW: Do you see the change after update in your DB if you look into it with > another tool (e.g. psql)? > That's a good suggestion, I'll try to check it out, thanks!
Re: [HACKERS] Odd query execution behavior with extended protocol
Hello > Npgsql supports sending multiple SQL statements in a single packet via the > extended protocol. This works fine, but when the second query SELECTs a value > modified by the first's UPDATE, I'm getting a result as if the > UPDATE hasn't yet occurred. Looks like the first updating statement is not committed, assuming that the two statements run in different transactions. > The exact messages send by Npgsql are: > > Parse (UPDATE data SET name='foo' WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ0) > Parse (SELECT * FROM data WHERE id=1), statement=unnamed > Describe (statement=unnamed) > Bind (statement=unnamed, portal=MQ1) > Execute (portal=MQ0) > Close (portal=MQ0) > Execute (portal=MQ1) > Close (portal=MQ1) > Sync I never used Npgsql so I don't know if there is something missing there. Would you need an explicit commit before closing MQ0? Also I am not in clear what "statement=unnamed" means, but it is used twice. Is it possible that the update is overwritten with select before it executes? Just some thoughts, as I said I know nothing of Npgsql. BTW: Do you see the change after update in your DB if you look into it with another tool (e.g. psql)? Charles -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers