> More generally speaking, the protocol appears to couple two different things which may be unrelated. On the one hand, we have a protocol > sync mechanism for error recovery (skip until Sync). One the other hand, we have an implicit transaction for extended query messages until > that same Sync. It seems valid to want to have error recovery without an implicit transaction, but this doesn't seem supported by the current > protocol (I could add a note for v4).
In the absence of any response on my message from September 28th, I've added a todo item for wire protocol v4 (separate transaction delineation from protocol error recovery). Note that the same issue was discussed with Craig Ringer in https://www.postgresql.org/message-id/CAMsr%2BYEgnJ8ZAWPLx5%3DBCbYYq9SNTdwbwvUcb7V-vYm5d5uhbQ%40mail.gmail.com On Wed, Sep 28, 2016 at 6:04 PM, Shay Rojansky <r...@roji.org> wrote: > Hi everyone, I'd appreciate some guidance on an issue that's been raised > with Npgsql, input from other driver writers would be especially helpful. > > Npgsql currently supports batching (or pipelining) to avoid roundtrips, > and sends a Sync message only at the end of the batch (so > Parse1/Bind1/Describe1/Execute1/Parse2/Bind2/Describe2/Execute2/Sync). > The reasoning is that if the first statement in the batch fails, the others > shouldn't be processed. This seems to be the standard approach (the > proposed patch for libpq seems to do the same). > > At the same time, if the batch doesn't occur within an explicit > transaction (i.e. after BEGIN), it is automatically wrapped in an implicit > transaction, with Sync committing it. This can, for example, provoke > deadlocks if two batches try to update the same rows in reverse order. The > problem is that the user didn't request a transaction in any way - they're > just using batching to avoid roundtrips and their intention is to be in > autocommit mode. > > One possible solution for this would be to insert a Sync after every > execute in the batch, rather than a single Sync at the very end. This would > make batches work the same as unbatched statements, and would resolve the > deadlocks. However, behavior in case of error would be problematic: > PostgreSQL would continue executing later messages if earlier ones failed, > Npgsql would have to deal with multiple errors, etc. > > More generally speaking, the protocol appears to couple two different > things which may be unrelated. On the one hand, we have a protocol sync > mechanism for error recovery (skip until Sync). One the other hand, we have > an implicit transaction for extended query messages until that same Sync. > It seems valid to want to have error recovery without an implicit > transaction, but this doesn't seem supported by the current protocol (I > could add a note for v4). > > Finally, to give more context, a Microsoft developer ran into this while > running ASP.NET benchmarks over Npgsql and its Entity Framework Core ORM > provider. One of EFCore's great new features is that it batches database > updates into a single roundtrip, but this triggered deadlocks. Whereas in > many cases it's OK to tell users to solve the deadlocks by properly > ordering their statements, when an ORM is creating the batch it's a more > difficult proposition. > > Thanks for any thoughts or guidance! > > Shay >