[HACKERS] Batches, error handling and transaction in the protocol

Shay Rojansky Wed, 28 Sep 2016 11:05:05 -0700

Hi everyone, I'd appreciate some guidance on an issue that's been raised
with Npgsql, input from other driver writers would be especially helpful.


Npgsql currently supports batching (or pipelining) to avoid roundtrips, and
sends a Sync message only at the end of the batch (so
Parse1/Bind1/Describe1/Execute1/Parse2/Bind2/Describe2/Execute2/Sync). The
reasoning is that if the first statement in the batch fails, the others
shouldn't be processed. This seems to be the standard approach (the
proposed patch for libpq seems to do the same).

At the same time, if the batch doesn't occur within an explicit transaction
(i.e. after BEGIN), it is automatically wrapped in an implicit transaction,
with Sync committing it. This can, for example, provoke deadlocks if two
batches try to update the same rows in reverse order. The problem is that
the user didn't request a transaction in any way - they're just using
batching to avoid roundtrips and their intention is to be in autocommit
mode.

One possible solution for this would be to insert a Sync after every
execute in the batch, rather than a single Sync at the very end. This would
make batches work the same as unbatched statements, and would resolve the
deadlocks. However, behavior in case of error would be problematic:
PostgreSQL would continue executing later messages if earlier ones failed,
Npgsql would have to deal with multiple errors, etc.

More generally speaking, the protocol appears to couple two different
things which may be unrelated. On the one hand, we have a protocol sync
mechanism for error recovery (skip until Sync). One the other hand, we have
an implicit transaction for extended query messages until that same Sync.
It seems valid to want to have error recovery without an implicit
transaction, but this doesn't seem supported by the current protocol (I
could add a note for v4).

Finally, to give more context, a Microsoft developer ran into this while
running ASP.NET benchmarks over Npgsql and its Entity Framework Core ORM
provider. One of EFCore's great new features is that it batches database
updates into a single roundtrip, but this triggered deadlocks. Whereas in
many cases it's OK to tell users to solve the deadlocks by properly
ordering their statements, when an ORM is creating the batch it's a more
difficult proposition.

Thanks for any thoughts or guidance!

Shay

[HACKERS] Batches, error handling and transaction in the protocol

Reply via email to