Hello,
This is a follow-up to « Is "transaction.atomic" in 1.6 supposed to work this
way? ». Two options seem viable to me. We must chose one before releasing 1.6
RC1.
Other options have been proposed but I couldn't make them work satisfactorily.
In order to keep this discussion manageable, if you'd like to propose a new
idea, please make sure it can be implemented and pass Django's test suite
first, so we can discuss working code.
** The problem **
We're talking about the following scenario:
1) The program enters an `atomic` block (ie. Django begins a transaction or
creates a savepoint).
2) At some point an exception is thrown and the transaction is marked for
rollback.
a) This is expected when a DatabaseError, IntegrityError, etc. occurs.
b) It may also happen in scenarios involving ORM-related signals and
probably others I haven't discovered yet.
3) The program catches the exception.
4) The program attempts to execute SQL queries.
5) The program exits the `atomic` block with a rollback (ie. Django rolls back
the transaction or to the savepoint)
Here's the behavior of the database at step 4 :
- PostgreSQL doesn't allow running queries after an error has occurred in a
transaction. It raises an exception.
- MySQL and Oracle have rolled back the faulty statement and allow running
queries. This is called statement-level atomicity, and it makes it possible to
commit a transaction even though not all statements have executed. This
behavior is… interesting… (did someone just shout "foot gun" at the back of the
room?) but it exists and developers are using it.
- SQLite's behavior is undefined. It may raise an error like PostgreSQL, roll
back the last statement like MySQL, or roll back the entire transaction.
** Option 1 — keep the current behavior **
Currently, `atomic` doesn't attempt to normalize the behavior in the scenario
described above. The actual behavior depends on the underlying database. This
approach has two drawbacks.
A) If the exception that originally triggered the rollback isn't related to the
database, it's possible to run queries. However, their effect will be lost
during the final rollback at step 5. Developers may be surprised by this
rollback. It doesn't break the fundamental contract of `atomic` (atomicity) but
it's unexpected.
This is clearly an abstraction leak: `atomic` manages database transactions
through Python exceptions, but the two can get out of sync. Technically, it can
be traced to the `savepoint=False` option of `atomic`, but that's a red
herring, and I don't think we can do much better without either complicating
the `atomic` API or adding a large performance overhead.
B) If the exception was a database error, but the database allows running
queries after an error in a transaction (MySQL, Oracle), developers who attempt
to handle the error properly will be very surprised that a rollback still
happens.
It looks like both problems can be solved with a simple guideline: catch errors
"around" an `atomic` block rather than "inside". This ensures that the
transaction is rolled back to an usable state before you handle the exception.
One may have to insert an extra `atomic` block for this purpose. If we were
only concerned with database errors (2a), this would be manageable.
Unfortunately, there's also (2b); Anssi provided some valid examples. Python
being a dynamic language, we can't exclude other scenarios. So, while the
guideline is simple in theory, it isn't realistic to apply it to every
try/catch block.
** Option 2 — prevent running queries in broken transactions **
`atomic` would normalize the behavior across databases by raising an exception
at step 4. This approach is implemented in
https://github.com/django/django/pull/1659 (still needs docs). It also has a
few drawbacks.
C) It enforces PostgreSQL's behavior across all databases. Django tries to be
database agnostic, except that when it has to make a choice, it always chooses
PostgreSQL's behavior, which isn't nice to developers who favor other databases.
D) It triggers false positives. I had to change several tests in Django's test
suite to make them pass:
https://github.com/aaugustin/django/commit/a08e5dd403c40ae12c77a677e52997a73bd09dbb
Besides, some statements are allowed in broken transactions, even under
PostgreSQL. For instance ROLLBACK TO SAVEPOINT is allowed! I had to resort to a
pretty bad hack to allow nesting `constraint_checks_disabled` (a MySQL-specific
feature) inside `atomic`:
https://github.com/aaugustin/django/commit/9e50dadf14adac92f1faf0a928df3fac032bddc6
E) Developers will learn `set_rollback(False)` which allows preventing the
rollback. Misusing it can result in really hard to diagnose errors. It don't
find it safe for general consumption. (I have nothing against
`set_rollback(True)` to force a rollback without raising an exception.
** Recommendation **
I'm +0 for option 2 for the following reasons:
- failing loudly is better than failing silently
- detecting problems at run time is better than requiring code changes
(forgiveness vs. permission)
- solving design problems with documentation doesn't fly
Thoughts?
--
Aymeric.
--
You received this message because you are subscribed to the Google Groups
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.