Hello, I'd like to improve transactions handling in Django. The first step is the current emulation of autocommit with database-level autocommit.
** Rationale ** PEP249 says that any SQL query opens a transaction and that it's the application's job to commit (or rollback) changes. This model is also required by the SQL standard. But it isn't very developer-friendly. To alleviate the pain, Django commits after each ORM write. Unfortunately, this means that each read opens a transaction, that eventually gets committed after the next write, or rolled back at the end of the query. Such transactions are useless and don't come for free. Relying on them to enforce integrity is extremely fragile — what if an external library starts writing to a log table in the middle of one of these implicit transactions? The term "footgun" comes to mind. Database authors have reached the same conclusion, and most databases supported by Django use autocommit by default, ignoring the SQL standard. On PostgreSQL and SQLite, this is the only mode available. As a consequence, to implement the behavior mandated by PEP 249, the Python libraries (psycopg2, sqlite3, etc.) automatically start transactions. And then Django automatically commits them. This is not only wasteful, but also buggy. It's the root cause of "idle in transaction" connections on PostgreSQL. It's also sometimes poorly implemented: for instance, executing "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the design of the sqlite3 module. The Python bug tracker suggests it's going to be documented.) Basically, Django intends to provide autocommit by default. Rather than fight the database adapter that itselfs fights the database, I propose to simply turn autocommit on, and stop implicitly starting and committing transactions. Explicit is better than implicit. ** Implementation ** All databases supported by Django provide an API to turn autocommit on: - http://initd.org/psycopg/docs/connection.html#connection.autocommit - http://docs.python.org/2/library/sqlite3#sqlite3.Connection.isolation_level - http://mysql-python.sourceforge.net/MySQLdb.html => conn.autocommit() - http://cx-oracle.sourceforge.net/html/connection.html#Connection.autocommit This obviously has far-reaching consequences on transaction handling in Django, but the current APIs should still work. (Fixing them is part 2 of the plan.) The general idea is that Django will explicitly start a transaction when entering transaction management. This will obviously impact maintainers of backend for other databases, but if it works on Oracle (which doesn't have autocommit — it's emulated in OCI) and on PostgreSQL (which enforces autocommit), I hope it can work anywhere. ** Backwards-compatibility ** Roughly, I'd classify Django users in four groups: 1 - "Transactions, how do they work?" 2 - "Django autocommits, doesn't it?" 3 - "I'm using TransactionMiddleware!" 4 - "I'm managing my transactions." Groups 1 and 2 won't see the difference. There won't be any difference for group 3. Group 4 may be impacted by the change, but I believe most people in this category have autocommit turned on already — or would like to, if they're on MySQL — and will understand the change. I don't see much point in providing an option to turn autocommit off, because starting a transaction is a much more explicit way to achieve the same effect. We usually don't provide "backwards compatibility with bugs". Yay or nay? -- Aymeric. -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/django-developers?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
