Re: Switch to database-level autocommit

2013-03-07 Thread Shai Berger
On Wednesday 06 March 2013, Aymeric Augustin wrote:
> 
> So part 3 of the plan is to replace TransactionMiddleware by something
> based on transaction.atomic that actually works. For lack of a better
> idea, I'm considering deprecating the middleware and replacing it with a
> setting. This doesn't qualify as loose coupling; better ideas welcome!

Another half-baked idea: I think atomic_urlpatterns (which wrap the view 
callables in @atomic) could do a lot of the trick. They would leave middleware 
database accesses out of the transaction, which could turn out to be a 
"gotcha", but could be acceptable in many cases.

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-07 Thread Aymeric Augustin
On 6 mars 2013, at 23:21, Aymeric Augustin  
wrote:

>  using a WSGI middleware. It's two lines of code:
> 
> from django.core.wsgi import get_wsgi_application
> application = get_wsgi_application()
> 
> for db in ('default', 'other'):
> application = atomic(using=db)(application)
> 
> I'm leaning towards simply documenting that.


Sadly, this doesn't work. The atomic block needs to be inside Django's
exception hander. Otherwise, its __exit__ will never see an exception.

Sorry for the noise.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Jacob Kaplan-Moss
To clarify a little bit, the reasons I think it's a good idea are because:

* It's simple and clean.
* It doesn't require another setting, nor new middleware methods, nor
new anything actually.
* It makes Django inter-op a tiny bit better (if you want to use
Django's ORM and request-linked commits in a different framework,
well, now it's documented.)

On Wed, Mar 6, 2013 at 4:21 PM, Aymeric Augustin
 wrote:
> I'm leaning towards simply documenting that.

I think we should also have that line be in wsgi.py commented out (I
see a bunch of people using TransactionMiddleware in the wild, and we
need to give them a very easy upgrade path).

Jacob

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Aymeric Augustin
On 5 mars 2013, at 23:13, Aymeric Augustin  
wrote:

> For lack of a better idea, I'm considering deprecating the middleware and 
> replacing it with a setting. This doesn't qualify as loose coupling; better 
> ideas welcome!


Jacob just suggested on IRC using a WSGI middleware. It's two lines of code:

from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()

for db in ('default', 'other'):
application = atomic(using=db)(application)

I'm leaning towards simply documenting that.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Javier Guerra Giraldez
On Wed, Mar 6, 2013 at 4:15 AM, Jeremy Dunck  wrote:
> I, for one, would prefer that we not recommend TransactionMiddleware
> so highly.

then what would be the recommended option to have transactions tied to
the request/response cycle?

probably @commit_on_success for every view that modifies the database?
 i could live with that...

still, the "set and forget" nature of TransactionMiddleware is very attractive.


--
Javier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread kayess


On Wednesday, 6 March 2013 16:12:18 UTC+7, jdunck wrote:
>
> Can 
> you give a concrete example of an exception being raised at commit 
> time? 
>

Postgres allows for things like foreign key integrity checks to be made on 
commit (rather than when the data is entered). This makes it significantly 
easier to enter data where there is a circular dependency, and also reduces 
the amount of time certain locks are held. Django makes use of this feature.


Kirit

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Aymeric Augustin
On 6 mars 2013, at 08:49, Ian Kelly  wrote:

> On Tue, Mar 5, 2013 at 3:13 PM, Aymeric Augustin
>  wrote:
>> In the mean time, I discovered that it's impossible to implement
>> TransactionMiddleware reliably as a middleware, because there's no guarantee
>> that process_response or process_exception will be called.

I verified the code; here are a few scenarios where TransactionMiddleware
fails simply because it's implemented as a middleware.

(1) Another middleware's process_request returns a response before
TM.process_request can run. TM.process_response/process_exception
attempts to close a transaction than was never opened.

(2) Another middleware's process_request raises an exception after
TM.process_request has run. TM.process_exception is never called.

(3) Any middleware's process_view raises an exception.
TM.process_exception is never called.

(4) Any middleware's process_template_response raises an exception.
TM.process_exception is never called.

(5) Rendering a TemplateResponse raises an exception.
TM.process_exception is never called.

(6) The view returns normally. Another middleware's process_response
raises an exception before TM.process_response can run.

(7) The view returns an exception. Another middleware's process_exception
 raises an exception before TM.process_exception can run.

(8) The view returns an exception. Another middleware's process_exception
returns a response before TM.process_exception can run. Another
middleware's process_response raises an exception before
TM.process_response can run.


I've implemented atomic as a context manager so that Python itself
guarantee that calls to __enter__ and __exit__ will be balanced.

Django's middleware API requires to give up this property, and I'm very
reluctant to do that. If a transaction accidentally stays open, that's a major
data-loss bug, all the more since I added persistent connections.

(This doesn't happen in Django 1.5 because the HTTP handler unwinds the
"transaction management" stack and closes the connection in request_finished.)


> Perhaps this could be fixed.  We could add another middleware method
> called process_finally or somesuch, that would be guaranteed to be
> called on every request after the other methods.  It would take a
> request and response as arguments, and any return value would be
> ignored.  If it raised an exception, the exception would be caught,
> logged, and otherwise ignored.

I tried to implement this idea. It turns out we'd also need:
- a process_beforeall method that would be guaranteed to be called on
  every request before the other methods, to cater for case (1).
- a way to tell process_finally if an unhandled exception occurred, and
  it isn't clear to me if an exception dealt with by another middleware's
  process_exception should be considered handled or not.

Then TransactionMiddleware would look like this:

class TransactionMiddleware(object):
def __init__(self):
self.atomic = transaction.atomic()

def process_beforeall(self, request):
self.atomic.__enter__()

def process_finally(self, request, response, exception):
if exception:
traceback = exception.__traceback__ if six.PY3 else 
sys.exc_info()[2]
self.atomic.__exit__(type(exception), exception, traceback)
else:
self.atomic.__exit__(None, None, None)
return response


BaseHandler.get_response already has 8 "except" clauses, nested
three-levels deep, and despite this abundant error handling there's
many cases where calls aren't paired. I'm not thrilled by the idea of
making it even more complex.

There would certainly be some value in making the middleware API
more predictable. However, in the short term, adding two methods
only to support TransactionMiddleware feels a lot like shoehorning.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Jeremy Dunck
I, for one, would prefer that we not recommend TransactionMiddleware
so highly.  Now that I'm doing active development with Rails, it's
quite clear to me that a major difference in the ORM layers are which
DB they grew up with -- the django orm is tuned best to postgres,
while working passably on mysql, while the rails orm works well on
mysql (such as it can) and worse on postgres.

But in particular, when I had a site with moderate (median?) traffic
and slow-ish requests (200ms-2s), TransactionMiddleware was absolutely
a disaster on mysql because it handled locking very poorly and TM
makes transactions much longer than they typically need to be.
Postgres handles xact locking much better, so this pain is not felt.
So: I would prefer to make it clear that postgres is the best choice,
though mysql works well, but specifically warning against TM on mysql.


On Tue, Mar 5, 2013 at 6:03 AM, Chris Wilson  wrote:
> Hi Lennart,
>
>
> On Tue, 5 Mar 2013, Lennart Regebro wrote:
>
>> I do agree that 99.9% of the sites committing on the end of the request is
>> the right thing to do, and do think that this should be the default,
>> although I realize that debate is already lost.
>
>
> In a perfect academic world where there are never any errors, that approach
> is probably academically correct. In my opinion it makes correct error
> handling extremely difficult, because database errors (unique violations,
> concurrent modifications, etc) are thrown far away from the code that knows
> how to deal with them (the code that caused them in the first place).
>
> Cheers, Chris.
> --
> Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
> Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK
>
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Jeremy Dunck
I'm not sure what you're referring to here - integrity, uniqueness,
and locking are handled at the individual query level - transactions
just cause locks to be held (if needed) until commit or rollback.  Can
you give a concrete example of an exception being raised at commit
time?


On Tue, Mar 5, 2013 at 6:03 AM, Chris Wilson  wrote:
> Hi Lennart,
>
>
> On Tue, 5 Mar 2013, Lennart Regebro wrote:
>
>> I do agree that 99.9% of the sites committing on the end of the request is
>> the right thing to do, and do think that this should be the default,
>> although I realize that debate is already lost.
>
>
> In a perfect academic world where there are never any errors, that approach
> is probably academically correct. In my opinion it makes correct error
> handling extremely difficult, because database errors (unique violations,
> concurrent modifications, etc) are thrown far away from the code that knows
> how to deal with them (the code that caused them in the first place).
>
> Cheers, Chris.
> --
> Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
> Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK
>
> Aptivate is a not-for-profit company registered in England and Wales
> with company number 04980791.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Jeremy Dunck
On Mon, Mar 4, 2013 at 1:34 AM, Aymeric Augustin
 wrote:
> On 4 mars 2013, at 01:07, Shai Berger  wrote:
...
>> The use of savepoints in Django apps so far has been very little, as far as 
>> I know. One point
>> I'm unsure of is the interaction of savepoints with cursors, since querysets 
>> are lazy; so the scenario
>> I'm worrirf about is, generally speaking,
>>
>> @atomic
>> def main():
>>   for obj in query_with_more_than_100_objects:
>>   try:
>>   handle(obj)
>>   except Bad:
>>   pass
>>
>> @atomic
>> def handle(obj):
>>   if good(obj):
>>   mark_good(obj)
>>   obj.save()
>>   else:
>>   raise Bad(obj)
>>
>> Will the next (database) fetch after an exception is raised get the right 
>> objects?
>>
>> My reading of the Postgresql documentation is that it will do the right 
>> thing, not so sure about
>> the other backends.
>
> Django currently doesn't support server side cursors — at least on 
> PostgreSQL, most likely on
> other databases too. Accessing the first object loads the entire queryset 
> instantly.

Actually, the queryset API tries pretty hard to avoid this:
https://github.com/django/django/blob/2cd0edaa477b327024e4007c8eaf46646dcd0f21/django/db/models/query.py#L954

It is true that by default, both psycopg and mysqldb do load the full
resultset despite using the use of .fetchmany rather than .fetchall
https://github.com/django/django/blob/0c82b1dfc48b4870e8fbcfb782ae02cdca821e1f/django/db/models/sql/compiler.py#L765
But this is not django's fault, and I have always wished more drivers
would handle streaming results better.

In any case, as far as I know, the above code (mixing 2 queries with
partial result returns on a single connection) has never worked
properly, and I think it will stay equally broken in the new world.  I
think this could be fairly easily (manually) tested by setting
GET_ITERATOR_CHUNK_SIZE  to an absurdly low size. Which is to say, +1
from me.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-06 Thread Florian Apolloner
On Wednesday, March 6, 2013 8:49:37 AM UTC+1, Ian wrote:
>
> On Tue, Mar 5, 2013 at 3:13 PM, Aymeric Augustin 
>  wrote: 
> > In the mean time, I discovered that it's impossible to implement 
> > TransactionMiddleware reliably as a middleware, because there's no 
> guarantee 
> > that process_response or process_exception will be called. 
>
> Perhaps this could be fixed.  We could add another middleware method 
> called process_finally or somesuch, that would be guaranteed to be 
> called on every request after the other methods.
>

In the long run we it might be nice to have such a thing, especially in 
light of streaming responses 

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Ian Kelly
On Tue, Mar 5, 2013 at 3:13 PM, Aymeric Augustin
 wrote:
> In the mean time, I discovered that it's impossible to implement
> TransactionMiddleware reliably as a middleware, because there's no guarantee
> that process_response or process_exception will be called.

Perhaps this could be fixed.  We could add another middleware method
called process_finally or somesuch, that would be guaranteed to be
called on every request after the other methods.  It would take a
request and response as arguments, and any return value would be
ignored.  If it raised an exception, the exception would be caught,
logged, and otherwise ignored.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Shai Berger
Hi Ian, Aymeric and all,

I stand corrected.

On Tuesday 05 March 2013, Ian Kelly wrote:
> On Tue, Mar 5, 2013 at 6:38 AM, Shai Berger  wrote:
> > 
> > When Django checks for a second record on get() it reads the second
> > record, so a repeatable-read transaction would acquire an exclusive read
> > lock on it. This makes it impossible for another transaction to delete
> > the second row before the first finishes.
> 

This statement was based in part on my previous understanding of transaction 
isolation levels, and in part on the Wikipedia article about them[1], which 
explicitly names these locks (apparently, following the SQL Stansard).

> SELECT statements without "FOR UPDATE" do not generally acquire
> exclusive locks to my knowledge, even under serializable or repeatable
> read isolation levels.  That would be a major issue for parallel
> requests if they did.  Also, I don't think there's any distinction
> between exclusive read or exclusive write locks; row-level locks are
> either just exclusive or shared.
> 
> I just verified that, at least in PostgreSQL, the logic above works in
> the serializable isolation level but can result in data loss in the
> repeatable read isolation level.

A deeper look in the wikipedia article finds a SIGMOD article[2] which analyzes 
isolation levels and, indeed, shows that such anomalies are allowed by the 
Postgres "Repeatable Read" level (which they name "snapshot isolation").

They also note in passing, that by SQL-92's definitions, this also passes for 
serializable. I note, also, that the stricter (more correct) "serializable" 
definition is new even in Postgres -- only since 9.1[3]; before that, 
Postgresql's "serializable" was really just "snapshot isolation".

> I don't have MySQL handy to test,

I do, and -- as might be expected in view of the above -- it allows the bad 
updates in repeatable-read (the default) as well as serializable isolation 
levels.

So -- I'm giving up on this. As Aymeric noted, on Postgres and Oracle there's 
no harm done, and MySql, as exemplified here, undermines correctness anyways; 
if you are writing your code naively, the change may break your correct code 
in some spot, but it is likely already broken in a dozen others.

One clarification (though I still got it wrong) --

> > Version 2: code in the first answer.
> > 
> > for row in MyModel.objects.all():
> > if MyModel.objects.filter(photo_id=row.photo_id).count() > 1:
> > row.delete()
> > 
> > To the best of my understanding, a repeatable-read transaction gets a
> > read lock on all records participating in a count, so again, nobody can
> > delete the other records before the transaction finishes.
> 
> But with any kind of autocommit on, the transaction ends after the
> row.delete(), unlocking the remaining rows and allowing them to then
> be deleted.

according to [3], in PG/Serializable the count selection would place a 
"predicate lock" on all counted records, preventing wrongful deletes even with 
(current django) autocommit on. But not in repeatable-read, and I don't have a 
PG available to verify.

I'm leaving Ian's test description in the tail of this message for reference.

Sorry for the disruption,

Shai.

[1] http://en.wikipedia.org/wiki/Isolation_(database_systems)
[2] "A Critique of ANSI SQL Isolation Levels", 
http://www.cs.umb.edu/~poneil/iso.pdf
[3] http://www.postgresql.org/docs/9.1/static/transaction-iso.html

--

> and Oracle and sqlite3 support serializable and read committed but not
> repeatable read. I started two transactions in separate psql shells.
> The starting data was:
> 
>  id | photo_id
> +--
>   1 |1
>   2 |1
> 
> The first transaction ran:
> 
> BEGIN;
> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> SELECT * FROM TEST WHERE photo_id = 1;
> DELETE FROM TEST WHERE id = 1;
> COMMIT;
> 
> The second transaction ran the same thing but deleted id 2 instead.
> The individual statements were interleaved (so both queries returned
> two rows in the SELECT before either query deleted anything).  The end
> result was that both rows were deleted.  Trying the same thing in the
> serializable isolation level results in this error when attempting to
> commit the second transaction:
> 
> ERROR:  could not serialize access due to read/write dependencies
> among transactions
> DETAIL:  Reason code: Canceled on identification as a pivot, during
> commit attempt.
> 

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Aymeric Augustin
On 3 mars 2013, at 21:37, Aymeric Augustin  
wrote:

> Part 2 of the plan is more ambitious — it's about improving the transaction 
> APIs. Here's the general idea.
> 
> I'd like to add an @atomic decorator that:
>   - Really guarantees atomicity, by raising an exception if you attempt 
> commit within an atomic block,
>   - Supports nesting within an existing transaction, with savepoints.

This is now implemented: https://github.com/django/django/pull/873

The test suite passes under SQLite and almost under PostgreSQL (two failures). 
Code reviews, docs reviews, and smoke tests would help a lot at this stage. 
Checkout my branch, run your apps, tell me what breaks!

In the mean time, I discovered that it's impossible to implement 
TransactionMiddleware reliably as a middleware, because there's no guarantee 
that process_response or process_exception will be called.

Currently, Django has a safety net: the request_finished signal triggers a 
rollback if the middleware failed.
In other words the HTTP layer is already, to some extent, coupled with the 
transaction management.

Besides, TransactionMiddleware doesn't guarantee much. For instance, using 
commit_on_success as a context manager in a view breaks it by committing the 
transaction prematurely.

So part 3 of the plan is to replace TransactionMiddleware by something based on 
transaction.atomic that actually works. For lack of a better idea, I'm 
considering deprecating the middleware and replacing it with a setting. This 
doesn't qualify as loose coupling; better ideas welcome!

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Michael Manfre
On Tue, Mar 5, 2013 at 4:42 AM, Aymeric Augustin <
aymeric.augus...@polytechnique.org> wrote:

> In practice, this would mean:
> - Define "hook points" that are a superset the old and the new APIs.
> - Re-implement the current transaction management with these "hook points"
>   (hard, because the intended behavior of the dirty flag isn't clear).
> - Implement the new transaction management  with these "hook points"
>   (hard, because that prevents using Python's context manager abstraction,
>   which maps exactly to atomic blocks and guarantees that the exit method
>   is always called).
> - Add checks to prevent invalid operations when mixing the old and new
> APIs,
>   this is extremely difficult to get right when there are many hook points.
> - Provide simple guidelines on how the old and new APIs may be mixed.
> - Ensure that the old API can be trivially removed, because the author of
> the
>   new code may not be there in two years.
>
> This looks possible, but not in the amount of time I'm able to spend
> volunteering for Django.  If someone wants to implement this, I'll put my
> branch on hold. Otherwise, I'll propose it for inclusion into core.
>
> --
> Aymeric.


Don't forget about doubling the number of tests that need to be run for
each database backend to ensure both old and new work as expected.

Supporting multiple APIs simultaneously seems like it would be very
tedious, error prone, and add additional constraints on how the overall
transaction management can be improved, assuming it's even possible to do
both simultaneously in a stable way. The superset API would be less clean
and would add more support overhead for all of the backends, not just
MySQL.

This question may have an obvious answer, but it has yet to be asked. What
is the likelihood that people will actually benefit from being able to flip
a setting to convert from old to new transaction management, instead of
just stopping on Django 1.5 until they are ready to make the transition to
whatever backwards incompatible changes exist in Django 1.6?

If there is a new superset API, then anyone using a 3rd party database
backend will be left behind until the backend gets around to implementing
the new API. To give a little perspective as a 3rd party database backend
maintainer (django-mssql), I'd probably just drop support of the old
behavior as soon as I made the changes to support the new. Even if the
superset API should theoretically just work for both, there will be more
testing, support issues, and code complexity to ensure both APIs work as
expected. I don't really have the time or motivation to support a
transition state that guarantees additional API changes (removal) in the
future.

>From reading this thread, I also feel like the arguments in favor of the
parallel old and new support has not really been justified. It's a huge
body of work for an small edge case. The arguments also seem to provide a
strong justification for revisiting the idea of moving database backends
out of the core. I don't want to sidetrack this thread, I'll start a new
thread momentarily, but I think the idea is relevant for the current
discussion and should help decide which way to proceed.

Regards,
Michael Manfre

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Ian Kelly
On Tue, Mar 5, 2013 at 6:38 AM, Shai Berger  wrote:
> I'm not sure what you mean by "unsafe".
>
> Version 1: code in the question
>
> rows = MyModel.objects.all()
> for row in rows:
> try:
> MyModel.objects.get(photo_id=row.photo_id)
> except:
> row.delete()
>
> When Django checks for a second record on get() it reads the second record, so
> a repeatable-read transaction would acquire an exclusive read lock on it. This
> makes it impossible for another transaction to delete the second row before
> the first finishes.

SELECT statements without "FOR UPDATE" do not generally acquire
exclusive locks to my knowledge, even under serializable or repeatable
read isolation levels.  That would be a major issue for parallel
requests if they did.  Also, I don't think there's any distinction
between exclusive read or exclusive write locks; row-level locks are
either just exclusive or shared.

I just verified that, at least in PostgreSQL, the logic above works in
the serializable isolation level but can result in data loss in the
repeatable read isolation level.  I don't have MySQL handy to test,
and Oracle and sqlite3 support serializable and read committed but not
repeatable read. I started two transactions in separate psql shells.
The starting data was:

 id | photo_id
+--
  1 |1
  2 |1

The first transaction ran:

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM TEST WHERE photo_id = 1;
DELETE FROM TEST WHERE id = 1;
COMMIT;

The second transaction ran the same thing but deleted id 2 instead.
The individual statements were interleaved (so both queries returned
two rows in the SELECT before either query deleted anything).  The end
result was that both rows were deleted.  Trying the same thing in the
serializable isolation level results in this error when attempting to
commit the second transaction:

ERROR:  could not serialize access due to read/write dependencies
among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during
commit attempt.

> Version 2: code in the first answer.
>
> for row in MyModel.objects.all():
> if MyModel.objects.filter(photo_id=row.photo_id).count() > 1:
> row.delete()
>
> To the best of my understanding, a repeatable-read transaction gets a read
> lock on all records participating in a count, so again, nobody can delete the
> other records before the transaction finishes.

But with any kind of autocommit on, the transaction ends after the
row.delete(), unlocking the remaining rows and allowing them to then
be deleted.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Chris Wilson

Hi Lennart,

On Tue, 5 Mar 2013, Lennart Regebro wrote:

I do agree that 99.9% of the sites committing on the end of the request 
is the right thing to do, and do think that this should be the default, 
although I realize that debate is already lost.


In a perfect academic world where there are never any errors, that 
approach is probably academically correct. In my opinion it makes correct 
error handling extremely difficult, because database errors (unique 
violations, concurrent modifications, etc) are thrown far away from the 
code that knows how to deal with them (the code that caused them in the 
first place).


Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

--
You received this message because you are subscribed to the Google Groups "Django 
developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Shai Berger
On Tuesday 05 March 2013, Aymeric Augustin wrote:
> 
> In practice, this would mean:
> - Define "hook points" that are a superset the old and the new APIs.
> - Re-implement the current transaction management with these "hook points"
>   (hard, because the intended behavior of the dirty flag isn't clear).
> - Implement the new transaction management  with these "hook points"
>   (hard, because that prevents using Python's context manager abstraction,
>   which maps exactly to atomic blocks and guarantees that the exit method
>   is always called).
> - Add checks to prevent invalid operations when mixing the old and new
> APIs, this is extremely difficult to get right when there are many hook
> points. 
> - Provide simple guidelines on how the old and new APIs may be
> mixed. 
> - Ensure that the old API can be trivially removed, because the
> author of the new code may not be there in two years.
> 

I assume "new" and "old" APIs refer to commit_on_success vs. atomic (just 
making sure I got you right)?

> This looks possible, but not in the amount of time I'm able to spend
> volunteering for Django.  If someone wants to implement this, I'll put my
> branch on hold. Otherwise, I'll propose it for inclusion into core.

I couldn't ask for more,

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Shai Berger
On Tuesday 05 March 2013, Aymeric Augustin wrote:
> On 5 mars 2013, at 12:50, Shai Berger  wrote:
> > At least the "delete duplicates" example I've pointed to shows, that it's
> > quite easy to write code that is correct now, under the defaults, with
> > MySql and will break with the change.
> 
> http://stackoverflow.com/questions/8965391/delete-duplicate-rows-in-django-
> db
> 
> In fact this code is already unsafe under MySQL with "repeatable read";
> you need "serializable" to make it safe.
> 

I'm not sure what you mean by "unsafe". 

Version 1: code in the question

rows = MyModel.objects.all()
for row in rows:
try:
MyModel.objects.get(photo_id=row.photo_id)
except:
row.delete()

When Django checks for a second record on get() it reads the second record, so 
a repeatable-read transaction would acquire an exclusive read lock on it. This 
makes it impossible for another transaction to delete the second row before 
the first finishes.

Version 2: code in the first answer.

for row in MyModel.objects.all():
if MyModel.objects.filter(photo_id=row.photo_id).count() > 1:
row.delete()

To the best of my understanding, a repeatable-read transaction gets a read 
lock on all records participating in a count, so again, nobody can delete the 
other records before the transaction finishes.

With database-level autocommit, both versions are susceptible to having the 
"other" record deleted between the read that detects it and the deletion of 
the current row (deemed a redundant copy).

> The current behavior doesn't map to any common programming idiom and
> doesn't make much sense in general. The only explanation I've found is "we
> had to commit after making changes otherwise they weren't saved".

There is no argument that the change is wanted. The only issue I'm -1 on is 
changing the default with no deprecation and no option to use existing 
behavior.

> As a consequence, code written "unwittingly" is more likely to be incorrect
> than correct.

Still not a license to break it when it is correct, IMO.

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Aymeric Augustin
On 5 mars 2013, at 07:37, Lennart Regebro  wrote:

> I do agree that 99.9% of the sites committing on the end of the
> request is the right thing to do, and do think that this should be the
> default, although I realize that debate is already lost.


Well, it's possible to enable TransactionMiddleware in the default
settings template. I haven't given much though to this idea because
it's independent from my current project.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Lennart Regebro
On Fri, Mar 1, 2013 at 4:11 PM, Alex Gaynor  wrote:
> FWIW: any sort of scheme where a transaction is kept open all request
> (including a transaction that only ever reads), will cause you serious pain
> if you're trying to do migrations on MySQL with traffic.

Wouldn't the transaction be opened when you make modifications, and
closed once the reqeust returns? Ie, in a typical read situation: No
transaction. In the typical write situation, the transaction is
started somewhere in the view, and closed once an HTTP redirect is
returned?

I do agree that 99.9% of the sites committing on the end of the
request is the right thing to do, and do think that this should be the
default, although I realize that debate is already lost.

//Lennart

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Aymeric Augustin
On 5 mars 2013, at 12:50, Shai Berger  wrote:

> At least the "delete duplicates" example I've pointed to shows, that it's 
> quite easy to write code that is correct now, under the defaults, with MySql 
> and will break with the change.

http://stackoverflow.com/questions/8965391/delete-duplicate-rows-in-django-db

In fact this code is already unsafe under MySQL with "repeatable read";
you need "serializable" to make it safe. 

The current behavior doesn't map to any common programming idiom and doesn't
make much sense in general. The only explanation I've found is "we had to
commit after making changes otherwise they weren't saved". As a consequence,
code written "unwittingly" is more likely to be incorrect than correct.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Shai Berger
On Tuesday 05 March 2013, Aymeric Augustin wrote:
> On 5 mars 2013, at 01:50, Shai Berger  wrote:
> > I want to point out how ironic this whole issue turns out.
> 
> That's a storm in a teapot. Let's keep some perspective:

I really had no intentions of disrupting the weather; however,

> - People using MyISAM aren't affected.

That's been recommended against for ages; InnoDB has been the default since 
MySQL 5.5, 12/2010. Django 1.3 was released three months later, and I think 
you can safely assume the usage of MyISAM in projects developed since 1.3 is 
negligible.

> - People using the transaction middleware aren't affected.

That's your group 3...

> - People using commit_on_success aren't affected.

... and that's your group 4.
> 
> In addition to the above, to end up with data corruption, you must:
> 
> - be aware that MySQL runs on "repeatable read" by default,
> - have checked that Django doesn't set another isolation level — this isn't
> documented,
> - be aware of Django's default transactional behavior —
> virtually no one understands the first paragraph of the transaction docs,
> - have chosen to rely on this behavior to implicitly guarantee some
> integrity between a read and a subsequent write,

No you don't; that's a key point that, for some reason, I haven't been able to 
push through. Code correctness is not dependent upon the state of mind of the 
author.

At least the "delete duplicates" example I've pointed to shows, that it's 
quite easy to write code that is correct now, under the defaults, with MySql 
and will break with the change. It is easy to do so "unwittingly", with no 
real awareness of the details involved (arguably, making it easy to write 
correct code "unwittingly" is one of the goals of frameworks like Django in 
the first place). 

That is exactly the situation that I think is likely for your groups 1&2. I 
don't think the fact that their code is correct "by chance" gives us a license 
to break it.

> - upgrade to Django 1.6 without taking into account the release notes or
> without understanding their consequences for you,

which is well within the norm for these users,

> - be running a medium- or high-traffic website where race conditions are
> likely,

No, that's what you need to end up with large-scale data corruption. I'd 
phrase that point in reverse: To be safe, you need to run a very-low traffic 
website, where concurrency virtually never happens.

> - have code that fails silently rather than raise an error on integrity
> violations.
> 
The problems caused do not necessarily create integrity violations. In the 
deletion example, they just make data go poof.

> That's technically possible, but sufficiently unlikely to be acceptable as
> a _documented_ backwards incompatibility.

I agree that this conclusion follows from your premises. It's those premises 
that I disagree with.

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Aymeric Augustin
On 5 mars 2013, at 02:28, Shai Berger  wrote:

> This is just a half-baked idea; I suspect those of you who are more well-
> versed in the current implementation of transaction management, and who put 
> some thought into this, may be able to shoot it down easily. But here goes.
> 
> Basically, transaction management reacts to user commands by doing either or 
> both of two things: Issuing database commands ("begin tran", "savepoint x", 
> "rollbak to x" etc), and changing its own internal state (e.g. marking the 
> transaction as dirty, puhsing a new transactional level on a stack). There is 
> a limited set of relevant user command classes -- explicit-enter-transaction, 
> ORM-read, begin-ORM-write, end-ORM-write, etc.

That's almost correct; I see one possible pain point. When autocommit is off,
transactions are automatically started and that makes it difficult for Django to
track what's going on. In other word, there's no explicit "begin-ORM-read"
hook in the current code.

The dirty flag theoretically carries this information, but Anssi's recent work 
in
this area show that it isn't entirely reliable nor sufficiently well defined. 
(It isn't
clear whether reads are intended to make the connection dirty; they should.)

Knowing exactly when a transaction is in progress or not is one of my reasons
for preferring autocommit.

> The selection of state-changes and transaction-commands that are executed in 
> response to user-commands is, in essence, a transaction-management policy -- 
> and there can be more than one of those in the code base, to be selected by a 
> setting. Say, transaction_bc (for "Backwards Compatible") which does 
> autocommit the way it is today, and transaction_ad (for "Autocommit at 
> Database level") which does the right thing.

To provide a transition path, both APIs must be usable in parallel, at least to
some extent. A hard switch from transaction_bc to transaction_ad would make
it very difficult to update codebases to take advantage of the new APIs,
especially if they use third-party libraries that haven't been updated yet.

> If this can be done, then I think the proper way to handle the change is:
> 
> Django 1.6 -- the default is transaction_bc, the default project template 
> selects transaction_ad. Further, transaction_bc raises a deprecation warning 
> whenever it sees an ORM-read followed by an ORM-write in an automatic 
> transaction when it knows the transaction isolation level is >= repeatable 
> read.
> 
> Django 1.7 -- the default is transaction_ad, transaction_bc still available 
> but raises deprecation warning whenever it is used (or at least whenever its 
> autocommit is used).
> 
> Django 1.8 -- transaction_bc removed.


In practice, this would mean:
- Define "hook points" that are a superset the old and the new APIs.
- Re-implement the current transaction management with these "hook points"
  (hard, because the intended behavior of the dirty flag isn't clear).
- Implement the new transaction management  with these "hook points"
  (hard, because that prevents using Python's context manager abstraction,
  which maps exactly to atomic blocks and guarantees that the exit method
  is always called).
- Add checks to prevent invalid operations when mixing the old and new APIs,
  this is extremely difficult to get right when there are many hook points.
- Provide simple guidelines on how the old and new APIs may be mixed.
- Ensure that the old API can be trivially removed, because the author of the
  new code may not be there in two years.

This looks possible, but not in the amount of time I'm able to spend
volunteering for Django.  If someone wants to implement this, I'll put my
branch on hold. Otherwise, I'll propose it for inclusion into core.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-05 Thread Aymeric Augustin
On 5 mars 2013, at 01:50, Shai Berger  wrote:

> I want to point out how ironic this whole issue turns out.

That's a storm in a teapot. Let's keep some perspective:

- People using MyISAM aren't affected.
- People using the transaction middleware aren't affected.
- People using commit_on_success aren't affected.

In addition to the above, to end up with data corruption, you must:

- be aware that MySQL runs on "repeatable read" by default,
- have checked that Django doesn't set another isolation level — this isn't 
documented,
- be aware of Django's default transactional behavior — virtually no one 
understands the first paragraph of the transaction docs,
- have chosen to rely on this behavior to implicitly guarantee some integrity 
between a read and a subsequent write,
- upgrade to Django 1.6 without taking into account the release notes or 
without understanding their consequences for you,
- be running a medium- or high-traffic website where race conditions are likely,
- have code that fails silently rather than raise an error on integrity 
violations.

That's technically possible, but sufficiently unlikely to be acceptable as a 
_documented_ backwards incompatibility.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Shai Berger
Hi Florian and everybody,

On Sunday 03 March 2013, Florian Apolloner wrote:
> On Sunday, March 3, 2013 12:27:47 AM UTC+1, Shai Berger wrote:
> > > I also believe that it beats the alternative — namely, live with the
> > > 
> > > current behavior forever.
> > 
> > I sincerely hope [...] that there's a way to implement the new behavior
> > side-by-side with the old one. There usually is.
> 
> Can you describe how an alternative in this case would look like? I
> personally can't think of any sensible one.
> 

This is just a half-baked idea; I suspect those of you who are more well-
versed in the current implementation of transaction management, and who put 
some thought into this, may be able to shoot it down easily. But here goes.

Basically, transaction management reacts to user commands by doing either or 
both of two things: Issuing database commands ("begin tran", "savepoint x", 
"rollbak to x" etc), and changing its own internal state (e.g. marking the 
transaction as dirty, puhsing a new transactional level on a stack). There is 
a limited set of relevant user command classes -- explicit-enter-transaction, 
ORM-read, begin-ORM-write, end-ORM-write, etc.

The selection of state-changes and transaction-commands that are executed in 
response to user-commands is, in essence, a transaction-management policy -- 
and there can be more than one of those in the code base, to be selected by a 
setting. Say, transaction_bc (for "Backwards Compatible") which does 
autocommit the way it is today, and transaction_ad (for "Autocommit at 
Database level") which does the right thing.

If this can be done, then I think the proper way to handle the change is:

Django 1.6 -- the default is transaction_bc, the default project template 
selects transaction_ad. Further, transaction_bc raises a deprecation warning 
whenever it sees an ORM-read followed by an ORM-write in an automatic 
transaction when it knows the transaction isolation level is >= repeatable 
read.

Django 1.7 -- the default is transaction_ad, transaction_bc still available 
but raises deprecation warning whenever it is used (or at least whenever its 
autocommit is used).

Django 1.8 -- transaction_bc removed.

Hope this makes sense,

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Shai Berger
Hi,

On Monday 04 March 2013, Aymeric Augustin wrote:
> On 4 mars 2013, at 04:04, Shai Berger  wrote:
> > you need to be sure that, in all these places, either reads don't
> > really affect consequent writes, or some constraint holds that is
> > equivalent to serializability -- otherwise, negative effect is possible.
> 
> PostgreSQL and Oracle use the "repeatable read" isolation level by default.
> They are immune to this problem, because a sequence of reads followed by a
> write has the same semantics with a transaction under "read committed" and
> without a transaction.
> 
> MySQL uses "read committed" and SQLite uses "serializable". Users of these
> databases may see a different behavior.
> 
> I'm going to add this to the list of backwards incompatibilities:
> https://github.com/aaugustin/django/commit/876fddb68dd6990e87b15e683c498e41
> f8921f14#L4R437 ("Using unsupported database features" isn't correct for
> MySQL and SQLite right now.)
> 

I want to point out how ironic this whole issue turns out.

First of all, it appears the bulk of users who may experience the subtle 
breakage discussed are users of MySql -- the database of choice for Aymeric's 
groups 1 & 2, the main declared target audience for the fix.

On the other hand, it probably serves them right -- the choice of MySql is 
usually motivated by putting less focus on database-level integrity and more 
on other factors, such as single-threaded performance and ease of deployment.

But considering this last disparaging comment, it's only MySql code that will 
be turned from correct to broken, because -- on this specific point -- it turns 
out MySql is, by default, stricter than the others about integrity.

I am speechless,

Shai.


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Christophe Pettus

On Mar 4, 2013, at 7:24 AM, Aymeric Augustin wrote:

> PostgreSQL and Oracle use the "read committed" ...

Sorry, replied too soon!

> The reasoning and the conclusion still stand.

Agreed.
--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Christophe Pettus

On Mar 4, 2013, at 5:00 AM, Aymeric Augustin wrote:

> PostgreSQL and Oracle use the "repeatable read" isolation level by default.

Without explicitly changing it, PostgreSQL's default is READ COMMITTED.  Or are 
we setting it explicitly to REPEATABLE READ in the new model?

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Aymeric Augustin
On 4 mars 2013, at 16:04, Andrey Antukh  wrote:

> If not I'am wrong, postgresql uses "read commited" by default and mysql 
> "repeatable read"

Sorry, I swapped the isolation level names when I wrote the explanation. The 
correct version is:

PostgreSQL and Oracle use the "read committed" ...

MySQL uses "repeatable read" ...

The reasoning and the conclusion still stand.

-- 
Aymeric.


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Florian Apolloner
Hi,

On Monday, March 4, 2013 2:00:03 PM UTC+1, Aymeric Augustin wrote:
>
> PostgreSQL and Oracle use the "repeatable read" isolation level by 
> default. 


According to http://www.postgresql.org/docs/9.1/static/transaction-iso.html 
PG uses "read commited" as default.
 

> MySQL uses "read committed" and SQLite uses "serializable". Users of these 
> databases 
> may see a different behavior. 
>

For InnoDB the default is "repeatable read"according 
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read

Cheers,
Florian

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Andrey Antukh
If not I'am wrong, postgresql uses "read commited" by default and mysql
"repeatable read"

http://www.postgresql.org/docs/9.1/static/transaction-iso.html -> "Read
Committed is the default isolation level in PostgreSQL. "
http://dev.mysql.com/doc/refman/5.0/en/innodb-transaction-model.html -> "In
terms of the SQL:1992 transaction isolation levels, the default InnoDB
level is REPEATABLE READ. InnoDB"

Andrey


2013/3/4 Aymeric Augustin 

> On 4 mars 2013, at 04:04, Shai Berger  wrote:
>
> > you need to be sure that, in all these places, either reads don't
> > really affect consequent writes, or some constraint holds that is
> equivalent to
> > serializability -- otherwise, negative effect is possible.
>
> PostgreSQL and Oracle use the "repeatable read" isolation level by
> default. They are
> immune to this problem, because a sequence of reads followed by a write
> has the same
> semantics with a transaction under "read committed" and without a
> transaction.
>
> MySQL uses "read committed" and SQLite uses "serializable". Users of these
> databases
> may see a different behavior.
>
> I'm going to add this to the list of backwards incompatibilities:
>
> https://github.com/aaugustin/django/commit/876fddb68dd6990e87b15e683c498e41f8921f14#L4R437
> ("Using unsupported database features" isn't correct for MySQL and SQLite
> right now.)
>
> Let me know if you think of other cases which I haven't covered!
>
> --
> Aymeric.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>


-- 
Andrey Antukh - Андрей Антух - 
http://www.niwi.be/about.html
http://www.kaleidos.net/A5694F/

"Linux is for people who hate Windows, BSD is for people who love UNIX"
"Social Engineer -> Because there is no patch for human stupidity"

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Aymeric Augustin
On 4 mars 2013, at 04:04, Shai Berger  wrote:

> you need to be sure that, in all these places, either reads don't 
> really affect consequent writes, or some constraint holds that is equivalent 
> to 
> serializability -- otherwise, negative effect is possible.

PostgreSQL and Oracle use the "repeatable read" isolation level by default. 
They are
immune to this problem, because a sequence of reads followed by a write has the 
same
semantics with a transaction under "read committed" and without a transaction.

MySQL uses "read committed" and SQLite uses "serializable". Users of these 
databases
may see a different behavior.

I'm going to add this to the list of backwards incompatibilities:
https://github.com/aaugustin/django/commit/876fddb68dd6990e87b15e683c498e41f8921f14#L4R437
("Using unsupported database features" isn't correct for MySQL and SQLite right 
now.)

Let me know if you think of other cases which I haven't covered!

-- 
Aymeric.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-04 Thread Aymeric Augustin
On 4 mars 2013, at 01:07, Shai Berger  wrote:

>> On 1 mars 2013, at 13:48, Aymeric Augustin 
>>  wrote:
>> 
>> I'd like to add an @atomic decorator that:
>>  - Really guarantees atomicity, by raising an exception if you attempt 
>> commit within an atomic block,
> 
> Fragility alert: databases commit implicitly. Most notably, Oracle and MySql 
> for every DDL command.

That a good point. I hadn't considered DDL.

> It is true that the commands that trigger this are not usually found in 
> normally-executed application
> code. However, since the proposal is all about running in autocommit mode 
> unless a transaction
> was started explicitly, it should be noted that once such a command is 
> executed, it does not
> just break the transaction in the middle, it renders everything after it 
> non-transactional.

I checked each supported database to determine whether this behavior would be 
expected.

PostgreSQL supports transactional DDL. (The PostgreSQL wiki also indicates that 
most
databases that have a third-party adapter for Django support DDL, which is good 
news:
http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis)

SQLite supports transactional DDL — its implementation of transactions with a 
file lock
guarantees that anything can be done in a transaction. The Python sqlite3 
module gets in the
way again: it will commit on DDL unless autocommit is on: 
http://bugs.python.org/issue10740.
This isn't as braindead as committing on SAVEPOINT, but quite regrettable 
nonetheless.
(I'm not weighting on these bugs because I believe that the default mode of 
sqlite3 is an
aberration that only exists to conform to PEP 249 and cannot be salvaged.)

On MySQL, Django will behave exactly like typing commands in the MySQL shell. 
This behavior
shouldn't come as a surprise. See 
http://dev.mysql.com/doc/refman/5.6/en/implicit-commit.html.

Oracle is the only database supported by Django that isn't designed for 
autocommit by default.
Its docs say: "Oracle Database implicitly commits the current transaction 
before and after every
DDL statement." This sentence is obviously written with implicit transaction 
initiation in mind.
This behavior will be surprising for someone used to Oracle's transaction 
model, but not much
more that autocommit in general. It's a logical consequence of "autocommit is 
on", "@atomic
opens a transaction", and "DDL commits the transaction".

Finally, a quick search on the web shows that it's considered bad form to rely 
on DDL to
implicitly commit transactions on databases that don't support transactional 
DDL.

> Most DDL commands executed by Django apps, 1.6 and on, should be done through 
> the upcoming
> schema alteration mechanisms -- code under core's control, so things can be 
> set up correctly 
> (that is, when such a command is executed under @atomic with a 
> non-DDL-transactional backend,
> it should raise an exception). They just need care.

Yes, it's a good idea to add this check when core gets support for DDL commands.

> Other than that, strong documentation warnings should be given around the use 
> of raw sql in transactions.

Agreed, this should be documented as a limitation of @atomic.

>>  - Supports nesting within an existing transaction, with savepoints.
> 
> The use of savepoints in Django apps so far has been very little, as far as I 
> know. One point
> I'm unsure of is the interaction of savepoints with cursors, since querysets 
> are lazy; so the scenario
> I'm worrirf about is, generally speaking,
> 
> @atomic
> def main():
>   for obj in query_with_more_than_100_objects:
>   try:
>   handle(obj)
>   except Bad:
>   pass
> 
> @atomic
> def handle(obj):
>   if good(obj):
>   mark_good(obj)
>   obj.save()
>   else:
>   raise Bad(obj)
> 
> Will the next (database) fetch after an exception is raised get the right 
> objects?
> 
> My reading of the Postgresql documentation is that it will do the right 
> thing, not so sure about
> the other backends.

Django currently doesn't support server side cursors — at least on PostgreSQL, 
most likely on
other databases too. Accessing the first object loads the entire queryset 
instantly.

I suspect it's the database's job to handle this properly (maybe by raising an 
exception). For
example, I know that SQLite is smart enough to deal properly with equivalent 
scenarios using
the C API: it delays committing the transaction until all reads are finished. I 
don't know how
other databases deal with this in general.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to 

Re: Switch to database-level autocommit

2013-03-03 Thread Shai Berger
On Sunday 03 March 2013, Jacob Kaplan-Moss wrote:
> Shai: do you have any real-world code that'd have data loss
> with this bug?

The code in sites I work on usually uses transaction middleware or 
@commit_on_success, so it would not suffer (nor benefit) from this change. So I 
tried to look for openly available code.

I found this very clear example on Stack Overflow -- I'm not sure it counts as 
"real-world code", but it definitely becomes subtly broken with database-level  
autocommit:

http://stackoverflow.com/questions/8965391/delete-duplicate-rows-in-django-db

Then I turned to somewhere I thought might be susceptible, and took a look at 
django-mptt. I wish I had more time and stamina at this point to find an actual 
smoking gun, but what I did find, was no use of transaction control: Not in 
code (the only transactional code is a call to 
transaction.commit_unless_managed at the end of the move_node operation), and 
not even in documentation (except for docstrings for context-managers which 
delay updates). The code there is pretty complex and abstract, but I'm pretty 
sure it does make decisions based on reads that affect consequent writes -- and 
I'm pretty sure that adding a commit between the former and the latter will 
make it fragile in new and interesting ways. So that is real-world code that 
I'm almost sure now has data loss possibilities, unless used in explicit 
transactions.

Which brings me to this comment:

> Looking through the tons of code I've got available
> can't reveal a single place where this chance would have any negative
> effect. OTOH, there're lots of places where it'd have a positive
> effect.
> 
Please correct me if I'm misunderstanding: For this to hold, you need lots of 
pieces of code where autocommit is on (otherwise there's no positive effect), 
and then you need to be sure that, in all these places, either reads don't 
really affect consequent writes, or some constraint holds that is equivalent to 
serializability -- otherwise, negative effect is possible. That sounds quite 
implausible.

Is the implausible really the case? Otherwise, what am I missing?

Thanks,
Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-03 Thread Shai Berger
Hi,

Here's some thoughts on possible problems with Aymeric's plan. I like
the look of the suggested transaction API, I'm raising these to help
make sure it can be realized.

On Sunday 03 March 2013, Aymeric Augustin wrote:
> On 1 mars 2013, at 13:48, Aymeric Augustin 
>  wrote:
> 
> I'd like to add an @atomic decorator that:
>   - Really guarantees atomicity, by raising an exception if you attempt 
> commit within an atomic block,

Fragility alert: databases commit implicitly. Most notably, Oracle and MySql 
for every DDL command.

It is true that the commands that trigger this are not usually found in 
normally-executed application
code. However, since the proposal is all about running in autocommit mode 
unless a transaction
was started explicitly, it should be noted that once such a command is 
executed, it does not
just break the transaction in the middle, it renders everything after it 
non-transactional.

Most DDL commands executed by Django apps, 1.6 and on, should be done through 
the upcoming
schema alteration mechanisms -- code under core's control, so things can be set 
up correctly 
(that is, when such a command is executed under @atomic with a 
non-DDL-transactional backend,
it should raise an exception). They just need care. Other than that, strong 
documentation warnings
should be given around the use of raw sql in transactions.

>   - Supports nesting within an existing transaction, with savepoints.

The use of savepoints in Django apps so far has been very little, as far as I 
know. One point
I'm unsure of is the interaction of savepoints with cursors, since querysets 
are lazy; so the scenario
I'm worrirf about is, generally speaking,

@atomic
def main():
for obj in query_with_more_than_100_objects:
try:
handle(obj)
except Bad:
pass

@atomic
def handle(obj):
if good(obj):
mark_good(obj)
obj.save()
else:
raise Bad(obj)

Will the next (database) fetch after an exception is raised get the right 
objects?

My reading of the Postgresql documentation is that it will do the right thing, 
not so sure about
the other backends.

> 
> I'm reasonably confident that this plan can work.

My concerns about autocommit notwithstanding, I hope you're right.

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-03 Thread Aymeric Augustin
On 1 mars 2013, at 13:48, Aymeric Augustin  
wrote:

> Basically, Django intends to provide autocommit by default. Rather than fight 
> the database adapter that itselfs fights the database, I propose to simply 
> turn autocommit on, and stop implicitly starting and committing transactions. 
> Explicit is better than implicit.

This is implemented in https://github.com/django/django/pull/873 and ready for 
review.

Important notes:
- Some ORM APIs aren't atomic any longer, as noted by Shai and Christophe; 
others are using a hack: `enter_transaction_management(forced=True)`. I need a 
better transaction API to fix that, see below.
- I disabled three tests that tested the internals of transaction management in 
non-autocommit mode, which no longer exists. I don't know how much I'm going to 
refactor, so I haven't rewritten them yet.

I'd like to merge to master at this point, since I've reached a consistent, if 
not final, state.

> This obviously has far-reaching consequences on transaction handling in 
> Django, but the current APIs should still work. (Fixing them is part 2 of the 
> plan.) 


Part 2 of the plan is more ambitious — it's about improving the transaction 
APIs. Here's the general idea.

Currently, the most useful but also most problematic API is the 
@commit_on_success decorator. It's confusing because it mixes two roles.

1) Switch the transaction state from "auto mode" to "managed mode" ie. disable 
autocommit.
- Nesting works as expected because transaction state is managed as a 
stack.
- This stack is a pillar of the current implementation but I'm not 
convinced that it's an useful concept in general.

2) Define the context for a transaction — but in a very fragile way.
- It doesn't guarantee to rollback all changes if an exception occurs, 
because you can commit within the block.
- Nesting breaks atomicity, because the inner block triggers a commit 
or rollback before the end of the outer block.
- To sum up, it really does what the name says, and nothing else :)

I'd like to add an @atomic decorator that:
- Really guarantees atomicity, by raising an exception if you attempt 
commit within an atomic block,
- Supports nesting within an existing transaction, with savepoints.
- In option, support "merging" with an existing transaction, for 
pathological cases where the overhead of savepoints would be excessive: 
@atomic(merge=True)
Partial rollback is very Pythonic: it works by raising an exception and 
catching it higher in the stack. Each @atomic context that's exited with an 
exception rolls back to its savepoint.

Ideally, this decorator would start a transaction but not enable "managed 
mode". The point of "managed mode" is that you can commit at any time, and the 
next query will automatically reopen a transaction. Since @atomic forbids 
committing until the end of the block, this is no longer a useful property. 

Another good reason for staying in "auto mode" is that savepoints don't work on 
SQLite in "managed mode": cursor.execute("SAVEPOINT foo") triggers a COMMIT 
because Python's sqlite3 module tries to be smart and fails miserably.

The end goal is:
- for defining transaction blocks: to deprecate @commit_on_success / 
@commit_manually and provide @atomic instead.
- for controlling transaction state: to deprecate @commit_on_success / 
@autocommit and provide connection.set_autocommit(False/True) instead — only 
usable outside @atomic
- for the TransactionMiddleware: to provide exactly the same semantics as 
@atomic, and to keep the ability to disable it with @autocommit
- to remove the stack of transaction states, which mostly exists to support the 
current decorators, once the deprecation cycle completes.

Of course the two APIs should work in parallel until the old ones are gone.

I'm reasonably confident that this plan can work.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-03 Thread Christophe Pettus

On Mar 3, 2013, at 10:13 AM, Aymeric Augustin wrote:

> In practice, the solution is probably called @xact. Applying it to each 
> public ORM function should do the trick. Therefore, I'd like to ask your 
> permission to copy it in Django. Technically speaking, this means relicensing 
> it from PostgreSQL to BSD.

Absolutely; it would be my honor.  Just contact me off-list and we can sort out 
the details.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-03 Thread Aymeric Augustin
On 3 mars 2013, at 17:09, Christophe Pettus  wrote:

> Right now, the only real example I've heard (there might be more is):
> 
> 1. The ORM generates multiple updating operations for a single API-level 
> operation.
> 2. The developer did nothing to manage their transaction model (no decorator, 
> no middleware), but,
> 3. Is relying on Django to provide a transaction in this case.
> 
> That situation does exist, but it does seem pretty edge-case-y.  Does it 
> exist in any case besides model inheritance?  If not, could we have the ORM 
> wrap those operations in a transaction in that particular case?

Yes, I plan to ensure that all public ORM APIs are atomic. I'm not sure that 
they currently are; under autocommit, they probably aren't.

I will make a proposal to improve transaction management right after I've 
finished implementing autocommit.

In practice, the solution is probably called @xact. Applying it to each public 
ORM function should do the trick. Therefore, I'd like to ask your permission to 
copy it in Django. Technically speaking, this means relicensing it from 
PostgreSQL to BSD.

Thanks,

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-03 Thread Christophe Pettus

On Mar 2, 2013, at 3:49 PM, Jacob Kaplan-Moss wrote:
> I'm with Aymeric: the current behavior is bad enough, and this is a
> big enough improvement, and the backwards-incompatibility is minor
> enough.

Right now, the only real example I've heard (there might be more is):

1. The ORM generates multiple updating operations for a single API-level 
operation.
2. The developer did nothing to manage their transaction model (no decorator, 
no middleware), but,
3. Is relying on Django to provide a transaction in this case.

That situation does exist, but it does seem pretty edge-case-y.  Does it exist 
in any case besides model inheritance?  If not, could we have the ORM wrap 
those operations in a transaction in that particular case?
--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Florian Apolloner
Hi Shai,

On Sunday, March 3, 2013 12:27:47 AM UTC+1, Shai Berger wrote:
>
> > I also believe that it beats the alternative — namely, live with the 
>
> > current behavior forever.
>   
>
> I sincerely hope that is not the only alternative; that there's a way to 
> implement the new behavior side-by-side with the old one. There usually is.
>

Can you describe how an alternative in this case would look like? I 
personally can't think of any sensible one.
 

> I don't consider backwards-compatibility an absolute, and the first of my 
> two paras which you quoted above shows just that. I do think there are 
> proper ways to introduce backwards-incompatible changes, and what you 
> suggested isn't one of them.
>

If you think there are proper ways to introduce this specific change, 
please show us. Aside from that, all of our bugfixes and new features have 
a possibility to break code. Bugfixes that break old code usually occur as 
a side effect of not seeing the issue the fix causes or being fully aware 
of the issue and considering it minor enough (that is no data-loss and only 
happening to a small fraction of users). New Features shouldn't break code, 
but they usually also touch old code, so they can break stuff in the same 
way as bugfixes.

In the case of this change I think that users with low traffic sites won't 
even notice the changes (no concurrency) and users with high traffic sites 
shouldn't see it cause they should be using autocommit and the relevant 
decorators anyways…


An immeiately-breaking change, where the breakage is hard to detect and 
> debug, and with high potential for data loss... I don't think any warning 
> in the release notes can be enough. All I can say is, -1.
>

I am still +1.

Cheers,
Florian

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Jacob Kaplan-Moss
On Sat, Mar 2, 2013 at 3:27 PM, Shai Berger  wrote:
>> I believe that the level of backwards-incompatibility described above is
>> within acceptable bounds for Django 1.6.
>
> I believe this is the core of our disagreement here.

I'm with Aymeric: the current behavior is bad enough, and this is a
big enough improvement, and the backwards-incompatibility is minor
enough. I think the data-loss possibilities Shai suggests are pretty
overblown. Shai: do you have any real-world code that'd have data loss
with this bug? Looking through the tons of code I've got available
can't reveal a single place where this chance would have any negative
effect. OTOH, there're lots of places where it'd have a positive
effect.

Jacob

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Shai Berger
Hi again,

On Sunday 03 March 2013, Aymeric Augustin wrote:
> On 2 mars 2013, at 21:46, Shai Berger  wrote:
> > The Django documentation on transactions, at the moment says this on
> > Django's
> > 
> > default behavior[0]:
> >> Django’s default behavior is to run with an open transaction which it
> >> commits automatically when any built-in, data-altering model function is
> >> called. For example, if you call model.save() or model.delete(), the
> >> change will be committed immediately.
> 
> Yes, my proposal is a backwards-incompatible change with regard to this
> sentence.
> 
> However, I believe that only a small fraction of the users of Django fully
> grok the implications of this sentence, and a fraction of this fraction
> relies on it to ensure some level of integrity.
> 

I agree with this belief about users *deliberately* relying on the 
implications for integrity. I have no idea how many people simply wrote code 
that works, and will now break. I also have no idea how many people maintain a 
system started long ago by a more knowledgeable developer.

> > What I have a harder time living with is my other point, which has not
> > been refuted: The proposed change turns code that is currently correct
> > -- liable to break if changed, but still correct as it stands -- into
> > code with "phantom bugs" that show up only in production, under specific
> > timings of concurrent requests (because reads and writes which used to
> > be in the same transaction are now on separate transactions).
> [...]
> 
> I believe that the level of backwards-incompatibility described above is
> within acceptable bounds for Django 1.6.
> 

I believe this is the core of our disagreement here.

> I also believe that it beats the alternative — namely, live with the
> current behavior forever.
> 

I sincerely hope that is not the only alternative; that there's a way to 
implement the new behavior side-by-side with the old one. There usually is.

> > If anyone can offer a way to detect the situation and warn users of the
> > change, that would make things perfectly fine. Changing the default
> > transaction behavior along the lines you suggested would be a great
> > improvement. I have a strong suspicion, though, that if the new behavior
> > is implemented as default (and with no recourse to the old behavior, to
> > boot), then that simply cannot be done.
> > 
> > I think that as suggested, the change would break the project's
> > committments to its users in a way that is much more important than the
> > performance gains, avoided idle-in-transaction errors, and effort
> > considerations, all put together. That is my objection.
> 
> Yes, I agree with this way to frame the discussion.
> 
> We reach different conclusions because I don't consider backwards
> compatibility an absolute that trumps every other consideration. It's a
> continuum. Backwards compatibility must find a balance with progress.

I don't consider backwards-compatibility an absolute, and the first of my two 
paras which you quoted above shows just that. I do think there are proper ways 
to introduce backwards-incompatible changes, and what you suggested isn't one 
of them.

In particular, per https://docs.djangoproject.com/en/dev/misc/api-stability/, 
we should have a deprecation process with warnings, unless fixing a security 
bug. Even with security issues, it was deemed preferrable to take an expedited 
deprecation process when possible (e.g. with the markdown module) rather than 
an immediate breaking change. Throwing this process to the wind over an 
improvement -- as nice as it is -- seems unwise to me.

An immeiately-breaking change, where the breakage is hard to detect and debug, 
and with high potential for data loss... I don't think any warning in the 
release notes can be enough. All I can say is, -1.

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Aymeric Augustin
On 2 mars 2013, at 21:46, Shai Berger  wrote:

> The Django documentation on transactions, at the moment says this on Django's 
> default behavior[0]:
> 
>> Django’s default behavior is to run with an open transaction which it
>> commits automatically when any built-in, data-altering model function is
>> called. For example, if you call model.save() or model.delete(), the
>> change will be committed immediately.

Yes, my proposal is a backwards-incompatible change with regard to this
sentence.

However, I believe that only a small fraction of the users of Django fully
grok the implications of this sentence, and a fraction of this fraction relies
on it to ensure some level of integrity.

> What I have a harder time living with is my other point, which has not been 
> refuted: The proposed change turns code that is currently correct -- liable 
> to 
> break if changed, but still correct as it stands -- into code with "phantom 
> bugs" that show up only in production, under specific timings of concurrent 
> requests (because reads and writes which used to be in the same transaction 
> are now on separate transactions).

I'm aware of this scenario and I'm willing to document it in the release
notes.

> It is my opinion that such changes may only 
> be made when changing major versions (i.e. Django 2.0), if at all.

The current consensus — at least within the core team — is that there will never
be a "break everything" Django 2.0.

I believe that the level of backwards-incompatibility described above is within
acceptable bounds for Django 1.6.

I also believe that it beats the alternative — namely, live with the current
behavior forever.

> If anyone can offer a way to detect the situation and warn users of the 
> change, 
> that would make things perfectly fine. Changing the default transaction 
> behavior along the lines you suggested would be a great improvement. I have a 
> strong suspicion, though, that if the new behavior is implemented as default 
> (and with no recourse to the old behavior, to boot), then that simply cannot 
> be done.
> 
> I think that as suggested, the change would break the project's committments 
> to its users in a way that is much more important than the performance gains, 
> avoided idle-in-transaction errors, and effort considerations, all put 
> together. That is my objection.


Yes, I agree with this way to frame the discussion.

We reach different conclusions because I don't consider backwards
compatibility an absolute that trumps every other consideration. It's a
continuum. Backwards compatibility must find a balance with progress.  The
same goes for security: while extremely important, it still has to find a
balance with usability — otherwise you'd never dare open port 80.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Shai Berger
On Saturday 02 March 2013, Aymeric Augustin wrote:
> On 2 mars 2013, at 16:18, Shai Berger  wrote:
> > -1. Make it easier (and cross-backend) to set db-level-autocommit on. Put
> > the setting for it in the default template for new projects. Don't
> > change existing code from "fragile" to "subtly broken".
> 
> This isn't simply about changing some default. It's about re-implementing
> all the transaction machinery.
> 
> With the scope I described, I estimate that I can complete the project in
> 120 to 200 hours. If I have to keep bug-parity with the current behavior,
> I estimate that I'm not capable of delivering it.
> 
> In other words, you're requesting that I fork Django rather than contribute
> this to the main tree.

I may have phrased my objection too strongly -- all in all, I laud your 
continuing effort to improve Django and its database layer, including this 
change. However, I must stand my ground.

The Django documentation on transactions, at the moment says this on Django's 
default behavior[0]:

> Django’s default behavior is to run with an open transaction which it
> commits automatically when any built-in, data-altering model function is
> called. For example, if you call model.save() or model.delete(), the
> change will be committed immediately.

Now, you want to call this behavior (the part where there's one open 
transaction, rather than one-per-db-operation) a bug. I find that a little odd, 
but I can live with it.

What I have a harder time living with is my other point, which has not been 
refuted: The proposed change turns code that is currently correct -- liable to 
break if changed, but still correct as it stands -- into code with "phantom 
bugs" that show up only in production, under specific timings of concurrent 
requests (because reads and writes which used to be in the same transaction 
are now on separate transactions). It is my opinion that such changes may only 
be made when changing major versions (i.e. Django 2.0), if at all. 

If anyone can offer a way to detect the situation and warn users of the change, 
that would make things perfectly fine. Changing the default transaction 
behavior along the lines you suggested would be a great improvement. I have a 
strong suspicion, though, that if the new behavior is implemented as default 
(and with no recourse to the old behavior, to boot), then that simply cannot 
be done.

I think that as suggested, the change would break the project's committments 
to its users in a way that is much more important than the performance gains, 
avoided idle-in-transaction errors, and effort considerations, all put 
together. That is my objection.

Thanks for your attention,

Shai.

[0] https://docs.djangoproject.com/en/dev/topics/db/transactions/

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Aymeric Augustin
On 2 mars 2013, at 16:18, Shai Berger  wrote:

> -1. Make it easier (and cross-backend) to set db-level-autocommit on. Put the 
> setting for it in the default template for new projects. Don't change 
> existing 
> code from "fragile" to "subtly broken".

This isn't simply about changing some default. It's about re-implementing all 
the
transaction machinery.

With the scope I described, I estimate that I can complete the project in 120 
to 200
hours. If I have to keep bug-parity with the current behavior, I estimate that 
I'm not
capable of delivering it.

In other words, you're requesting that I fork Django rather than contribute 
this to the
main tree.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Aymeric Augustin
On 2 mars 2013, at 15:50, Shai Berger  wrote:

> There is an issue you seem to be ignoring: An "ORM Write" is, in many cases, 
> more than a single query against the backend. The most obvious example is 
> models with inheritance -- trying to do these with database-level autocommit 
> means that the code writes the two parts in separate transactions. I'm very 
> -1 
> on this.

Yes, that's very important. Anssi raised this problem on IRC too.

(I also heard that Django is currently broken in this regard; I didn't check.)

> There are two alternatives I see -- one is to make sure that an ORM Write is 
> still a transaction (not database-level autocommit); the other is to 
> explicitly commit-unless-managed (or something equivalent) after every read 
> and raw-sql, as well as write.

The plan is to ensure that ORM operations that may result in multiple queries
are executed within a transaction.

It's possible to start with a dumb implementation, ie. wrap all ORM writes in a
transaction, and eventually get smarter, ie. determine if more than one query
will be emitted.

Even the dumb implementation will have better transactional properties than
the status quo.

In practice, that means adding support for savepoints to all backends, ripping
off https://github.com/xof/xact — with permission from Christophe, which I
haven't asked for yet — and wrapping ORM writes in this decorator. However,
autocommit is a pre-requisite for adding savepoints to SQLite, because of the
stdlib's braindead implementation of PEP 249.

Hence, autocommit is part 1 of the plan, and improved transaction
management is part 2.

> BTW, how do you treat select-for-update, which currently makes sense in 
> "faked 
> autocommit" mode, but will stop making sense with the suggested change?


I'm leaning towards not doing any magic and documenting this as a backwards
incompatibility. Relying on Django's current pseudo-auto-commit is very fragile:

qs = MyModel.objects.filter(…)
qs.select_for_update(…)
# what if someone introduces an unrelated ORM write here?
qs.update(…)

I hope developers already wrap their select_for_update calls and associated
writes in transaction.commit_on_success, and it's a good thing to make it
mandatory.

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Shai Berger
Thinking again, a few more points come to mind:

On Friday 01 March 2013, Aymeric Augustin wrote:
>
> Such transactions are useless and don't come for free. Relying on them to
> enforce integrity is extremely fragile — what if an external library
> starts writing to a log table in the middle of one of these implicit
> transactions? 

Then it's the external library that is at fault; it should be using a separate 
alias, with its own transactions.

> The term "footgun" comes to mind.
> 
True; but has a wider implication than you seem to give it credit for. In 
particular,

> 
> ** Backwards-compatibility **
> 
> Roughly, I'd classify Django users in four groups:
> 1 - "Transactions, how do they work?"
> 2 - "Django autocommits, doesn't it?"
> 3 - "I'm using TransactionMiddleware!"
> 4 - "I'm managing my transactions."
> 
> Groups 1 and 2 won't see the difference.

This will turn existing projects from correct (though perhaps fragile) to 
buggy, exactly in groups 1&2, where the users are less likely to understand 
the issues. The apps are correct today -- they rely on the documented 
transaction behavior, even if the authors are not fully aware of it. And they 
will turn buggy in a way that is not likely to be detected by tests, because 
it will have effects mostly under concurrent access or system crashes -- 
circumstances you don't usually have in tests.

Thus,
> 
> I don't see much point in providing an option to turn autocommit off,
> because starting a transaction is a much more explicit way to achieve the
> same effect. We usually don't provide "backwards compatibility with bugs".
> 
-1. Make it easier (and cross-backend) to set db-level-autocommit on. Put the 
setting for it in the default template for new projects. Don't change existing 
code from "fragile" to "subtly broken".

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-02 Thread Shai Berger
On Friday 01 March 2013, Aymeric Augustin wrote:
> Hello,
> 
> I'd like to improve transactions handling in Django. The first step is [to
> replace -- assumed, SB] the current emulation of autocommit with
> database-level autocommit.
> 

There is an issue you seem to be ignoring: An "ORM Write" is, in many cases, 
more than a single query against the backend. The most obvious example is 
models with inheritance -- trying to do these with database-level autocommit 
means that the code writes the two parts in separate transactions. I'm very -1 
on this.

There are two alternatives I see -- one is to make sure that an ORM Write is 
still a transaction (not database-level autocommit); the other is to 
explicitly commit-unless-managed (or something equivalent) after every read 
and raw-sql, as well as write.

BTW, how do you treat select-for-update, which currently makes sense in "faked 
autocommit" mode, but will stop making sense with the suggested change?

Shai.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Stephan Jaensch
 

> I'd like to improve transactions handling in Django. The first step is the 
> current emulation of autocommit with database-level autocommit.
>
[...] 

> I don't see much point in providing an option to turn autocommit off, 
> because starting a transaction is a much more explicit way to achieve the 
> same effect. We usually don't provide "backwards compatibility with bugs".
>
> Yay or nay?
>

A definite yay from me!

Cheers,
Stephan 

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Anssi Kääriäinen
On 1 maalis, 14:48, Aymeric Augustin
 wrote:

> Yay or nay?

+1.

I discussed this on IRC with Aymeric. I tried to find cases where this
breaks currently working code, and I was mostly unsuccessful. It seems
there could be such cases (doing raw SQL writes, commit at end, and no
ORM writes, all this outside transaction management), but it seems
such cases are rare, and worked more by luck than knowledge of how
Django's tx management works. Explicit transactions for such cases
would be good anyways.

This will be a backwards incompatible change. The current behaviour is
explicitly documented, and the new behaviour is clearly different in
some cases. However, finding cases where one actually wants the
current behaviour (ORM writes commit, raw SQL writes do not, and any
query will start a transaction) is hard. Doing a full deprecation path
for this will be ugly, so just changing this without deprecation
period seems good to me.

All in all, this might cause some pain to very small portion of users
when upgrading to 1.6. Those users will almost certainly be happy
after they have upgraded their code. Otherwise this will be a big win
for everybody.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Jacob Kaplan-Moss
Hey Aymeric -

Yes, I think this is the right thing to do: +1.

Jacob

On Fri, Mar 1, 2013 at 4:48 AM, Aymeric Augustin
 wrote:
> Hello,
>
> I'd like to improve transactions handling in Django. The first step is the
> current emulation of autocommit with database-level autocommit.
>
> ** Rationale **
>
> PEP249 says that any SQL query opens a transaction and that it's the
> application's job to commit (or rollback) changes.  This model is also
> required by the SQL standard. But it isn't very developer-friendly.
>
> To alleviate the pain, Django commits after each ORM write.  Unfortunately,
> this means that each read opens a transaction, that eventually gets
> committed after the next write, or rolled back at the end of the query. Such
> transactions are useless and don't come for free. Relying on them to enforce
> integrity is extremely fragile — what if an external library starts writing
> to a log table in the middle of one of these implicit transactions? The term
> "footgun" comes to mind.
>
> Database authors have reached the same conclusion, and most databases
> supported by Django use autocommit by default, ignoring the SQL standard. On
> PostgreSQL and SQLite, this is the only mode available.
>
> As a consequence, to implement the behavior mandated by PEP 249, the Python
> libraries (psycopg2, sqlite3, etc.) automatically start transactions. And
> then Django automatically commits them. This is not only wasteful, but also
> buggy. It's the root cause of "idle in transaction" connections on
> PostgreSQL. It's also sometimes poorly implemented: for instance, executing
> "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the
> design of the sqlite3 module. The Python bug tracker suggests it's going to
> be documented.)
>
> Basically, Django intends to provide autocommit by default. Rather than
> fight the database adapter that itselfs fights the database, I propose to
> simply turn autocommit on, and stop implicitly starting and committing
> transactions. Explicit is better than implicit.
>
> ** Implementation **
>
> All databases supported by Django provide an API to turn autocommit on:
>
> - http://initd.org/psycopg/docs/connection.html#connection.autocommit
> -
> http://docs.python.org/2/library/sqlite3#sqlite3.Connection.isolation_level
> - http://mysql-python.sourceforge.net/MySQLdb.html => conn.autocommit()
> -
> http://cx-oracle.sourceforge.net/html/connection.html#Connection.autocommit
>
> This obviously has far-reaching consequences on transaction handling in
> Django, but the current APIs should still work. (Fixing them is part 2 of
> the plan.) The general idea is that Django will explicitly start a
> transaction when entering transaction management.
>
> This will obviously impact maintainers of backend for other databases, but
> if it works on Oracle (which doesn't have autocommit — it's emulated in OCI)
> and on PostgreSQL (which enforces autocommit), I hope it can work anywhere.
>
> ** Backwards-compatibility **
>
> Roughly, I'd classify Django users in four groups:
> 1 - "Transactions, how do they work?"
> 2 - "Django autocommits, doesn't it?"
> 3 - "I'm using TransactionMiddleware!"
> 4 - "I'm managing my transactions."
>
> Groups 1 and 2 won't see the difference. There won't be any difference for
> group 3. Group 4 may be impacted by the change, but I believe most people in
> this category have autocommit turned on already — or would like to, if
> they're on MySQL — and will understand the change.
>
> I don't see much point in providing an option to turn autocommit off,
> because starting a transaction is a much more explicit way to achieve the
> same effect. We usually don't provide "backwards compatibility with bugs".
>
> Yay or nay?
>
> --
> Aymeric.
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Christophe Pettus

On Mar 1, 2013, at 4:48 AM, Aymeric Augustin wrote:
> Yay or nay?

+1.
--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Alex Gaynor
+1 from me. Here's a patch to add autocommit to MySQL:
https://github.com/django/django/pull/857

FWIW: any sort of scheme where a transaction is kept open all request
(including a transaction that only ever reads), will cause you serious pain
if you're trying to do migrations on MySQL with traffic.

Alex


On Fri, Mar 1, 2013 at 7:07 AM, Carl Meyer  wrote:

> On 03/01/2013 05:48 AM, Aymeric Augustin wrote:
> > Basically, Django intends to provide autocommit by default. Rather than
> > fight the database adapter that itselfs fights the database, I propose
> > to simply turn autocommit on, and stop implicitly starting and
> > committing transactions. Explicit is better than implicit.
>
> +1.
>
> Carl
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers?hl=en
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>


-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Carl Meyer
On 03/01/2013 05:48 AM, Aymeric Augustin wrote:
> Basically, Django intends to provide autocommit by default. Rather than
> fight the database adapter that itselfs fights the database, I propose
> to simply turn autocommit on, and stop implicitly starting and
> committing transactions. Explicit is better than implicit.

+1.

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Aymeric Augustin
On 1 mars 2013, at 14:38, Javier Guerra Giraldez  wrote:

> On Fri, Mar 1, 2013 at 7:48 AM, Aymeric Augustin
>  wrote:
>> To alleviate the pain, Django commits after each ORM write.
> 
> just curious:  why is it so?   i can't think of any reason not to tie
> transaction to the request/response cycle by default. (ie what
> TransactionMiddleware does).

TransactionMiddleware makes efficient connection pooling nearly
impossible.

> this is the default on a few other frameworks i use, and has already
> saved my behinds more than once.


I agree that it's a good default for small-to-medium sites.

This is somewhere in a later part of my plan :)

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Javier Guerra Giraldez
On Fri, Mar 1, 2013 at 7:48 AM, Aymeric Augustin
 wrote:
> To alleviate the pain, Django commits after each ORM write.

just curious:  why is it so?   i can't think of any reason not to tie
transaction to the request/response cycle by default. (ie what
TransactionMiddleware does).

this is the default on a few other frameworks i use, and has already
saved my behinds more than once.


-- 
Javier

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Florian Apolloner
Yay (+1) and preferably ASAP to get it tested by users right now.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: Switch to database-level autocommit

2013-03-01 Thread Donald Stufft
On Friday, March 1, 2013 at 7:48 AM, Aymeric Augustin wrote:
> Hello,
>  
> I'd like to improve transactions handling in Django. The first step is the 
> current emulation of autocommit with database-level autocommit.
>  
> ** Rationale **
>  
> PEP249 says that any SQL query opens a transaction and that it's the 
> application's job to commit (or rollback) changes.  This model is also 
> required by the SQL standard. But it isn't very developer-friendly.
>  
> To alleviate the pain, Django commits after each ORM write.  Unfortunately, 
> this means that each read opens a transaction, that eventually gets committed 
> after the next write, or rolled back at the end of the query. Such 
> transactions are useless and don't come for free. Relying on them to enforce 
> integrity is extremely fragile — what if an external library starts writing 
> to a log table in the middle of one of these implicit transactions? The term 
> "footgun" comes to mind.
>  
> Database authors have reached the same conclusion, and most databases 
> supported by Django use autocommit by default, ignoring the SQL standard. On 
> PostgreSQL and SQLite, this is the only mode available.
>  
> As a consequence, to implement the behavior mandated by PEP 249, the Python 
> libraries (psycopg2, sqlite3, etc.) automatically start transactions. And 
> then Django automatically commits them. This is not only wasteful, but also 
> buggy. It's the root cause of "idle in transaction" connections on 
> PostgreSQL. It's also sometimes poorly implemented: for instance, executing 
> "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the 
> design of the sqlite3 module. The Python bug tracker suggests it's going to 
> be documented.)
>  
> Basically, Django intends to provide autocommit by default. Rather than fight 
> the database adapter that itselfs fights the database, I propose to simply 
> turn autocommit on, and stop implicitly starting and committing transactions. 
> Explicit is better than implicit.
>  
> ** Implementation **
>  
> All databases supported by Django provide an API to turn autocommit on:
>  
> - http://initd.org/psycopg/docs/connection.html#connection.autocommit
> - http://docs.python.org/2/library/sqlite3#sqlite3.Connection.isolation_level
> - http://mysql-python.sourceforge.net/MySQLdb.html => conn.autocommit()
> - http://cx-oracle.sourceforge.net/html/connection.html#Connection.autocommit
>  
> This obviously has far-reaching consequences on transaction handling in 
> Django, but the current APIs should still work. (Fixing them is part 2 of the 
> plan.) The general idea is that Django will explicitly start a transaction 
> when entering transaction management.
>  
> This will obviously impact maintainers of backend for other databases, but if 
> it works on Oracle (which doesn't have autocommit — it's emulated in OCI) and 
> on PostgreSQL (which enforces autocommit), I hope it can work anywhere.
>  
> ** Backwards-compatibility **
>  
> Roughly, I'd classify Django users in four groups:
> 1 - "Transactions, how do they work?"
> 2 - "Django autocommits, doesn't it?"
> 3 - "I'm using TransactionMiddleware!"
> 4 - "I'm managing my transactions."
>  
> Groups 1 and 2 won't see the difference. There won't be any difference for 
> group 3. Group 4 may be impacted by the change, but I believe most people in 
> this category have autocommit turned on already — or would like to, if 
> they're on MySQL — and will understand the change.
>  
> I don't see much point in providing an option to turn autocommit off, because 
> starting a transaction is a much more explicit way to achieve the same 
> effect. We usually don't provide "backwards compatibility with bugs".
>  
> Yay or nay?
+1  
>  
> --  
> Aymeric.
>  
>  
>  
> --  
> You received this message because you are subscribed to the Google Groups 
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com 
> (mailto:django-developers+unsubscr...@googlegroups.com).
> To post to this group, send email to django-developers@googlegroups.com 
> (mailto:django-developers@googlegroups.com).
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>   
>   

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Switch to database-level autocommit

2013-03-01 Thread Aymeric Augustin
Hello,

I'd like to improve transactions handling in Django. The first step is the 
current emulation of autocommit with database-level autocommit.

** Rationale **

PEP249 says that any SQL query opens a transaction and that it's the 
application's job to commit (or rollback) changes.  This model is also required 
by the SQL standard. But it isn't very developer-friendly.

To alleviate the pain, Django commits after each ORM write.  Unfortunately, 
this means that each read opens a transaction, that eventually gets committed 
after the next write, or rolled back at the end of the query. Such transactions 
are useless and don't come for free. Relying on them to enforce integrity is 
extremely fragile — what if an external library starts writing to a log table 
in the middle of one of these implicit transactions? The term "footgun" comes 
to mind.

Database authors have reached the same conclusion, and most databases supported 
by Django use autocommit by default, ignoring the SQL standard. On PostgreSQL 
and SQLite, this is the only mode available.

As a consequence, to implement the behavior mandated by PEP 249, the Python 
libraries (psycopg2, sqlite3, etc.) automatically start transactions. And then 
Django automatically commits them. This is not only wasteful, but also buggy. 
It's the root cause of "idle in transaction" connections on PostgreSQL. It's 
also sometimes poorly implemented: for instance, executing "SAVEPOINT …" on 
SQLite commits implicitly. (It's arguably a bug in the design of the sqlite3 
module. The Python bug tracker suggests it's going to be documented.)

Basically, Django intends to provide autocommit by default. Rather than fight 
the database adapter that itselfs fights the database, I propose to simply turn 
autocommit on, and stop implicitly starting and committing transactions. 
Explicit is better than implicit.

** Implementation **

All databases supported by Django provide an API to turn autocommit on:

- http://initd.org/psycopg/docs/connection.html#connection.autocommit
- http://docs.python.org/2/library/sqlite3#sqlite3.Connection.isolation_level
- http://mysql-python.sourceforge.net/MySQLdb.html => conn.autocommit()
- http://cx-oracle.sourceforge.net/html/connection.html#Connection.autocommit

This obviously has far-reaching consequences on transaction handling in Django, 
but the current APIs should still work. (Fixing them is part 2 of the plan.) 
The general idea is that Django will explicitly start a transaction when 
entering transaction management.

This will obviously impact maintainers of backend for other databases, but if 
it works on Oracle (which doesn't have autocommit — it's emulated in OCI) and 
on PostgreSQL (which enforces autocommit), I hope it can work anywhere.

** Backwards-compatibility **

Roughly, I'd classify Django users in four groups:
1 - "Transactions, how do they work?"
2 - "Django autocommits, doesn't it?"
3 - "I'm using TransactionMiddleware!"
4 - "I'm managing my transactions."

Groups 1 and 2 won't see the difference. There won't be any difference for 
group 3. Group 4 may be impacted by the change, but I believe most people in 
this category have autocommit turned on already — or would like to, if they're 
on MySQL — and will understand the change.

I don't see much point in providing an option to turn autocommit off, because 
starting a transaction is a much more explicit way to achieve the same effect. 
We usually don't provide "backwards compatibility with bugs".

Yay or nay?

-- 
Aymeric.



-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.