On Thu, Feb 23, 2012 at 10:26 PM, Michael Hudson-Doyle <[email protected]> wrote:
> On Thu, 23 Feb 2012 14:49:16 -0600, Andy Doan <[email protected]> wrote:
>> I just hit something interesting in Django and thought I'd share. I was
>> playing with some data from a queryset. I was somewhat aware it's not
>> your ordinary Python list. However, I was surprised to see this issue:

You reasoned from a broken assumption. What you must know is that nearly
all of Django's QuerySet methods are _lazy_: you only get a copy of the
QuerySet object with the extra expression/filter applied. The only
exception is the small subset of operations that actually "evaluate" the
QuerySet and return data. This includes .count(), .get(), __iter__(),
__getitem__() with an integer index, and a few others. (.values() and
.values_list() are lazy like filter(); they only change the shape of the
rows.)

If you initially started with results = Model.objects.all() then, as
Michael has already explained, each time results[i] is evaluated Django
does a SQL query, unless the QuerySet has already been evaluated and its
rows cached. In this case the query can return an arbitrary element
because no ordering is specified.

>>
>> I had a query that was returning TestResult objects.
>
> From what happens next I guess the query didn't have an ORDER BY?
>
>> When I iterated over the list like:
>>
>>   for r in results:
>>       print r.measurement
>>
>> I got the objects I expected. However, at the time I was coding I needed
>> to know the loop index, so I did something like:
>>
>>   for i in xrange(results.count()):
>>       print results[i].measurement
>>
>> This example acts like it works, but upon looking at the data I realized
>> I got incorrect results. It seems like results[0] and results[1] are
>> always the same, but I haven't dug enough to prove that's always the case.
>>
>> Anyway, I had a feeling using the xrange approach was wrong to begin
>> with, but it turns out to be actually wrong without telling you.
>
> If there's a bug here, it's that Django lets you write this.
> "results[i]" appears to translate to this sort of query:
>
>   select * from ... where ... limit 1 offset $i
>
> So the for loop you wrote is executing this results.count() times.
>
> The thing is, if you execute the query multiple times, there is no
> guarantee that the ordering as considered by the limit & offset will be
> the same, so you won't necessarily get all the objects once in the
> for loop.
>
> Even if you did ORDER BY the query, it's still horribly inefficient.
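
To put a number on "horribly inefficient", here is a minimal sketch that
uses the standard-library sqlite3 module as a stand-in for the ORM. It is
not Django code: the test_result table, the run() helper and the sample
values are made up for illustration, and it is written for Python 3. It
issues the kind of statements Michael describes above for each of the two
loops and counts them.

```python
import sqlite3

# Hypothetical in-memory table standing in for the TestResult model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_result (id INTEGER PRIMARY KEY, measurement REAL)"
)
conn.executemany(
    "INSERT INTO test_result (measurement) VALUES (?)",
    [(1.5,), (2.5,), (3.5,), (4.5,)],
)

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it (one database round trip)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Pattern 1: roughly what indexing inside a counted loop amounts to.
# One COUNT query, then one LIMIT 1 OFFSET i query per row.
query_count = 0
(total,) = run("SELECT COUNT(*) FROM test_result")[0]
by_index = []
for i in range(total):
    row = run(
        "SELECT measurement FROM test_result ORDER BY id LIMIT 1 OFFSET ?",
        (i,),
    )[0]
    by_index.append((i, row[0]))
indexed_queries = query_count

# Pattern 2: roughly what enumerate() over the result set amounts to.
# A single query; the index comes from Python, not from the database.
query_count = 0
rows = run("SELECT measurement FROM test_result ORDER BY id")
by_enumerate = [(i, row[0]) for i, row in enumerate(rows)]
enumerate_queries = query_count

print(indexed_queries, enumerate_queries)  # prints: 5 1
```

With four rows the indexed loop costs five statements against one for
enumerate(), and the gap grows linearly with the number of rows. The
explicit ORDER BY id is what makes the two loops agree here; without it
the database is free to order the rows differently for each statement.
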
> You probably wanted to write:
>
>   for i, ob in enumerate(results):
>       ...
>
> instead :-)
>
> I think it's arguably a bug that Django lets you issue offset/limit
> queries on unordered result sets. I can't imagine when it would be the
> right thing to do.

It's probably used in the implementation of .get()

Best regards
ZK

_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
