On Thu, Feb 23, 2012 at 10:26 PM, Michael Hudson-Doyle
<[email protected]> wrote:
> On Thu, 23 Feb 2012 14:49:16 -0600, Andy Doan <[email protected]> wrote:
>> I just hit something interesting in Django and thought I'd share. I was
>> playing with some data from a queryset. I was somewhat aware it's not
>> your ordinary Python list. However, I was surprised to see this issue:

You reasoned based on a broken assumption.

What you must know is that almost all of Django's QuerySet methods are
_lazy_: they only return a copy of the QuerySet object with the extra
expression/filter applied. The exception is the small set of
operations that actually hit the database and return data. This
includes .count(), .get(), len(), __iter__(), __getitem__() with an
integer index, and a few others. Note that .values() and
.values_list() are lazy as well; they return another QuerySet.

If you initially started with results = Model.objects.all() then, as
Michael has already explained, each time results[i] is evaluated
Django runs a separate SQL query (which in this case can return an
arbitrary element, as no ordering is specified).
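
The difference between the two loop styles can be sketched without
Django at all. The example below (Python 3, standard sqlite3 module)
uses a made-up "result" table and a hypothetical run() helper that
records each statement. It assumes, as suggested in the quoted message
below, that results[i] maps to a LIMIT 1 OFFSET i query. An ORDER BY is
added here so the output is reproducible; without one the database is
free to return rows in a different order on every query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, measurement REAL)")
conn.executemany("INSERT INTO result (measurement) VALUES (?)",
                 [(1.5,), (2.5,), (3.5,)])

executed = []  # every SQL statement sent to the database


def run(sql, params=()):
    """Hypothetical helper: execute a statement and record it."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# Pattern 1: index inside a loop. One COUNT query plus one
# LIMIT/OFFSET query per index.
count = run("SELECT COUNT(*) FROM result")[0][0]
by_index = [
    run("SELECT measurement FROM result ORDER BY id LIMIT 1 OFFSET ?",
        (i,))[0][0]
    for i in range(count)
]
queries_by_index = len(executed)

# Pattern 2: fetch everything once and let enumerate() supply the index.
executed.clear()
by_enumerate = [
    (i, row[0])
    for i, row in enumerate(run("SELECT measurement FROM result ORDER BY id"))
]
queries_by_enumerate = len(executed)
```

For N rows the first pattern issues N + 1 queries, the second issues
one, and only the second sees a single consistent result set.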

>>
>> I had a query that was returning TestResult objects.
>
> From what happens next I guess the query didn't have an ORDER BY?
>
>> When I iterated over the list like:
>>
>>   for r in results:
>>      print r.measurement
>>
>> I got the objects I expected. However, at the time I was coding I needed
>> to know the loop index, so I did something like:
>>
>>   for i in xrange(results.count()):
>>       print results[i].measurement
>>
>> This example acts like it works, but upon looking at the data I realized
>> I got incorrect results. It seems like results[0] and results[1] are
>> always the same, but I haven't dug enough to prove that's always the case.
>>
>> Anyway, I had a feeling using the xrange approach was wrong to begin
>> with, but it turns out to be actually wrong without telling you.
>
> If there's a bug here, it's that Django lets you write this.
> "results[i]" appears to translate to this sort of query:
>
> select * from ... where ... limit 1 offset $i
>
> So the for loop you wrote is executing this results.count() times.
>
> The thing is, if you execute the query multiple times, there is no
> guarantee that the ordering as considered by the limit & offset will be
> the same, so you won't necessarily get all the objects once in the
> for loop.
>
> Even if you did ORDER BY the query, it's still horribly inefficient.
> You probably wanted to write:
>
> for i, ob in enumerate(results):
>    ...
>
> instead :-)
>
> I think it's arguably a bug that Django lets you issue offset/limit
> queries on unordered result sets.  I can't imagine when it would be the
> right thing to do.

It's probably used in the implementation of .get()

Best regards
ZK

_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
