Re: [Django] #11402: exists() method on QuerySets

Django Thu, 22 Oct 2009 06:05:27 -0700

#11402: exists() method on QuerySets
---------------------------------------------------+------------------------
          Reporter:  Alex                          |         Owner:  nobody
            Status:  new                           |     Milestone:        
         Component:  Database layer (models, ORM)  |       Version:  1.0   
        Resolution:                                |      Keywords:        
             Stage:  Accepted                      |     Has_patch:  0     
        Needs_docs:  0                             |   Needs_tests:  0     
Needs_better_patch:  0                             |  
---------------------------------------------------+------------------------
Comment (by lukeplant):


 I agree that 'any' is better than 'exists' - it matches the Python
 builtin.

 To answer someone's objection that this should be an optimization in
 `QuerySet.__nonzero__`:

 `__nonzero__` should '''not''' do this optimization trick:

  * because it should be consistent with `__len__`, which simply forces
 evaluation of the !QuerySet
  * because it shouldn't second-guess the user.

 We did actually have this discussion about `QuerySet.__len__()`, way back
 (I can't find it), and with hindsight I'm sure we came to the right
 conclusion.

 Consider the following two bits of code (ignoring bugs for now):

 1) Take len() of a queryset, then use its data
 {{{
 #!python
 options = Option.objects.filter(bar=baz)
 choice = None
 if len(options) > 1:
     print "Options: " + ", ".join(opt.name for opt in options)
 else:
     choice = options[0]
 }}}


 2) Take len() of a queryset, then discard its data
 {{{
 #!python
 options = Option.objects.filter(bar=baz)
 if len(options) > 1:
     print "You've got more than 1!"
 else:
     print "You've got 1 or less!"
 }}}

 In `QuerySet.__len__`, it's impossible to guess which of the two the user is doing.
 So if `__len__` does a .count() as an optimization, sometimes it will be a
 pessimization, causing an extra DB hit compared to just evaluating the
 query. Exactly the same case can be made for `__nonzero__`.
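
 To make the hit counting concrete, here is a rough sketch using a toy
 stand-in for a !QuerySet. Everything in it is hypothetical:
 `FakeQuerySet`, its `clever_len` flag and the `hits` counter are invented
 for illustration and are not Django code. It only assumes that a full
 evaluation is cached after the first fetch, and that a count is a
 separate query.

```python
class FakeQuerySet:
    """Toy stand-in for a QuerySet that counts simulated DB hits (not Django code)."""

    def __init__(self, rows, clever_len=False):
        self._rows = list(rows)       # pretend these rows live in the database
        self._cache = None            # filled by the first full evaluation
        self._clever_len = clever_len
        self.hits = 0                 # number of simulated queries so far

    def _fetch_all(self):
        # Full evaluation: one query the first time, cached afterwards.
        if self._cache is None:
            self.hits += 1
            self._cache = list(self._rows)
        return self._cache

    def count(self):
        # Explicit count: answered from the cache if we have one,
        # otherwise a separate COUNT-style query that fetches no rows.
        if self._cache is not None:
            return len(self._cache)
        self.hits += 1
        return len(self._rows)

    def __len__(self):
        if self._clever_len:
            return self.count()           # the "optimization" under discussion
        return len(self._fetch_all())     # simply force evaluation

    def __iter__(self):
        return iter(self._fetch_all())


def snippet_1(options):
    """Take len() of the queryset, then use its data."""
    if len(options) > 1:
        names = [name for name in options]
        return "Options: " + ", ".join(names)
    return None


def snippet_2(options):
    """Take len() of the queryset, then discard its data."""
    return len(options) > 1
```

 With three rows, snippet 1 costs one hit against the plain version and
 two against the 'clever' one; snippet 2 costs one hit either way, the
 only difference being that the clever hit is a count rather than a full
 fetch.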

 Explicit is better than implicit.  We provide `QuerySet.count()` if the
 user '''knows''' that they only want the count, and `QuerySet.any()` if
 the user '''knows''' that they only want that. If `__len__` and
 `__nonzero__` tried to be 'clever', then implementing snippet 1 in an
 efficient way gets harder - you have to wrap with `list()`.  And snippet 1
 is exactly the way that many templates will be written:
 {{{
   {% if basketitems %}
      You've got {{ basketitems|length }} item(s) in your basket:
      {% for item in basketitems %}
        ...
      {% endfor %}
   {% endif %}
 }}}

 If we made `__nonzero__` do the `any()` automatically, and similarly
 `__len__`, it would be very hard to avoid having 3 DB hits from within the
 above template.  But the other way around, it's easy to get optimal
 behaviour without having paid any attention to performance. And template
 authors should not have to worry about this, but view authors should, and
 we should give them the tools to be able to do it explicitly and simply.
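
 The template case can be sketched the same way. This is an illustration
 only: `BasketItems`, its `clever` flag and `render_basket` are made-up
 names standing in for a !QuerySet and for the template above, and the
 sketch is written for Python 3, where `__nonzero__` is spelled
 `__bool__`.

```python
class BasketItems:
    """Toy queryset-like object that counts simulated DB hits (not Django code)."""

    def __init__(self, rows, clever=False):
        self._rows = list(rows)   # pretend these rows live in the database
        self._cache = None        # filled by the first full evaluation
        self._clever = clever
        self.hits = 0

    def _fetch_all(self):
        # Full evaluation: one query the first time, cached afterwards.
        if self._cache is None:
            self.hits += 1
            self._cache = list(self._rows)
        return self._cache

    def __bool__(self):
        if self._clever and self._cache is None:
            self.hits += 1                # separate existence query
            return bool(self._rows)
        return bool(self._fetch_all())    # simply force evaluation

    def __len__(self):
        if self._clever and self._cache is None:
            self.hits += 1                # separate count query
            return len(self._rows)
        return len(self._fetch_all())     # simply force evaluation

    def __iter__(self):
        return iter(self._fetch_all())


def render_basket(basketitems):
    """Mimic the template: an `if`, a length lookup, then a loop."""
    lines = []
    if basketitems:                                   # {% if basketitems %}
        lines.append("You've got %d item(s) in your basket:" % len(basketitems))
        for item in basketitems:                      # {% for item in basketitems %}
            lines.append("  " + item)
    return "\n".join(lines)
```

 Rendering a non-empty basket costs one hit with `clever=False` and three
 with `clever=True`, since the existence check, the count and the loop
 each go to the database separately.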

 Using .count() and .any() isn't so 'pure', but the point is that our
 abstraction is not perfect (because we are worrying about optimisation),
 and we should manage the leak as best we can.  The best way is simply to
 add this documentation:

 >    bool(), len() and iter() force evaluation of the !QuerySet; use
 >    .any() or .count() if you know you don't need all the data.

-- 
Ticket URL: <http://code.djangoproject.com/ticket/11402#comment:5>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
