#16937: Prefetch related objects
-------------------------------------+-------------------------------------
     Reporter:  lukeplant            |                    Owner:  nobody
         Type:  New feature          |                   Status:  new
    Component:  Database layer       |                  Version:  1.3
  (models, ORM)                      |               Resolution:
     Severity:  Normal               |             Triage Stage:  Accepted
     Keywords:                       |      Needs documentation:  0
    Has patch:  0                    |  Patch needs improvement:  1
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by lukeplant):

 Replying to [comment:20 akaariai]:
 > I am doing some preliminary studying work for the extensibility patch,
 and while doing that I have some more nitpicking about small and really
 small things...
 >
 >  - Why are prefetch_related_objects and _prefetch_one_level functions
 instead of methods of Query?

 I thought it would be cleaner to have them outside the main class since it
 needs very little of the internals of the `QuerySet` - it is not involved
 in any of the main business of `QuerySet` - and it would otherwise be
 adding ~130 line to a class that is already over 900 lines long. I have
 found it is often easier to pull things out of a class earlier rather than
 later, and it leads to cleaner design, since you can't just go and access
 'self'. The `_prefetch_one_level` function, especially, needs no access to
 'self'.  This is similar to the `get_cached_row` helper which doesn't need
 to be a method, even though it can't be re-used outside of that class.

 We also have the fact that if you add something to 'self' in order to
 communicate between prefetch_related_objects and sub-procedures, which can
 be a very tempting path, you've potentially got big memory leaks. We can
 see clearly that prefetch_related_objects only adds to the result_cache
 instances (and further down the tree), and not anywhere else, because it
 has no access to the `QuerySet` 'self'.


 >  - Assuming there is a good reason for this, prefetch_related_objects
 argument "fields" is not well named. related_lookups or something like
 that might be a better name. If the prefetch_related_objects would be a
 method, there would not be a need to pass this argument.

 Good call, fixed.

 >  - In query.prefetch_related, the comment
 > {{{
 > If prefetch_related() is called with no arguments the list is cleared.
 > }}}
 > is stale. It should say that if called with None as the argument.

 Thanks, fixed.

 >  - It would be more consistent to surrounding code to write the
 .prefetch_related as:
 > {{{
 >         clone = self._clone()
 >         if fields == (None,):
 >             clone._prefetch_related = set()
 >         else:
 >             clone._prefetch_related =
 self._prefetch_related.union(set(fields))
 >         return clone
 > }}}
 > And it would be defensive to copy the _prefetch_related in .clone()
 instead of relying the other code only using .union() or other new set
 returning methods. Or make the set an immutable set.

 Actually, passing kwargs to `_clone()` is used in other places, where the
 change is a simply attribute that immutable. So due to your second point,
 and the fact that we actually need a list, I'm changing it, thanks. I've
 also added tests that demonstrate/test the ordering issue.

 I've pulled in your dict-based join, thank you.

 I've also fixed it so that '.count()' and '.exists()' work as expected
 (and any other `QuerySet` methods that can answer from the result_cache,
 like indexing), by moving changes to `relmanager.get_query_set()` instead
 of `relmanager.all()`.

 Regarding the expense of creating `QuerySets` which are not, I think it
 should be possible to address this with a some kind of `QuerySetProxy`
 which has a result cache, and possibly a few other methods like `count()`
 and `exists()`, and for everything else lazily creates the real `QuerySet`
 and delegates to it. It would probably involve adding a 'lazy' kwarg to
 `relmanager.get_query_set` and creating a subclass of `LazyObject`.
 However, I would actually prefer for this to go in without that
 optimization, so that in any subsequent debugging efforts are not hampered
 by an extra source of complication.

 Finally, I would create a separate ticket for your enhancements, which
 sound great, but I haven't had time to check them out.

 Thanks again for all your work on this.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/16937#comment:22>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en.

Reply via email to