Hi Tom, (You said in a follow-up this was intended for django-users, but personally I think it's on-topic for -developers, so I'm replying here.)
On 07/11/2013 02:54 PM, Tom Christie wrote: > I've noted that the generic view implementations use the private > `_clone` method when returning the queryset attribute > <https://github.com/django/django/blob/master/django/views/generic/detail.py#L72>. > > Presumably the need for that is something to do with the potential for > querysets to be cached or otherwise incorrectly stateful if this cloning > isn't performed. What's confusing me is that it's not at all obvious > *exactly* what the implications of making (or failing to make) this call > are. > > Furthermore if it *is* something that's strictly required in this > circumstance then do we need to be documenting whatever behavior is > being triggered, so that developers writing their own class based views > don't make the (mistake?) of simply returning/using a queryset attribute > without having first cloned it? > > For example, is the following incorrect? > > class SomeBaseGenericView(View): > queryset = None > > def get_queryset(self): > """ > Simply return `.queryset` by default. Subclasses may > override this behavior. > """ > return self.queryset > > If so, under what set of conditions can it fail, and is it possible to > unit test for the failing behavior? I don't use CBVs much, so I haven't run into this personally or tested it, but I believe that that code is indeed wrong. Here's the scenario: 1. A subclass defines the "queryset" class attr with an actual QuerySet. 2. This queryset is then evaluated (i.e. iterated over) by some caller of ``get_queryset``. This populates its result cache. 3. Any future request iterating over that queryset will now get the cached result instead of executing a new query. Since the queryset is a class attribute, and the class is a persistent object at module level, the queryset object is persistent from request to request, and the result data for that queryset will never change in that server process. Cloning the queryset on every request before anyone uses it fixes this problem, because only the per-request clones will get their result-caches populated, never the original class-attribute queryset. (Also if somehow the original queryset were evaluated, cloning a queryset doesn't copy the result-cache anyway). (In your above example, if it so happened that every caller of get_queryset did further filtering on the queryset before evaluating it, that filtering would implicitly clone the queryset, as every call to .filter() does, thus also preventing this problem.) It's a special case of the same problem that can occur anytime you define a queryset at module-import-time scope. Generally my rule of thumb for avoiding this problem is "don't ever do that", but since CBVs encourage breaking that rule with the "queryset" class attribute, they have to take care to clone that queryset in get_queryset. > I've dug into the source, but the `_clone` method isn't documented, nor > can I find anything terribly helpful related to queryset cloning after > googling around for a while. I think you may be right that this should be documented for authors of their own CBVs who take the generic ones as an example, though I'll leave a final decision on that to someone more involved in the CBV code and docs. That would imply making qs._clone() a publicly documented method. The issue would be that we'd probably want to make qs.clone() an alias for it to avoid documenting an underscore-prefixed method as public API, and that is potentially backwards-incompatible for anyone who has defined their own .clone() method on a custom QuerySet subclass. The alternative would be to recommend the use of qs.all(), which is in fact already an alias for qs._clone(). In that case perhaps the CBVs should be changed to use .all() as well, so as to avoid using private API. Carl -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. For more options, visit https://groups.google.com/groups/opt_out.