Hi Tom,

(You said in a follow-up this was intended for django-users, but
personally I think it's on-topic for -developers, so I'm replying here.)

On 07/11/2013 02:54 PM, Tom Christie wrote:
> I've noted that the generic view implementations use the private
> `_clone` method when returning the queryset attribute
> <https://github.com/django/django/blob/master/django/views/generic/detail.py#L72>.
> 
> Presumably the need for that is something to do with the potential for
> querysets to be cached or otherwise incorrectly stateful if this cloning
> isn't performed.  What's confusing me is that it's not at all obvious
> *exactly* what the implications of making (or failing to make) this call
> are.
> 
> Furthermore if it *is* something that's strictly required in this
> circumstance then do we need to be documenting whatever behavior is
> being triggered, so that developers writing their own class based views
> don't make the (mistake?) of simply returning/using a queryset attribute
> without having first cloned it?
> 
> For example, is the following incorrect?
> 
>     class SomeBaseGenericView(View):
>         queryset = None
> 
>         def get_queryset(self):
>             """
>             Simply return `.queryset` by default. Subclasses may
> override this behavior.
>             """
>             return self.queryset
> 
> If so, under what set of conditions can it fail, and is it possible to
> unit test for the failing behavior?

I don't use CBVs much, so I haven't run into this personally or tested
it, but I believe that that code is indeed wrong. Here's the scenario:

1. A subclass defines the "queryset" class attr with an actual QuerySet.

2. This queryset is then evaluated (i.e. iterated over) by some caller
of ``get_queryset``. This populates its result cache.

3. Any future request iterating over that queryset will now get the
cached result instead of executing a new query. Since the queryset is a
class attribute, and the class is a persistent object at module level,
the queryset object is persistent from request to request, and the
result data for that queryset will never change in that server process.

Cloning the queryset on every request before anyone uses it fixes this
problem, because only the per-request clones will get their
result-caches populated, never the original class-attribute queryset.
(Also if somehow the original queryset were evaluated, cloning a
queryset doesn't copy the result-cache anyway).

(In your above example, if it so happened that every caller of
get_queryset did further filtering on the queryset before evaluating it,
that filtering would implicitly clone the queryset, as every call to
.filter() does, thus also preventing this problem.)

It's a special case of the same problem that can occur anytime you
define a queryset at module-import-time scope. Generally my rule of
thumb for avoiding this problem is "don't ever do that", but since CBVs
encourage breaking that rule with the "queryset" class attribute, they
have to take care to clone that queryset in get_queryset.

> I've dug into the source, but the `_clone` method isn't documented, nor
> can I find anything terribly helpful related to queryset cloning after
> googling around for a while.

I think you may be right that this should be documented for authors of
their own CBVs who take the generic ones as an example, though I'll
leave a final decision on that to someone more involved in the CBV code
and docs.

That would imply making qs._clone() a publicly documented method. The
issue would be that we'd probably want to make qs.clone() an alias for
it to avoid documenting an underscore-prefixed method as public API, and
that is potentially backwards-incompatible for anyone who has defined
their own .clone() method on a custom QuerySet subclass.

The alternative would be to recommend the use of qs.all(), which is in
fact already an alias for qs._clone(). In that case perhaps the CBVs
should be changed to use .all() as well, so as to avoid using private API.

Carl

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to