On Tue, Feb 16, 2010 at 5:16 AM, Forest Bond <for...@alittletooquiet.net> wrote:
> Hi,
>
> Django allows an iterator to be passed as response content when instantiating 
> an
> HttpResponse.  However, doing so causes problems with the following classes 
> and
> functions:
>
> UpdateCacheMiddleware:
>  Caches the response object using the configured cache backend, but most
>  iterators cannot be pickled.
> CommonMiddleware:
>  Calculates a value for the ETag header, causing the iterator to be drained
>  prematurely.
> GzipMiddleware:
>  Calculates the response content length, causing the iterator to be drained
>  prematurely.
> ConditionalGetMiddleware
>  Calculates the response content length and sets the Content-Length header,
>  causing the iterator to be drained prematurely.
> patch_response_headers
>  Calculates a value for the ETag header, causing the iterator to be drained
>  prematurely.
>
> Some of these problems are discussed in the following tickets:
>
> #6027: FileWrapper iterator drained by GzipMiddleware before content can be 
> returned
> #6527: A bug in HttpResponse with iterators
> #12214: never_cache decorator breaks HttpResponse with iterator content
>
> Attached is a proof-of-concept patch illustrating how I'd like to tackle this
> issue.
>
> Roughly, my approach is as follows:
>
> * Introduce a ResponseContent class with subclasses MutableResponseContent and
>  IteratorResponseContent to limit operations on the response content depending
>  on the type of the value provided (string type or iterator).
> * Forbid premature draining of the content iterator via response.content by 
> only
>  evaluating the content iterator if accessed via iter(response) and raising an
>  exception if it is accessed via response.content.
> * Modify middleware/view decorator code that accesses response.content to 
> check
>  response.content_obj.is_readable() before trying to read the content.
>
> This approach should be backwards compatible, with the exception that if a
> middleware or view decorator does inappropriately access response.content, a
> TypeError will be raised.  Previously, the content iterator would be 
> prematurely
> drained and an empty response would be returned.  Since the previous behavior
> was not usable, it seems unlikely anyone would be depending on it.
>
> I have a few questions:
>
> * Does this approach make sense to others?  If so, I'll add a few tests and 
> look
>  at how the docs might need updating.

I think I understand where you're coming from, but I'm not immediately
convinced it's the right approach.  In particular, I'm not convinced
we can lock down the ability to iterate over responses more than once
in the way you have done.

I acknowledge that this fixes the problem (in the sense that it makes
the errors caused by multiple iteration go away), but I'm not sure
it's the right solution. Disabling Etags, content-length coding,
gziped content and so on seems like throwing the baby out with the
bathwater.

My immediate reaction is that we should be trying to capture and
preserve the process of evaluating the iterator, not limit the ability
to iterate. That is - the first attempt to evaluate the iterator
should store the content, and that evaluated content should be made
available by anything else that needs to iterate over the content.

Of course, the simplest way to do this would be to just evaluate the
iterator as soon as it is provided -- which kind of negates any
benefit of using an iterator in the first place. So -- I suppose the
bigger question that needs to be asked is what exactly is the use case
for an iterable response? I mean, I understand the general benefit of
using iterators and generators instead of rolled out lists, but what
is the use case in a HTTP context?

> * Is this type of change too invasive for 1.2?

I'm inclined to say yes. I know this is addressing a number of
specific tickets, so it isn't strictly a feature addition, but those
issues have existed for a long time, and the change you're proposing
has a conceptually wide footprint - I'd rather see changes like this
treated like a feature, so they get the benefit of review through
multiple prereleases, etc.

> * Should I create a "master" ticket to deal with this issue and continue
>  development work there?  The other tickets focus on specific symptoms of this
>  problem rather than the root cause more generally.

This would be a good idea. Given that there is a specific course of
action on the table, it's worth having a ticket to track the evolution
of the idea. By way of comparison - this is exactly what Ben Firshman
did with class-based Syndication views in ticket #12403; the ticket
addressed a dozen other tickets, but provided a single place to
coordinate review of the specific solution.

Make sure you reference any other existing tickets, but don't close
those tickets as duplicates - that way, if we find an alternate
approach, we can close this new ticket as wontfix without losing the
original problem reports.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Reply via email to