mod_python.publisher iterables and content_type broken
------------------------------------------------------

         Key: MODPYTHON-97
         URL: http://issues.apache.org/jira/browse/MODPYTHON-97
     Project: mod_python
        Type: Bug
  Components: publisher  
    Versions: 3.2    
    Reporter: Graham Dumpleton


In 3.2, mod_python.publisher was modified so that if it encountered an 
interable it would recursively iterate over the items and publish each with the 
result being concatenated.

FWIW, I personally didn't like this as I saw it potentially changing the 
behaviour of existing code, although perhaps in contrived cases or for test 
code only. I saw that this sort of behaviour should have been managed by the 
user by explicit use of a wrapper class instead, rather than it being magic. 
End of ramble. :-)

Regardless of my concerns, the behaviour that was added is broken. 
Specifically, mod_python.publisher is setting the content type based on the 
content of the first item returned from the iterable. For example, consider the 
following:

index = [
  '<html><body><p>',
  1000 * "X",
  '</p></body></html>',
]

When published, this is resulting in the content type being set to 'text/plain' 
and not 'text/html'. In part this is perhaps caused by the fact that the 
content type check is now performed by looking for a trailing '</html>' in the 
content whereas previously it would look for a leading '<html>'. This was 
changed because all the HTML prologue that can appear before '<html>' would 
often throw out this check with the HTML not being automatically being 
detected. Thus at the time it was thought that looking for the trailing 
'</html>' would be more reliable. It ain't going to help to go back to using a 
leading '<html>' check though as the first item may only contain the prologue 
and not '<html>'.

These checks are only going to work for iterables if the results of publishing 
of each item were added to the end of a list of strings, rather than being 
written back immediately using req.write(). Once all that has been returned by 
the iterable is obtained, this can all be joined back together and then the 
HTML check done.

Joining all the separate items returned from the iterable back together defeats 
the purpose of what this feature was about in the first place and may result in 
huge in memory objects needing to be created to hold the combined result just 
so the HTML check can be done.

The only way to avoid the problem is for the content type to be set explicitly 
by the user before the iterable is processed. This is a bit tricky as it is 
mod_python.publisher which is automagically doing this. The best you can do is 
something like:

class SetContentType:
  def __init__(self,content_type):
    self.__content_type = content_type
  def __call__(self,req):
    req.content_type = self.__content_type
    return ""

index = [
  SetContentType('text/html'),
  '<html><body><p>',
  1000 * "X",
  '</p></body></html>',
]

Once you start doing this, the user may as well have provided their own 
published function in the first place that set the content type and manually 
iterated over items and wrote them to req.write(). This could also be managed 
by a user specified wrapper class which is how I saw this as preferably being 
done in the first place. Ie.,

class PublishIterable:
  def __init__(self,value,content_type):
    self.__value = value
    self.__content_type = content_type
  def __call__(self,req):
    req.content_type = self.__content_type
    for item in self.__value:
      req.write(item)

_values = [
  '<html><body><p>',
  1000 * "X",
  '</p></body></html>',
]

index = PublishIterable(_values,'text/html')

Personally I believe this automagic publishing of iterables should be removed 
from mod_python.publisher. You might still provide a special function/wrapper 
that works like PublisherIterable but handles recursive structures and callable 
objects in the process, but I feel it should be up to the user to make a 
conscious effort to use it and mod_python.publisher shouldn't assume that it 
should process any iterable in this way automatically.



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Reply via email to