Russell Keith-Magee created LIBCLOUD-233:
--------------------------------------------

             Summary: Atmos storage driver doesn't correctly encode path names
                 Key: LIBCLOUD-233
                 URL: https://issues.apache.org/jira/browse/LIBCLOUD-233
             Project: Libcloud
          Issue Type: Bug
          Components: Storage
    Affects Versions: 0.10.1
         Environment: Python 2.7.1
            Reporter: Russell Keith-Magee
            Priority: Critical


If you use the Atmos storage driver, and you attempt to stream the upload of an 
object, and either your container name or your object name is a unicode string, 
the presence of these unicode strings will cause the HTTP message body to be 
converted into a unicode string. 

However, file content is provided as a byte string; if the file content 
contains binary data, httplib will try to convert this file content into 
unicode using the default ASCII codec, yielding decoding errors. 
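
As a rough illustration of that conversion (a sketch, not code taken from 
libcloud or httplib): Python 2 decodes a byte string as ASCII implicitly when 
it is concatenated with a unicode string. The snippet below spells that decode 
out explicitly so it behaves the same on any Python version. The sample bytes 
are made up: a short ASCII header followed by binary data.

```python
# Made-up sample body: an ASCII header followed by non-ASCII (binary) bytes.
body = b'%PDF-1.3\n%\xc4\xe5'

try:
    # Python 2 performs this ASCII decode implicitly when a unicode string
    # is concatenated with a byte string; here it is written out explicitly.
    body.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The decode fails at the first byte outside the ASCII range.
print(error)
```

The exception's `start` attribute gives the offset of the first byte that 
could not be decoded.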

For example, if you try to stream upload a PDF whose name is stored as 
u'foo.pdf', you'll get a message something like:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 10: 
ordinal not in range(128)

(Position 10 is where the binary content in a PDF starts, after the 
"%PDF-1.3\n%" header)

The behaviour of httplib in the presence of unicode content is a known issue 
(http://bugs.python.org/issue12398). All path tokens should be encoded as ASCII 
before being passed to httplib to prevent this problem from occurring.
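
A minimal sketch of what that encoding step might look like follows. The helper 
name `encode_path_token` and the container name are hypothetical and used for 
illustration only; this is not the actual libcloud fix.

```python
import sys

# The unicode text type differs between Python 2 and Python 3.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only defined on Python 2)


def encode_path_token(token, encoding='ascii'):
    """Hypothetical helper: return `token` as a byte string.

    Unicode tokens are encoded; byte strings are returned unchanged.
    """
    if isinstance(token, text_type):
        return token.encode(encoding)
    return token


# Example names; 'my-container' is an assumption for illustration.
container = encode_path_token(u'my-container')
name = encode_path_token(u'foo.pdf')

# Every piece is now a byte string, so the joined path is one as well.
path = b'/'.join([b'', container, name])
```

With every token encoded up front, the request path stays a byte string and 
can be combined with binary file content without an implicit decode. Note that 
encoding as ASCII raises UnicodeEncodeError for names containing non-ASCII 
characters, so such names would need separate handling.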

As far as I can make out, this problem only exists under Python 2.7 -- I've 
observed it on Python 2.7.1 and Python 2.7.3. Python 2.6.7 is unaffected. 

