Re: How do I encode and decode this data to write to a file?

Dave Angel Mon, 29 Apr 2013 04:50:26 -0700

On 04/29/2013 05:47 AM, [email protected] wrote:

A couple of generic comments: your email program made a mess of the traceback by appending each source line to the location information.

Please mention your Python version & OS. Apparently you're running 2.7 on Linux or similar.

I am debugging some code that creates a static HTML gallery from a
directory hierarchy full of images. It's this package:-
     https://pypi.python.org/pypi/Gallery2.py/2.0


It's basically working and does pretty much what I want so I'm happy to
put some effort into it and fix things.

The problem I'm currently chasing is that it can't cope with directory
names that have accented characters in them, it fails when it tries to
write the HTML that creates the page with the thumbnails on.

The code that's failing is:-

         raw = os.path.join(directory, self.getNameNoExtension()) + ".html"
         file = open(raw, "w")
         file.write("".join(html).encode('utf-8'))

You can't encode byte data; it's already encoded. So you're forcing the Python system to implicitly decode it (using the ASCII codec) before letting you encode it to utf-8. If you think it's already in utf-8, then omit the encode() call there.
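Here's a minimal sketch of what that implicit step amounts to. The sample bytes are made up for illustration (I'm assuming the page contains the utf-8 encoding of é, which is 0xc3 0xa9); the explicit decode("ascii") stands in for what 2.7 does behind your back.

```python
# Hypothetical sample: byte data that already holds utf-8 for an e-acute.
data = b"<h1>caf\xc3\xa9</h1>"

# Stand-in for the implicit step: decoding with the ASCII codec,
# which rejects any byte of 0x80 or above.
try:
    data.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Same kind of error as in the traceback, at this sample's offset.
print(ascii_error)

# Decoding with the codec the bytes are really in works fine,
# and encoding again gives back exactly the bytes we started with.
text = data.decode("utf-8")
assert text.encode("utf-8") == data
```

If the round trip gives back the same bytes, the encode() call was never doing anything useful for data that is already utf-8.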

Additionally, you can debug things with some simple print statements, at least if you decompose your 3-function line so you can get at the intermediate data. Split the line into three parts:

    temp1 = "".join(html)          # temp1 is byte data
    temp2 = temp1.decode()         # temp2 is unicode data (default codec is ASCII)
    temp3 = temp2.encode("utf-8")  # temp3 is byte data again
    file.write(temp3)

Now, you'll presumably get the error on the second line, so examine the bytes around byte 783. Make sure it's really in utf-8, and if it is, then skip the decode and the encode. If it's not, then Andrew's advice is pertinent.
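Something like the following would do for the examination. Both helper names are my own invention, and the temp1 here is a fabricated stand-in that merely puts an é at offset 783; substitute your real joined data.

```python
def show_bytes_around(data, pos, width=8):
    """Return the slice of byte data surrounding offset pos."""
    start = max(0, pos - width)
    return data[start:pos + width]


def is_valid_utf8(data):
    """Return True if the byte data decodes cleanly as utf-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Fabricated stand-in for temp1: filler, then utf-8 for e-acute at offset 783.
temp1 = b"x" * 783 + b"\xc3\xa9 rest of the page"

# repr() shows non-ASCII bytes as \x escapes, so 0xc3 is easy to spot.
print(repr(show_bytes_around(temp1, 783)))
print(is_valid_utf8(temp1))
```

A 0xc3 followed by 0xa9 is what utf-8 produces for é. A lone 0xe9 would instead suggest Latin-1, in which case the data is not utf-8 and skipping the encode would not be enough.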

I would also look at the variable html. It's a list, but what are the types of the elements in it?
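A quick way to answer that is to tally the element types. This is only a sketch; element_types is a made-up helper and the list below is invented sample data, not what the gallery code actually builds.

```python
from collections import Counter


def element_types(items):
    """Count how many elements of each type the list holds."""
    return Counter(type(item).__name__ for item in items)


# Invented sample mixing byte strings and unicode strings.
html = [b"<html>", u"<body>", b"caf\xc3\xa9"]

# On 2.7 the names reported are 'str' and 'unicode';
# on 3.x they are 'bytes' and 'str'.
print(element_types(html))
```

If more than one type shows up, the list mixes byte and unicode strings, and that is worth sorting out before worrying about the write() call.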

         file.close()

The variable html is a list containing the lines of HTML to write to the
file.  It fails when it contains accented characters (an é in this
case).  Here's the traceback:-

Traceback (most recent call last):
   File "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 41, 
in run self._recurse()
   File "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 272, in 
_recurse os.path.walk(self.props["sourcedir"], self.processDir, None)
   File "/usr/lib/python2.7/posixpath.py", line 246, in walk walk(name, func, arg) File 
"/usr/lib/python2.7/posixpath.py", line 246, in walk walk(name, func, arg)
   File "/usr/lib/python2.7/posixpath.py", line 246, in walk walk(name, func, arg) File 
"/usr/lib/python2.7/posixpath.py", line 238, in walk func(arg, top, names)
   File "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 
263, in processDir self.createGallery()
   File "/usr/local/lib/python2.7/dist-packages/gallery/galleries.py", line 
215, in createGallery self.picturemanager.createPictureHTMLs(self.footer)
   File "/usr/local/lib/python2.7/dist-packages/gallery/picturemanager.py", 
line 84, in createPictureHTMLs curPic.createPictureHTML(self.galleryDirectory, 
self.getStylesheet(), self.fullsize, footer)
   File "/usr/local/lib/python2.7/dist-packages/gallery/picture.py", line 361, in 
createPictureHTML file.write("".join(html).encode('utf-8')) UnicodeDecodeError: 'ascii' 
codec can't decode byte 0xc3 in position 783: ordinal not in range(128)



If I understand correctly the encode() is saying that it can't
understand the data in the html because there's a character 0xc3 in it.
I *think* this means that the é is encoded in UTF-8 already in the
incoming data stream (should be as my system is wholly UTF-8 as far as I
know and I created the directory name).

So how do I change the code so I don't get the error?  Do I just
decode() the data first and then encode() it?



--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
