On Thu, Aug 25, 2011 at 7:23 AM, Michel30 <[email protected]> wrote:
> Thanks Tom that clarifies a lot, learning every day.
>
> My filesystem is ext4, encoding is irrelevant here right?
> So, I guess the best thing to do is to convert my database into utf-8
> using a method as described here:
> http://www.bothernomore.com/2008/12/16/character-encoding-hell/
>
> That way I'm consistently using utf-8.
> Would this also be backwards compatible with my legacy app? I don't
> see it using any encoding specific.
>
> Thanks,
> Michel
>

Encoding is always relevant. Your filesystem will treat the filename
as just a series of bytes, but what those bytes are depends upon the
character encoding of the application that created the files.

I'm not sure how this will display in email, but here is an example: a
file is created with a latin1-encoded name, and then an attempt to open
it using the UTF-8 encoding of the same name fails:

>>> import os
>>> filename=u'£££'
>>> fp=open(filename.encode('latin1'), 'w+')
>>> fp.close()
>>> fp=open(filename.encode('utf-8'), 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: '\xc2\xa3\xc2\xa3\xc2\xa3'
>>> os.listdir('.')
['\xa3\xa3\xa3']

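\xa3 is the encoding of the '£' symbol in latin1; \xc2\xa3 is the
encoding of the same symbol in UTF-8. The two byte sequences differ, so
as far as the filesystem is concerned they are two different filenames.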
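
For anyone reading this on Python 3, here is a rough sketch of the same
experiment using bytes paths. It assumes a POSIX filesystem that stores
names as raw bytes (ext4 does); the names demo, latin1_name and
utf8_name are just my own for illustration.

```python
import os
import tempfile

name = "£££"
latin1_name = name.encode("latin-1")  # one byte per character
utf8_name = name.encode("utf-8")      # two bytes per character


def demo():
    """Create a file under the latin-1 bytes, then look it up under the UTF-8 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_b = os.fsencode(tmp)  # work with bytes paths throughout
        # Create an empty file whose name is the latin-1 byte sequence.
        with open(os.path.join(tmp_b, latin1_name), "wb"):
            pass
        # Check for the UTF-8 byte sequence: a different name to the filesystem.
        found_utf8 = os.path.exists(os.path.join(tmp_b, utf8_name))
        # Passing bytes to listdir returns the raw names as bytes.
        listing = os.listdir(tmp_b)
    return found_utf8, listing
```

On a filesystem like that, I would expect demo() to report that the
UTF-8 name does not exist and that the directory holds only the latin1
name.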

Cheers

Tom
