On Fri, Aug 22, 2014 at 11:53:01AM -0700, Chris Barker wrote:

> The point is that if you are reading a file name from the system, and then
> passing it back to the system, then you can treat it as just bytes -- who
> cares? And if you add the byte value of 47 thing, then you can even do
> basic path manipulations. But once you want to do other things with your
> file name, then you need to know the encoding. And it is very, very common
> for users to need to do other things with filenames, and they almost always
> want them as text that they can read and understand.
>
> Python3 supports this case very well. But it does indeed make it hard to
> work with filenames when you don't know the encoding they are in.

Just "not knowing" is not sufficient. In that case, you'll likely get a
Unicode string containing mojibake:

    # I write a file name using UTF-8 on my system:
    filename = 'music by Наӥв.txt'.encode('utf-8')
    # You try to use it assuming ISO-8859-7 (Greek)
    filename.decode('iso-8859-7')
    => 'music by Π\x9dΠ°Σ₯Π².txt'

which, even though it looks wrong, still lets you refer to the file
(provided you then encode back to bytes with ISO-8859-7 again). This
won't always be the case; sometimes the encoding you guess will be wrong.

When I started this email, I was going to say that the actual problem
was with byte file names that cannot be decoded into Unicode using the
system encoding (typically UTF-8 on Linux systems). But I've had
difficulty demonstrating that it actually is a problem. I started with a
byte sequence which is invalid UTF-8, namely:

    b'ZZ\xdb\xdf\xfa\xff'

created a file with that name, and then tried listing it with
os.listdir. Even in Python 3.1 it worked fine. I was able to list the
directory and open the file, so I'm not entirely sure where the problem
lies exactly. Can somebody demonstrate the failure mode?


--
Steven
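
A minimal sketch of one place the trouble can show up, assuming the file system encoding is UTF-8 and that os.listdir decodes names with the "surrogateescape" error handler (PEP 383, new in Python 3.1). That handler would explain why listing and opening the file works: the undecodable bytes survive as lone surrogates and round-trip back to the same bytes. The helper name can_encode_strictly is made up for this illustration. It avoids touching the file system by decoding the bytes directly.

```python
# The byte sequence from the message above; it is not valid UTF-8.
raw = b'ZZ\xdb\xdf\xfa\xff'

# Assumption: this mirrors what os.listdir() does on a UTF-8 POSIX system.
# Each undecodable byte 0xNN becomes the lone surrogate U+DCNN.
name = raw.decode('utf-8', 'surrogateescape')
print(ascii(name))  # 'ZZ\udcdb\udcdf\udcfa\udcff'

# Passing the name back to the system works, because encoding with the
# same error handler restores the original bytes exactly.
assert name.encode('utf-8', 'surrogateescape') == raw


def can_encode_strictly(text, encoding='utf-8'):
    """Return True if text can be encoded with the default strict handler.

    Hypothetical helper for this sketch, not part of any library.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Treating the name as ordinary text (writing it to a UTF-8 log file,
# printing it to a UTF-8 terminal) needs a strict encode, which fails.
print(can_encode_strictly(name))                 # False
print(can_encode_strictly('music by Наӥв.txt'))  # True

# The mojibake case still round-trips through the wrongly guessed codec.
original = 'music by Наӥв.txt'.encode('utf-8')
garbled = original.decode('iso-8859-7')
assert garbled.encode('iso-8859-7') == original
```

So under these assumptions the name is fine as an opaque handle, and it only fails once it is treated as real text.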