On 09/05/2012 07:31 AM, eryksun wrote:
> On Wed, Sep 5, 2012 at 5:42 AM, Ray Jones <crawlz...@gmail.com> wrote:
>> I have directory names that contain Russian characters, Romanian
>> characters, French characters, et al. When I search for a file using
>> glob.glob(), I end up with stuff like \x93\x8c\xd1 in place of the
>> directory names. I thought simply identifying them as Unicode would
>> clear that up. Nope. Now I have stuff like \u0456\u0439\u043e.
>
> This is just an FYI in case you were manually decoding. Since glob
> calls os.listdir(dirname), you can get Unicode output if you call it
> with a Unicode arg:
>
>     >>> t = u"\u0456\u0439\u043e"
>     >>> open(t, 'w').close()
>
>     >>> import glob
>
>     >>> glob.glob('*')  # UTF-8 output
>     ['\xd1\x96\xd0\xb9\xd0\xbe']
>
>     >>> glob.glob(u'*')
>     [u'\u0456\u0439\u043e']

Yes, I played around with that some... in my lack of understanding, I
thought that adding the 'u' would pop the characters out at me the way
they should appear. Silly me.... ;)

> Regarding subprocess.Popen, just use Unicode -- at least on a POSIX
> system. Popen calls an exec function, such as posix.execv, which
> handles encoding Unicode arguments to the file system encoding.
>
> On Windows, the _subprocess C extension in 2.x is limited to calling
> CreateProcessA with char* 8-bit strings. So Unicode characters beyond
> ASCII (the default encoding) trigger an encoding error.

subprocess.call(['dolphin', '/my_home/testdir/\u044c\u043e\u0432'])
Dolphin's error message: 'The file or folder
/my_home/testdir/\u044c\u043e\u0432 does not exist'

But if I copy the characters as seen by Bash's shell and paste them into
my subprocess.call(), Dolphin recognizes the directory just fine.

So is Dolphin unicode-dead, or am I missing something?


Ray
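
One thing that may be worth checking in the subprocess.call() quoted above: the path literal has no u prefix. In Python 2.x a plain '...' literal is a byte string, and \uXXXX is not an escape there, so the backslashes would be passed through unchanged. That would match the path echoed in Dolphin's error message. The sketch below only illustrates the difference; to_fs_bytes is a hypothetical helper, not part of subprocess, and the UTF-8 encoding is an assumption about the file system encoding.

```python
import sys

# Python 2 treats a plain '...' literal as a byte string, where \uXXXX is
# not an escape. The raw bytes literal below reproduces that result on any
# Python version: the backslashes are still in the path.
as_typed = br'/my_home/testdir/\u044c\u043e\u0432'

# With the u prefix, each \uXXXX becomes one Cyrillic character.
intended = u'/my_home/testdir/\u044c\u043e\u0432'


def to_fs_bytes(path, encoding=None):
    """Hypothetical helper: the bytes a POSIX exec call would be handed."""
    return path.encode(encoding or sys.getfilesystemencoding())


print(repr(as_typed))                        # backslash-u sequences, verbatim
print(repr(to_fs_bytes(intended, 'utf-8')))  # UTF-8 bytes of the real name

# The call itself would then presumably be (not run here):
# subprocess.call(['dolphin', intended])
```

If the directory opens once the u prefix is added, Dolphin is not at fault; it was simply handed a path containing literal backslashes.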