New submission from Atle Pedersen <atle.peder...@gmail.com>:

I've made a short program to traverse a file tree and print the file names.

    import os

    # path: directory to traverse (defined elsewhere in the script)
    for root, dirs, files in os.walk(path):
        for f in files:
            hexa = ' '.join(["%02X"%ord(x) for x in f])
            print('file is',hexa,f)

This fails with the following file:

    file is 67 72 DCE5 6B 61 6C 6C 65 6E 2E 6A 70 67 2E 68 74 6D 6C
    Traceback (most recent call last):
      File "/home/atle/bin/findpictures.py", line 16, in <module>
        print('file is',hexa,f)
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udce5' in position 2: surrogates not allowed

I don't really understand the issue, but this works with Python 2 and fails with 3.1.4 (gentoo: dev-lang/python-3.1.4-r3).

The same code using Python 2.7.2 gives:

    ('file is', '67 72 E5 6B 61 6C 6C 65 6E 2E 6A 70 67 2E 68 74 6D 6C', 'gr\xe5kallen.jpg.html')

----------
components: Unicode
messages: 150684
nosy: Atle.Pedersen, ezio.melotti
priority: normal
severity: normal
status: open
title: print fails on unicode '\udce5' surrogates not allowed
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13717>
_______________________________________