Jason R. Coombs <jar...@jaraco.com> added the comment: I've created a repo to continue this work. I've integrated David's patch (thanks).
It's not obvious to me what the encoding should be. Python and the tarfile module can accept unicode filenames; it seems that only the gzip part of tarfile fails if a unicode name is passed. Encoding to 'utf-8' or the default file system encoding doesn't seem right, as the encoded characters end up getting stored in the gzip archive itself. Additionally, encoding as 'utf-8' would cause the file to be created with a utf-8 filename, which would be undesirable.

So in the current repo, I've added a check that tries to convert the filename to ASCII. If it can be converted to ASCII, it is converted and passed through to tarfile. This should address the majority of users who have encountered this issue so far. Those who wish to use non-ASCII characters in project names or versions will have to use Python 3 or wait until #13639 is fixed.

Please review the enclosed patch. Since one test fails (and is known to fail), should it be omitted? Can it remain but be marked as "expected to fail"?

----------
hgrepos: +96

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11638>
_______________________________________
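
For reviewers, the ASCII check described in the comment above could look roughly like the following. This is only a sketch and not the code in the patch: the function name `ascii_or_original` is hypothetical, and returning the name unchanged when it cannot be encoded is an assumption, since the comment does not say what happens in that case.

```python
def ascii_or_original(name):
    """Return `name` encoded as ASCII when possible, else `name` unchanged.

    Hypothetical helper for illustration; not taken from the patch.
    """
    try:
        # Succeeds only if every character is in the ASCII range.
        return name.encode("ascii")
    except UnicodeEncodeError:
        # Assumption: non-ASCII names are passed through untouched,
        # so they would still hit the failure tracked in #13639.
        return name
```

With this sketch, a plain name such as `u"project-1.0"` is converted before it reaches tarfile, while a name containing non-ASCII characters is left as it was.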