Augie Fackler added the comment:

On Oct 8, 2013, at 5:24 PM, STINNER Victor <rep...@bugs.python.org> wrote:

>
> STINNER Victor added the comment:
>
> 2013/10/8 Augie Fackler <rep...@bugs.python.org>:
>> sys.stdout.write('%(state)s %(path)s\n' % {'state': 'M', 'path':
>> 'some/filesystem/path'})
>>
>> except we don't know the encoding of the filesystem path (Hi unix!) so we
>> have to treat the whole thing as opaque bytes.
>
> You are doing it wrong. In Python 3, you "should" store filenames as
> Unicode (str type). If Python fails to decode a filename, undecodable
> bytes are stored as surrogate characters (see the PEP 383).

No, I'm not. In Mercurial, all end-user data is OPAQUE BYTES, and must
remain that way. We're not able to change either our on-disk data format
OR our stdout format, even to support a newer version of Python. I don't
know the encoding of the filename's bytes, but I _must_ faithfully
reproduce them exactly as they are or I'll break tools like make(1) and
patch(1). Similarly, if a file goes from ISO-8859-1 to UTF-8, I have to
emit a diff that has some ISO bytes and some UTF bytes - it's not in
*any* valid encoding. Changing that is a showstopper regression.

> The Unicode type became natural in Python 3, as byte string (old "str"
> type) was natural in Python 2.
>
> sys.stdout.write() expects a Unicode string, not a byte string.

Ouch. Is there any way to write things to stderr and stdout without
decoding and hopelessly breaking user data?

> Does it mean that Mercurial is moving to Python 3? Cool :-)

Not likely, honestly. I tackle this when I've got some spare cycles and
my ability to handle pain is high. As it stands, I have the test-runner
barely working, but it's making wrong assumptions to get there. The best
estimate is that it's a year of work to upgrade to Python 3.
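
For illustration only, here is a minimal sketch of the two approaches discussed above: writing raw bytes through a text stream's underlying binary buffer, and round-tripping undecodable bytes through str with the "surrogateescape" error handler from PEP 383. The names format_status_line, write_bytes and raw_path are hypothetical and are not taken from Mercurial; the sample path is invented to mix ISO-8859-1 and UTF-8 bytes.

```python
import io


def format_status_line(state: bytes, path: bytes) -> bytes:
    """Build a status line from raw bytes without ever decoding the path."""
    return state + b" " + path + b"\n"


def write_bytes(stream, data: bytes) -> None:
    """Write raw bytes to a stream, bypassing any text encoding layer."""
    # Text streams such as sys.stdout expose their binary layer as .buffer;
    # if the stream has no .buffer, assume it is already a binary stream.
    binary = getattr(stream, "buffer", stream)
    binary.write(data)
    binary.flush()


# Hypothetical path: b"\xe9" is ISO-8859-1 for e-acute, b"\xc3\xaf" is UTF-8
# for i-diaeresis, so the whole string is not valid in any single encoding.
raw_path = b"caf\xe9/na\xc3\xafve.txt"

line = format_status_line(b"M", raw_path)

# Use an in-memory binary stream here so the result can be inspected.
sink = io.BytesIO()
write_bytes(sink, line)

# PEP 383 round trip: undecodable bytes become lone surrogates in the str
# and are turned back into the original bytes when encoding again.
as_str = raw_path.decode("utf-8", "surrogateescape")
round_tripped = as_str.encode("utf-8", "surrogateescape")
```

Under these assumptions, calling write_bytes(sys.stdout, line) in a normal interpreter session would emit the bytes unchanged, since sys.stdout.buffer is the binary stream beneath the text wrapper. The round trip only preserves the bytes if the same codec and error handler are used in both directions.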

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3982>
_______________________________________