New submission from Pekka Klärck <pekka.kla...@gmail.com>:

If I have two strings that look the same but are in different Unicode normalization forms, it's very hard to see where the problem actually is:

>>> a = 'hyv\xe4'
>>> b = 'hyva\u0308'
>>> print(a)
hyvä
>>> print(b)
hyvä
>>> a == b
False
>>> print(repr(a))
'hyvä'
>>> print(repr(b))
'hyvä'

This affects, for example, test automation frameworks that use `repr()` in error reporting. Both unittest and pytest report `self.assertEqual('hyv\xe4', 'hyva\u0308')` like this:

AssertionError: 'hyvä' != 'hyvä'
- hyvä
+ hyvä

Because NFC is the form strings normally use, I propose that `repr()` show the decomposed form when the string is in NFD. In practice I'd like `repr('hyva\u0308')` to yield `'hyva\u0308'`.

----------
messages: 315504
nosy: pekka.klarck
priority: normal
severity: normal
status: open
title: `repr()` of string in NFC and NFD forms does not differ

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33317>
_______________________________________
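
Until `repr()` behaves differently, one way to make the difference visible is sketched below. It assumes only the built-in `ascii()` and the standard `unicodedata` module; the helper name `describe` is hypothetical and is not part of unittest, pytest, or the proposal above.

```python
import unicodedata

def describe(s):
    """Return an ASCII-only repr of s and the normalization forms it already satisfies.

    Hypothetical helper for illustration only.
    """
    # A string is "in" a form if normalizing it to that form leaves it unchanged.
    forms = [f for f in ("NFC", "NFD") if unicodedata.normalize(f, s) == s]
    # ascii() escapes every non-ASCII code point, so combining marks become visible.
    return ascii(s), forms

a = 'hyv\xe4'        # precomposed (NFC)
b = 'hyva\u0308'     # 'a' followed by COMBINING DIAERESIS (NFD)

print(describe(a))
print(describe(b))
print(unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b))
```

Comparing `ascii(a)` with `ascii(b)` in a failure message shows the differing code points, and normalizing both sides to the same form before comparing is an option when the two strings should count as equal.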