On Jan 2, 9:34 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
In any case, it goes well beyond the situation that triggered my
original question in the first place, that basically was to provide a
reasonable check on whether round-tripping a string is successful --
this is in the context of
Thanks again. I will chunk my responses as your message has too much
in it for me to process all at once...
On Jan 2, 9:34 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
Thanks a lot Martin and Marc for the really great explanations! I was
wondering if it would be reasonable to imagine a utility that will
determine whether, for a given encoding, two byte strings would be
equivalent. But I think such a utility will require *extensive*
knowledge about many
Thanks a lot Martin and Marc for the really great explanations! I was
wondering if it would be reasonable to imagine a utility that will
determine whether, for a given encoding, two byte strings would be
equivalent.
But that is much easier to answer:
s1.decode(enc) == s2.decode(enc)
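
A minimal sketch of that comparison as a helper, written for Python 3, where byte strings are `bytes` and `decode` returns `str`. The name `bytes_equivalent` is made up for illustration and does not come from the thread. A byte string that is invalid in the given encoding will raise UnicodeDecodeError, which this sketch does not catch.

```python
def bytes_equivalent(s1: bytes, s2: bytes, enc: str) -> bool:
    """Return True if s1 and s2 decode to the same text under enc.

    Hypothetical helper: it only wraps the one-line comparison
    s1.decode(enc) == s2.decode(enc) from the message above.
    """
    return s1.decode(enc) == s2.decode(enc)


if __name__ == "__main__":
    enc = "iso-2022-jp"
    plain = b"Hallo"
    # Same text, but preceded by the escape sequence that selects ASCII.
    with_escape = b"\x1b(BHallo"

    print("raw bytes equal:   ", plain == with_escape)
    print("decoded text equal:", bytes_equivalent(plain, with_escape, enc))
```

Comparing the decoded text leaves all knowledge of the encoding to the codec, so the helper itself needs none.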
On Dec 27, 7:37 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
Certainly. ISO-2022 is famous for having ambiguous encodings. Try
these:
unicode("Hallo","iso-2022-jp")
unicode("\x1b(BHallo","iso-2022-jp")
unicode("\x1b(JHallo","iso-2022-jp")
unicode("\x1b(BHal\x1b(Jlo","iso-2022-jp")
or likewise
On Fri, 28 Dec 2007 03:00:59 -0800, mario wrote:
On Dec 27, 7:37 pm, Martin v. Löwis [EMAIL PROTECTED] wrote:
Certainly. ISO-2022 is famous for having ambiguous encodings. Try
these:
unicode("Hallo","iso-2022-jp")
unicode("\x1b(BHallo","iso-2022-jp")
unicode("\x1b(JHallo","iso-2022-jp")
Wow, it's not easy to see why anyone would ever want that. Is there
any logic behind this?
It's the pre-Unicode solution to the "we want to have many characters
encoded in a single file" problem.
Suppose you have pre-defined character sets A, B, C, and you want text
to contain characters from
I have checks in code, to ensure a decode/encode cycle returns the
original string.
Given no UnicodeErrors, are there any cases for the following not to
be True?
unicode(s, enc).encode(enc) == s
mario
--
http://mail.python.org/mailman/listinfo/python-list
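
The round-trip check asked about above can be sketched as follows. This is a Python 3 rendering, where `unicode(s, enc)` becomes `s.decode(enc)`, and the helper names `round_trips` and `canonical_bytes` are hypothetical, introduced only for this example. It assumes the input is valid in the encoding, so it does not handle UnicodeDecodeError or UnicodeEncodeError.

```python
def canonical_bytes(s: bytes, enc: str) -> bytes:
    """Decode s and encode it again: the codec's own spelling of the text."""
    return s.decode(enc).encode(enc)


def round_trips(s: bytes, enc: str) -> bool:
    """Return True if a decode/encode cycle reproduces s byte for byte."""
    return canonical_bytes(s, enc) == s


if __name__ == "__main__":
    enc = "iso-2022-jp"
    samples = [
        b"Hallo",          # plain ASCII
        b"\x1b(BHallo",    # same text, with an explicit "select ASCII" escape
    ]
    for s in samples:
        print(repr(s), "->", repr(canonical_bytes(s, enc)),
              "round-trips:", round_trips(s, enc))
```

A failed round trip here does not mean the data is corrupt, only that the encoder chose a different byte sequence for the same text. If the goal is to detect data loss, comparing the decoded text is a stricter fit than comparing bytes.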
Given no UnicodeErrors, are there any cases for the following not to
be True?
unicode(s, enc).encode(enc) == s
Certainly. ISO-2022 is famous for having ambiguous encodings. Try
these:
unicode("Hallo","iso-2022-jp")
unicode("\x1b(BHallo","iso-2022-jp")
unicode("\x1b(JHallo","iso-2022-jp")
unicode