MRAB writes:

> [snip]
> It might be slightly OT, but sometimes strict UTF-8 encoding is violated
> by encoding U+0000 using 2 bytes (0xC0 0x80) so that 0x00 can be used as
> a terminator. I think I read that Microsoft sometimes does this.

Nice hack, as long as you don't let it escape. But if the 'strict' error
handler raises on this, then PEP 383's 'utf8b' will do the right thing, I
think.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
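
For anyone who wants to check the behaviour discussed above, here is a minimal sketch. It assumes a Python 3 interpreter, where the PEP 383 mechanism is available as the "surrogateescape" error handler rather than under the name 'utf8b'. The sample bytes and the helper name strict_rejects are made up for illustration. Note that "the right thing" here means the bytes survive a round trip; the overlong pair is not turned into U+0000.

```python
# Sample data (hypothetical): ASCII text with an overlong-encoded NUL
# (0xC0 0x80) in the middle, as described in the quoted message.
raw = b"abc\xc0\x80def"


def strict_rejects(data: bytes) -> bool:
    """Return True if the default 'strict' UTF-8 decoder refuses the data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# With PEP 383's handler, each undecodable byte 0xNN is mapped to the
# lone surrogate U+DCNN instead of raising an error.
decoded = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns those surrogates back into the
# original bytes, so nothing is lost on the way out.
roundtrip = decoded.encode("utf-8", errors="surrogateescape")

print(strict_rejects(raw))
print(ascii(decoded))
print(roundtrip == raw)
```

The escaped surrogates are only safe while they stay inside the process: encoding the decoded string with the plain 'strict' handler raises UnicodeEncodeError, which fits the "don't let it escape" caveat.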