Status: Started
Owner: pekka.klarck
Labels: Type-Enhancement Target-2.8.2 Priority-Low bwic
New issue 1578 by pekka.klarck: Replace unrepresentable characters/bytes
instead of escaping them (and some other characters as well)
http://code.google.com/p/robotframework/issues/detail?id=1578
Currently, when converting characters (i.e. bytes) to Unicode fails, they
are escaped using Python's 'string_escape' encoding:
text = str(text).encode('string_escape')
This is problematic because this encoding escapes not only bytes like
'\xe4' but also characters like quotes and newlines that would otherwise
be handled normally.
A better solution is to decode the offending string as ASCII and replace
non-representable bytes:
text = unicode(str(text), 'ASCII', 'replace')
Now bytes that can be represented as ASCII, such as newlines, will be shown
correctly and others will be replaced with the Unicode replacement
character.
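
As a rough illustration of the behavior described above, here is a minimal
sketch. It uses bytes.decode('ascii', 'replace'), which should behave the
same as the unicode(str(text), 'ASCII', 'replace') call on Python 2 and also
runs on Python 3. The helper name replace_unrepresentable and the sample
message are made up for this example and are not part of Robot Framework.

```python
def replace_unrepresentable(data):
    """Decode a byte string as ASCII, replacing undecodable bytes.

    Hypothetical helper for illustration only. Every byte outside the
    ASCII range becomes U+FFFD, the Unicode replacement character.
    """
    return data.decode('ascii', 'replace')


# Sample input (an assumption): two lines, the second ending in a Latin-1 byte.
message = b'first line\nsecond line: \xe4'
text = replace_unrepresentable(message)

# The newline stays a real newline; only the non-ASCII byte is replaced.
assert text == u'first line\nsecond line: \ufffd'
assert len(text.splitlines()) == 2
```

With 'string_escape' the newline in the same input would have been turned
into the two characters '\' and 'n', so the message would have been logged
on a single line.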
This change is slightly backwards-incompatible because it changes how
non-representable characters are shown. In practice this should not be a
problem, though, because users should craft their libraries and test data
so that logged and/or returned messages are either Unicode or use their
system encoding. Not breaking multiline strings is a big enough win to
compensate for this potential annoyance.