Characters from the Unicode Supplementary Multilingual Plane included in story
definitions get rendered incorrectly in HTML
--------------------------------------------------------------------------------------------------------------------------
Key: JBEHAVE-374
URL: http://jira.codehaus.org/browse/JBEHAVE-374
Project: JBehave
Issue Type: Bug
Affects Versions: 3.0.3
Environment: Windows 7, 64-bit
Reporter: Alistair Dutton
Priority: Minor
If a story file includes characters from the Unicode Supplementary Multilingual
Plane (or any other supplementary character, i.e. code points U+10000 upwards),
those characters are not HTML-escaped correctly in the HTML report generated
from the test run.
For example, given a story file with the following scenario:
------------
Scenario: Some scenario
Given some situation
When I do something
Then the result is 𐐆
------------
(The "dagger"-type character is actually code point U+10406, DESERET CAPITAL
LETTER SHORT I - see
http://en.wikibooks.org/wiki/Unicode/Character_reference/10000-10FFF)
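The resulting HTML report will have the "dagger" character escaped as
&#55297;&#56326; - two numeric references to the UTF-16 surrogate code units
U+D801 and U+DC06. Surrogates are meaningful only as a pair within UTF-16, so
the result is rendered as gibberish in HTML. The escape should be the single
reference &#66566; (the code point U+10406 in decimal).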
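The difference between the two escapes can be illustrated with a small,
self-contained sketch. It is not the commons-lang source; the class and method
names are hypothetical, and it only assumes that the faulty escaper walks the
string one UTF-16 code unit (char) at a time, whereas a correct one walks it
one code point at a time. Markup characters such as < and & are ignored here
for brevity.

```java
// Illustrative sketch only; names are hypothetical, not commons-lang APIs.
final class SupplementaryEscapeSketch {

    // Walks UTF-16 code units (char), so a supplementary character is
    // emitted as two numeric references, one per surrogate.
    static String escapePerChar(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Walks code points, so a surrogate pair becomes a single reference.
    static String escapePerCodePoint(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by 2 chars for a surrogate pair, 1 otherwise.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String step = "Then the result is \uD801\uDC06"; // U+10406
        System.out.println(escapePerChar(step));      // Then the result is &#55297;&#56326;
        System.out.println(escapePerCodePoint(step)); // Then the result is &#66566;
    }
}
```

The surrogate values follow from the UTF-16 encoding of U+10406: subtracting
0x10000 leaves 0x406, whose top ten bits give 0xD800 + 0x1 = 0xD801 (55297) and
whose low ten bits give 0xDC00 + 0x6 = 0xDC06 (56326).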
NOTE: This is NOT a bug in JBehave per se - the bug is in the StringEscapeUtils
class of commons-lang. A related bug has already been raised (and fixed) in
commons-lang: https://issues.apache.org/jira/browse/LANG-617. Although the
commons-lang bug report relates to XML escaping rather than HTML escaping, it
seems likely that the fix will cover both. Unfortunately, the fix is in
commons-lang 3.0...