[
https://issues.apache.org/jira/browse/HDFS-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192166#comment-14192166
]
Colin Patrick McCabe commented on HDFS-7309:
--------------------------------------------
Thanks for the review, Ravi.
bq. I think more and more we are implementing StringEscapeUtils though, and it
might be useful to figure out if we can simply move to commons-lang3 at some
point. Do you know if there has ever been a discussion about that? I see a
little bit on HADOOP-10783 but that JIRA has stalled.
So, the original reason {{mangleXmlString}} exists is that there is no
standard way of representing certain code points in XML. If you look at:
http://commons.apache.org/proper/commons-lang/apidocs/ it says, "XML 1.1 can
represent certain control characters, but it cannot represent the null byte or
unpaired Unicode surrogate codepoints, even after escaping. escapeXml11 will
remove characters that do not fit in the following ranges..." So we can't use
{{escapeXml11}}, since it simply removes those code points rather than trying
to represent them. There are also serious compatibility issues in upgrading
from commons-lang 2.6 to 3.0, so I think that may have to wait until Hadoop 3.0.
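To make the "represent rather than remove" distinction concrete, here is a minimal sketch. It is not the actual {{XMLUtils}} code: the class name {{MangleSketch}}, the helper names, and the backslash-hex encoding are assumptions chosen for illustration. It escapes the markup characters as entities, rewrites code points that XML 1.0 cannot carry into a printable form, and then checks the result with a SAX parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not the Hadoop XMLUtils implementation.
final class MangleSketch {

    /** True if the code point is allowed by the XML 1.0 "Char" production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Escapes markup characters as entities and rewrites unrepresentable
     * code points as "\XXXX;" (assumed encoding) instead of dropping them.
     * Backslash itself is rewritten too, so the encoding stays unambiguous.
     */
    static String mangle(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\\' || !isXmlChar(cp)) {
                out.append(String.format("\\%04x;", cp));
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    /** Returns true if a SAX parser accepts the given document. */
    static boolean parses(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false; // malformed XML (or parser setup failure)
        }
    }

    public static void main(String[] args) {
        String raw = "Containing<ALessThanSign";
        String unescaped = "<someElement>" + raw + "</someElement>";
        String escaped = "<someElement>" + mangle(raw) + "</someElement>";
        System.out.println("unescaped parses: " + parses(unescaped));
        System.out.println("escaped parses:   " + parses(escaped));
    }
}
```

The point of the design is that a null byte or an unpaired surrogate survives in the output as visible text, which a matching un-mangle step could reverse, whereas a function that strips those code points loses them for good.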
> XMLUtils.mangleXmlString doesn't seem to handle less than sign
> --------------------------------------------------------------
>
> Key: HDFS-7309
> URL: https://issues.apache.org/jira/browse/HDFS-7309
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.1.0-beta
> Reporter: Ravi Prakash
> Assignee: Colin Patrick McCabe
> Priority: Minor
> Attachments: HDFS-7309.001.patch, HDFS-7309.002.patch, HDFS-7309.patch
>
>
> My expectation was that "<someElement>" + XMLUtils.mangleXmlString(
> "Containing<ALessThanSign") + "</someElement>" would be a string
> acceptable to a SAX parser. However this was not true.