[ https://issues.apache.org/jira/browse/HDFS-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506846#comment-13506846 ]
Colin Patrick McCabe commented on HDFS-4235:
--------------------------------------------
I can't attach the file where this happened, but I will come up with another
example (and/or a unit test) that reproduces the same problem.
> when outputting XML, OfflineEditsViewer can't handle some edits containing
> non-ASCII strings
> --------------------------------------------------------------------------------------------
>
> Key: HDFS-4235
> URL: https://issues.apache.org/jira/browse/HDFS-4235
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Priority: Minor
>
> It seems that when outputting XML, OfflineEditsViewer can't handle some edits
> containing non-ASCII strings.
> Example:
> {code}
> cmccabe@keter:/h> ./bin/hdfs oev -i ~/Downloads/current2/edits -o /tmp/u.xml
>
> 17:11:24,662 ERROR OfflineEditsBinaryLoader:82 - Got IOException at position 10593
> Encountered exception. Exiting: SAX error: The character '�' is an invalid XML character
> java.io.IOException: SAX error: The character '�' is an invalid XML character
> at org.apache.hadoop.hdfs.tools.offlineEditsViewer.XmlEditsVisitor.visitOp(XmlEditsVisitor.java:119)
> at org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsBinaryLoader.loadEdits(OfflineEditsBinaryLoader.java:78)
> at org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsViewer.go(OfflineEditsViewer.java:142)
> at org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsViewer.run(OfflineEditsViewer.java:228)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.hdfs.tools.offlineEditsViewer.OfflineEditsViewer.main(OfflineEditsViewer.java:237)
> {code}
> Most likely, we failed to escape and/or re-encode a filename before
> writing it into the XML. The other processors (stats, binary) don't have
> this problem, so it is purely an XML encoding issue.
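
To illustrate the kind of escaping the description hints at, here is a minimal sketch, not the actual fix for this issue. It assumes the failure comes from characters outside the XML 1.0 Char production (for example NUL, U+FFFF, or an unpaired surrogate) reaching the SAX serializer. The class XmlCharSanitizer, its method names, and the backslash-hex-semicolon escape format are all hypothetical and chosen only for this example.

```java
/**
 * Hypothetical helper (not part of Hadoop): rewrites characters that
 * XML 1.0 forbids so that a string can be handed to a SAX serializer.
 */
public final class XmlCharSanitizer {
    private XmlCharSanitizer() {}

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Replaces each invalid code point with a backslash, four lowercase
     * hex digits, and a semicolon (an assumed, made-up format). Literal
     * backslashes are doubled so the mapping stays unambiguous.
     */
    static String escapeInvalidXmlChars(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            // codePointAt returns a lone surrogate as-is, so unpaired
            // surrogates are caught by isValidXmlChar below.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\\') {
                out.append("\\\\");
            } else if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Every invalid code point is at most 0xFFFF,
                // so four hex digits are always enough.
                out.append(String.format("\\%04x;", cp));
            }
        }
        return out.toString();
    }
}
```

A visitor could pass each string field through escapeInvalidXmlChars before calling the serializer. The sketch does not escape markup characters such as & or <, which the XML serializer is assumed to handle, and a reader of the XML would need the inverse mapping to recover the original bytes.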