[
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588981#comment-16588981
]
Sean Mackrory commented on HDFS-13744:
--------------------------------------
We should probably also handle StringUtils.CR. Hadoop is sometimes used from
Windows clients too.
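
To illustrate the CR point, here is a minimal sketch of RFC 4180 style quoting that treats a carriage return the same way as a line feed. It is not taken from the attached patch: the class name, method name, and local CR/LF constants are hypothetical stand-ins (the constants mirror StringUtils.CR and StringUtils.LF), and the escaping the patch actually emits may differ.

```java
final class DelimitedFieldEscaper {
    // Hypothetical local constants standing in for StringUtils.CR / StringUtils.LF.
    private static final String CR = "\r";
    private static final String LF = "\n";

    private DelimitedFieldEscaper() {
    }

    /**
     * Quotes a field when it contains CR, LF, a double quote, or the delimiter.
     * Fields without any of these are returned unchanged.
     */
    static String escape(String field, String delimiter) {
        boolean needsQuoting = field.contains(CR)
                || field.contains(LF)
                || field.contains("\"")
                || field.contains(delimiter);
        if (!needsQuoting) {
            return field;
        }
        // RFC 4180: double every embedded quote, then wrap the field in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // A directory name ending in CR LF, as a Windows client might create it.
        System.out.println(escape("/user/data/EXAMPLE_NAME\r\n", ","));
    }
}
```

With this approach a name ending in CR, LF, or CR LF stays inside one quoted field, so a CSV reader that follows RFC 4180 sees a single record instead of two.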
I'm a little bit torn about not escaping the XML. If someone is embedding
control characters in filenames, even if that is technically allowed and there
are standards specifying how that is to be encoded / decoded, I think it's
likely to cause problems, and I would want those characters to show up
obviously in a report. I suspect there's a good chance that those characters
are the reason someone is trying to inspect the image in the first place :) But
I also don't want to cause practical problems in XML parsers. I can see an
argument either way - like I said I'm a bit torn and want to think about it...
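
One possible middle ground, sketched below under stated assumptions: keep the output well-formed XML while making whitespace control characters stand out as numeric character references. The class and method names are hypothetical, and the backslash-x placeholder for the remaining C0 control characters (which XML 1.0 does not allow even as character references) is an invented convention for illustration, not something the OIV tool or the patch is known to use.

```java
final class XmlNameEscaper {
    private XmlNameEscaper() {
    }

    /** Escapes markup characters and makes control characters visible. */
    static String escape(String name) {
        StringBuilder out = new StringBuilder(name.length() + 16);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '\t':
                case '\n':
                case '\r':
                    // Legal in XML 1.0; a character reference survives
                    // the parser's line-end normalization.
                    out.append(String.format("&#x%X;", (int) c));
                    break;
                default:
                    if (c < 0x20) {
                        // Illegal in XML 1.0: emit a visible placeholder
                        // (hypothetical convention, backslash-x plus two hex digits).
                        out.append(String.format("\\x%02X", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("<name>" + escape("EXAMPLE_NAME\n") + "</name>");
    }
}
```

A trailing line feed then appears in the report as a character reference inside the name element, which is easy to spot by eye, and a conforming parser still decodes it back to the original character.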
> OIV tool should better handle control characters present in file or directory
> names
> -----------------------------------------------------------------------------------
>
> Key: HDFS-13744
> URL: https://issues.apache.org/jira/browse/HDFS-13744
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, tools
> Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
> Reporter: Zsolt Venczel
> Assignee: Zsolt Venczel
> Priority: Critical
> Attachments: HDFS-13744.01.patch
>
>
> When control characters or white space are present in file or directory
> names, the OIV tool processors can export data in a misleading format.
> In the examples below, EXAMPLE_NAME is both a file name and a directory name,
> and the directory name has a line feed character at the end (the actual
> production case has multiple line feeds and multiple spaces).
> * Delimited processor case:
> ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
> *
> ** expected example as suggested by
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
> * XML processor case:
> ** misleading example:
> {code:java}
> <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME
> </name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
> <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
> {code}
> *
> ** expected example as specified in
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME#xA</name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
> <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
> {code}
> * JSON:
> The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
> "FileStatuses": {
> "FileStatus": [
> {
> "fileId": 113632535,
> "accessTime": 1494954320141,
> "replication": 3,
> "owner": "user",
> "length": 520,
> "permission": "674",
> "blockSize": 134217728,
> "modificationTime": 1472205657504,
> "type": "FILE",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME"
> },
> {
> "fileId": 479867791,
> "accessTime": 0,
> "replication": 0,
> "owner": "user",
> "length": 0,
> "permission": "775",
> "blockSize": 0,
> "modificationTime": 1493033668294,
> "type": "DIRECTORY",
> "group": "group",
> "childrenNum": 0,
> "pathSuffix": "EXAMPLE_NAME\n"
> }
> ]
> }
> }
> {code}