[ 
https://issues.apache.org/jira/browse/HDFS-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16605831#comment-16605831
 ] 

Zsolt Venczel commented on HDFS-13744:
--------------------------------------

Thanks a lot [~mackrorysd] for the review and the fix!

I was a bit puzzled by the specification, as it does not say exactly how a CRLF 
should be escaped (one example replaces it character by character, which is 
your approach, but there is another example here: 
https://tools.ietf.org/html/rfc2234#section-2.3).
From a usability perspective I think your approach is the best, as it clearly 
displays all special characters. For debugging purposes this is the most 
valuable.
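
To illustrate the character-by-character approach, here is a minimal sketch. 
It is not the code from the patch: the class and method names are hypothetical, 
the %xNN notation is taken from the expected example in the issue description, 
and doubling embedded quotes follows RFC 4180.

```java
// Hypothetical helper, not part of the attached patches.
final class DelimitedFieldEscaper {
    private DelimitedFieldEscaper() {}

    /**
     * Wraps the field in double quotes and replaces every ISO control
     * character with a visible %xNN token, one character at a time,
     * so CR LF becomes %x0D%x0A.
     */
    static String escapeDelimited(String field) {
        StringBuilder sb = new StringBuilder(field.length() + 2);
        sb.append('"');
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '"') {
                // RFC 4180: a quote inside a quoted field is doubled.
                sb.append("\"\"");
            } else if (Character.isISOControl(c)) {
                // Assumed notation: %x followed by two uppercase hex digits.
                sb.append(String.format("%%x%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        sb.append('"');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeDelimited("/user/data/EXAMPLE_NAME\n"));
        System.out.println(escapeDelimited("/user/data/EXAMPLE_NAME\r\n"));
    }
}
```

With this sketch a trailing line feed stays on the same output row as the rest 
of the record, so each inode still maps to exactly one line.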

Test failures are unrelated.



> OIV tool should better handle control characters present in file or directory 
> names
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-13744
>                 URL: https://issues.apache.org/jira/browse/HDFS-13744
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, tools
>    Affects Versions: 2.6.5, 2.9.1, 2.8.4, 2.7.6, 3.0.3
>            Reporter: Zsolt Venczel
>            Assignee: Zsolt Venczel
>            Priority: Critical
>         Attachments: HDFS-13744.01.patch, HDFS-13744.02.patch, 
> HDFS-13744.03.patch
>
>
> In certain cases, when control characters or white space are present in file or 
> directory names, the OIV tool processors can export data in a misleading format.
> In the examples below, EXAMPLE_NAME is used as both a file name and a directory 
> name, where the directory name has a line feed character at the end (the actual 
> production case has multiple line feeds and multiple spaces).
>  * Delimited processor case:
>  ** misleading example:
> {code:java}
> /user/data/EXAMPLE_NAME
> ,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> /user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  ** expected example as suggested by 
> [https://tools.ietf.org/html/rfc4180#section-2]:
> {code:java}
> "/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
> "/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
> {code}
>  * XML processor case:
>  ** misleading example:
> {code:java}
> <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME
> </name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
> <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
> {code}
>  ** expected example as specified in 
> [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
> {code:java}
> <inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME#xA</name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>
> <inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
> {code}
>  * JSON:
>  The OIV Web Processor behaves correctly and produces the following:
> {code:java}
> {
>   "FileStatuses": {
>     "FileStatus": [
>       {
>         "fileId": 113632535,
>         "accessTime": 1494954320141,
>         "replication": 3,
>         "owner": "user",
>         "length": 520,
>         "permission": "674",
>         "blockSize": 134217728,
>         "modificationTime": 1472205657504,
>         "type": "FILE",
>         "group": "group",
>         "childrenNum": 0,
>         "pathSuffix": "EXAMPLE_NAME"
>       },
>       {
>         "fileId": 479867791,
>         "accessTime": 0,
>         "replication": 0,
>         "owner": "user",
>         "length": 0,
>         "permission": "775",
>         "blockSize": 0,
>         "modificationTime": 1493033668294,
>         "type": "DIRECTORY",
>         "group": "group",
>         "childrenNum": 0,
>         "pathSuffix": "EXAMPLE_NAME\n"
>       }
>     ]
>   }
> }
> {code}
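
For the XML processor case in the description above, a similar sketch is 
possible. It is an assumption rather than the behaviour of the patch: the class 
and method names are hypothetical, and it emits numeric character references 
(for example &#xA;) instead of the bare #xA shown in the expected example, 
because XML line-end normalization does not apply to character references. 
Other C0 control characters are not legal in XML 1.0 even as references, so the 
sketch rejects them.

```java
// Hypothetical helper, not part of the attached patches.
final class XmlNameEscaper {
    private XmlNameEscaper() {}

    /** Escapes markup characters and writes tab, LF and CR as character references. */
    static String escapeXmlText(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '\t':
                case '\n':
                case '\r':
                    // A reference such as &#xA; survives line-end normalization.
                    sb.append("&#x")
                      .append(Integer.toHexString(c).toUpperCase())
                      .append(';');
                    break;
                default:
                    if (c < 0x20) {
                        // Not representable in XML 1.0; a real tool needs its own scheme here.
                        throw new IllegalArgumentException(
                            "Control character not allowed in XML 1.0: 0x" + Integer.toHexString(c));
                    }
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("<name>" + escapeXmlText("EXAMPLE_NAME\n") + "</name>");
    }
}
```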



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

