[ https://issues.apache.org/jira/browse/HDFS-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-10983:
--------------------------------------
    Attachment: HDFS-10983.01.patch

Thanks for the comments. 

Attached v01 patch to address the following:
1. HTTP REST : 
-- {{FSImageLoader}} now adds blockType for files and all xAttrs for directories
in the JSON output. Refer to the (1.a) output in the previous comment.

2. XML Processor:
-- {{PBImageXmlWriter}} was erroneously writing BlockType in a child <name>
section. That is fixed, and a new unit test verifies it. Refer to the (2) output
in the previous comment.

3. OIV Delimited processor: 
-- {{PBImageDelimitedTextWriter}} now adds a new column "BlockType" at the end
to show STRIPED/CONTIGUOUS for files and directories. Refer to the (3) output in
the previous comment, but with the BlockType column moved to the last position.
-- {{PBImageTextWriter}}, which is visited by {{PBImageDelimitedTextWriter}}, now
passes the stringTable to getEntry() so that it can look at XAttrs.

4. Test:
-- {{TestOfflineImageViewer}} now verifies REST and WebHDFS requests, the
Delimited processor, and the XML processor for EC directories and EC files in
the FSImage.

bq. We should implement getErasureCodingPolicy in WebHDFS if we don't have it 
and have users call that API. 
When OIV is running the web server for the FSImage and the DFS shell is used
to list the files, the standard {{PathData}} name-to-Path expansion happens, and
I am assuming we don't want to add a new column to the ls output just for this
OIV context.
Yes, other clients that query WebHDFS directly could display additional details
based on the new API. I don't have a use case for that currently, unless I am
overlooking something about OIV users.

bq. For all outputs, it looks like we print whether a file is striped or not, 
but not what the EC policy is. Since users can rename files, just knowing the 
EC policy on an ancestor doesn't tell us the file's EC policy.
I see that even the existing XML processor, which already supports EC, doesn't
do this. In the interest of patch size, I will take this up as a follow-up item
in a new JIRA if you are OK with that. If you prefer to include it in this JIRA,
please let me know.

bq. For the delimited format, I'd prefer to add the new column at the end. This 
makes it less likely to break scripts using cut or awk to parse the output.
The patch v01 takes care of this comment.
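
To illustrate why appending the column is the safer choice, here is a minimal Java sketch. It assumes a tab-separated row; the class name, the helper methods, and the sample row are hypothetical and are not taken from the patch, and the real Delimited output has more columns than shown. A consumer that reads fields by index keeps working when a column is appended, and a new consumer can read BlockType as the last field.

```java
public class DelimitedColumnSketch {

    // Returns the field at a fixed, zero-based index, the way a cut/awk-style
    // script would. Existing indices are unaffected by an appended column.
    public static String field(String line, int index) {
        return line.split("\t", -1)[index];
    }

    // Returns the last field, which is where the appended BlockType column
    // is assumed to live in this sketch.
    public static String blockType(String line) {
        String[] cols = line.split("\t", -1);
        return cols[cols.length - 1];
    }

    public static void main(String[] args) {
        // Made-up rows: path, replication, file size, and (new) block type.
        String oldRow = "/ec/file1\t0\t1048576";
        String newRow = "/ec/file1\t0\t1048576\tSTRIPED";

        // The same index yields the same value before and after the change.
        System.out.println(field(oldRow, 2));
        System.out.println(field(newRow, 2));

        // New consumers can pick up the block type from the end.
        System.out.println(blockType(newRow));
    }
}
```

Had the column been inserted in the middle, every fixed index after it would have shifted by one.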

[~andrew.wang], can you please take a look at the patch?

PS: The patch will have checkstyle issues in the switch/case block in
FSImageLoader. To stay consistent with the current indentation level in that
block, I have not used the correct indentation for the newly added code. Let me
know if this is not OK, and I can change the entire block to follow the correct
indentation level.

> OIV tool should make an EC file explicit
> ----------------------------------------
>
>                 Key: HDFS-10983
>                 URL: https://issues.apache.org/jira/browse/HDFS-10983
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10983.01.patch
>
>
> The OIV tool's webhdfs interface does not print whether a file is striped or not.
> Also, it prints the file's EC policy ID as the replication factor, which is
> inconsistent with the output of a typical webhdfs call to the cluster, which
> always shows a replication factor of 0 for EC files.
> Not just webhdfs: the delimited output does not print whether a file is striped
> or not either.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
