[ 
https://issues.apache.org/jira/browse/HDFS-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857122#comment-15857122
 ] 

Andrew Wang commented on HDFS-10983:
------------------------------------

Hi Manoj, thanks for posting the patch and for the thoughtful discussion above. 
Some review comments:

One high-level comment: upon looking more closely at this, I don't think we 
should mess with the delimited output at all. It's kind of a legacy format, and 
is already missing many of the new fields added to the fsimage since it was 
converted to PB. The output is also fragile, and I know there are some users 
out there who have built apps on the delimited output, since they reported 
issues after we broke it by switching the fsimage to PB. So, IMO we should skip 
adding these new fields here too. Interested users can always use the XML 
output instead.

* FSImageLoader: sorry that I wasn't clear on this before, but we shouldn't be 
adding new fields to FileStatus that aren't there in the real WebHDFS output. 
We should expose xattr information with the {{getXAttrs}} and related APIs, not 
in {{getFileStatus}}. Note that most users interact with the web OIV tool via 
commands like {{hadoop fs -ls webhdfs://....}}, not with curl commands directly. 
So, if the webhdfs client doesn't understand the field, it won't show up. 
Considering this is not really EC related, we could file a different JIRA to 
add xattr support to the OIV tool.
* Related to the above, a blockType field added to the JSON output won't show 
up when listing using the webhdfs client either. Kai is working on adding 
getECPolicy support in WebHDFS at HDFS-11394, after which we can also add 
support in OIV.
* This doesn't matter if we just skip all the delimited changes, but: I don't 
think a directory has a meaningful blockType or can be striped since it doesn't 
have any data. I'd prefer we stick to printing the xattrs (which is also 
generally useful). The class javadoc also recommends printing nothing rather 
than a "-" for missing values.
* Same caveat as the above if we skip the delimited changes, but: in 
PBImageTextWriter#getEntry, the new javadoc param is named differently from the 
actual param.
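
To make the "print nothing rather than a "-"" convention from the third bullet concrete, here is a minimal sketch. {{DelimitedRow}} is a hypothetical helper written only for illustration; it is not part of PBImageTextWriter or any Hadoop API, and the tab delimiter in the usage note is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): builds one row of delimited output.
final class DelimitedRow {
    private final String delimiter;
    private final List<String> fields = new ArrayList<>();

    DelimitedRow(String delimiter) {
        this.delimiter = delimiter;
    }

    // A null value means "not applicable for this inode" and is written as an
    // empty field instead of a placeholder such as "-", so the column count
    // stays stable for downstream parsers.
    DelimitedRow add(Object value) {
        fields.add(value == null ? "" : value.toString());
        return this;
    }

    String render() {
        return String.join(delimiter, fields);
    }
}
```

For example, {{new DelimitedRow("\t").add("/dir").add(null).add(0).render()}} yields three tab-separated columns with the middle one empty, which is roughly what a directory row would look like for a column that only applies to files.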

With the above comments in mind, we aren't left with much in the current patch 
(basically just the {{blockType}} name fix).

I think we should complete HDFS-11382 before adding the file EC policy in the 
XML output since they're related, but besides that I'm fine with splitting the 
work among JIRAs however you wish.

> OIV tool should make an EC file explicit
> ----------------------------------------
>
>                 Key: HDFS-10983
>                 URL: https://issues.apache.org/jira/browse/HDFS-10983
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10983.01.patch
>
>
> The OIV tool's webhdfs interface does not print if a file is striped or not.
> Also, it prints the file's EC policy ID as replication factor, which is 
> inconsistent with the output of a typical webhdfs call to the cluster, which 
> always shows replication factor of 0 for EC files.
> Not just webhdfs, but delimiter output does not print if a file is striped 
> or not either.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
