[ 
https://issues.apache.org/jira/browse/HDFS-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847993#comment-15847993
 ] 

Manoj Govindassamy edited comment on HDFS-10983 at 2/3/17 1:09 AM:
-------------------------------------------------------------------

[~andrew.wang], [~jojochuang],

Here are the proposals. Please let me know your thoughts on the below.

1. OIV HTTP server does expose a read-only WebHDFS API which can be queried to 
print all file details.

1.a: Users can also get JSON formatted FileStatuses via HTTP REST API, which 
can very well be extended. Here is the proposed REST API output, which adds 
"hdfs.erasurecoding.policy" for directories and "blockType" for files.

{noformat}
curl -i http://127.0.0.1:5978/webhdfs/v1/ec?op=getfilestatus
{"FileStatus":
{"owner":"manoj","replication":0,"hdfs.erasurecoding.policy":"XOR-2-1-64k",
 "length":0,"permission":"755","type":"DIRECTORY", 
"blockSize":0,"pathSuffix":"","modificationTime":1485921930732,
"childrenNum":2,"accessTime":0,"group":"supergroup","fileId":16406}
}

curl -i http://127.0.0.1:5978/webhdfs/v1/ec/file.txt?op=getfilestatus
{"FileStatus":
{"owner":"manoj","replication":3,"blockType":"STRIPED", 
"length":0,"permission":"644","type":"FILE","blockSize":134217728, 
"pathSuffix":"","modificationTime":1485921930729,"childrenNum":0,
"accessTime":1485921930710,"group":"supergroup","fileId":16407}
}
{noformat}
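
For illustration only, here is a minimal sketch of how a client could read the two proposed fields from the JSON body. It assumes the response carries "hdfs.erasurecoding.policy" and "blockType" exactly as proposed above; the function name {{ec_summary}} and the trimmed sample body are hypothetical.

```python
import json


def ec_summary(body: str) -> str:
    """Summarize EC details from a getfilestatus JSON body (proposed format)."""
    status = json.loads(body)["FileStatus"]
    if status["type"] == "DIRECTORY":
        # Proposed field; assumed to be absent when no EC policy is set.
        policy = status.get("hdfs.erasurecoding.policy")
        if policy:
            return f"directory, EC policy {policy}"
        return "directory, no EC policy reported"
    # Proposed field; reported as unknown when missing instead of guessing.
    return f"file, block type {status.get('blockType', 'unknown')}"


# Trimmed sample bodies modeled on the proposed output above.
dir_body = '{"FileStatus": {"type": "DIRECTORY", "hdfs.erasurecoding.policy": "XOR-2-1-64k"}}'
file_body = '{"FileStatus": {"type": "FILE", "blockType": "STRIPED", "replication": 3}}'

print(ec_summary(dir_body))
print(ec_summary(file_body))
```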

1.b: However, when the same server is queried from the shell, webhdfs returns 
only a {{FileStatus}}, which does not carry any EC-related details. So I am not 
sure whether we can make the following command print extra details for EC 
files/directories.
{noformat}
hdfs dfs -ls webhdfs://127.0.0.1:5978/
<output same as before>
{noformat}


2. {{OIV XML processor}} already has support for EC. Please review the output 
below.
{noformat}
<inode>
    <id>16406</id>
    <type>DIRECTORY</type>
    <name>ec</name>
    <mtime>1485918336816</mtime>
    <permission>manoj:supergroup:0755</permission>
    <xattrs>
        <xattr>
            <ns>SYSTEM</ns>
            <name>hdfs.erasurecoding.policy</name>     <=======
            <val>XOR-2-1-64k</val>
        </xattr>
    </xattrs>
    <nsquota>-1</nsquota>
    <dsquota>-1</dsquota>
</inode>
<inode>
    <id>16407</id>
    <type>FILE</type>
    <name>EmptyECFile.txt</name>
    <replication>3</replication>
    <mtime>1485918336813</mtime>
    <atime>1485918336796</atime>
    <preferredBlockSize>134217728</preferredBlockSize>
    <permission>manoj:supergroup:0644</permission>
    <storagePolicyId>0</storagePolicyId>
    <blockType>
        <name>STRIPED</name>    <=======
    </blockType>
</inode>
{noformat}
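
As a rough sketch of how the EC details could be pulled out of one such inode element, assuming the element layout shown in the output above (the helper name {{inode_ec_info}} and the shortened sample inodes are hypothetical, and the highlighting arrows are not part of the XML):

```python
import xml.etree.ElementTree as ET


def inode_ec_info(inode_xml: str) -> dict:
    """Extract EC-related fields from a single <inode> element."""
    inode = ET.fromstring(inode_xml)
    info = {
        "name": inode.findtext("name"),
        "type": inode.findtext("type"),
        "ec_policy": None,
        # None when the inode has no <blockType> child.
        "block_type": inode.findtext("blockType/name"),
    }
    # The EC policy is stored as an xattr on the directory inode.
    for xattr in inode.findall("xattrs/xattr"):
        if xattr.findtext("name") == "hdfs.erasurecoding.policy":
            info["ec_policy"] = xattr.findtext("val")
    return info


dir_inode = """<inode><id>16406</id><type>DIRECTORY</type><name>ec</name>
<xattrs><xattr><ns>SYSTEM</ns><name>hdfs.erasurecoding.policy</name>
<val>XOR-2-1-64k</val></xattr></xattrs></inode>"""
file_inode = """<inode><id>16407</id><type>FILE</type><name>EmptyECFile.txt</name>
<blockType><name>STRIPED</name></blockType></inode>"""

print(inode_ec_info(dir_inode))
print(inode_ec_info(file_inode))
```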


3. {{OIV Delimited processor}} doesn't have support for EC. Here is the 
proposal for the new Header ("BlockType") and value ("CONTIGUOUS"/"STRIPED").

{noformat}
Path                 Replication  ModificationTime  AccessTime        PreferredBlockSize  BlockType   BlocksCount  FileSize  NSQUOTA              DSQUOTA  Permission  UserName  GroupName
/                    0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         9223372036854775807  -1       drwxr-xr-x  manoj     supergroup
/dir0                0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup
/dir0/file0          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file1          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file2          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file3          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup

/emptydir            0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup

/ec                  0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup
/ec/EmptyECFile.txt  3            2017-01-31 18:57  2017-01-31 18:57  134217728           STRIPED     0            0         0                    0        -rw-r--r--  manoj     supergroup
/ec/SmallECFile.txt  3            2017-01-31 18:57  2017-01-31 18:57  134217728           STRIPED     1            0         0                    0        -rw-r--r--  manoj     supergroup
{noformat}
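
To show how the proposed column could be consumed, here is a small sketch that lists the striped files from Delimited output. It assumes tab-separated fields and a header row containing the proposed "BlockType" column; the delimiter, the function name {{striped_paths}}, and the sample rows are assumptions for illustration only.

```python
import csv
import io


def striped_paths(delimited_text: str, delimiter: str = "\t") -> list:
    """Return the paths whose BlockType column is STRIPED."""
    reader = csv.reader(io.StringIO(delimited_text), delimiter=delimiter)
    header = next(reader)
    # Look the columns up by name so the order of other columns does not matter.
    path_idx = header.index("Path")
    type_idx = header.index("BlockType")
    return [row[path_idx] for row in reader if row and row[type_idx] == "STRIPED"]


# Sample rows modeled on the proposed output, joined with tabs (assumed delimiter).
sample = "\n".join(
    "\t".join(fields)
    for fields in [
        ["Path", "Replication", "ModificationTime", "AccessTime",
         "PreferredBlockSize", "BlockType", "BlocksCount", "FileSize",
         "NSQUOTA", "DSQUOTA", "Permission", "UserName", "GroupName"],
        ["/dir0/file0", "3", "2017-01-31 18:57", "2017-01-31 18:57",
         "134217728", "CONTIGUOUS", "1", "1", "0", "0",
         "-rw-r--r--", "manoj", "supergroup"],
        ["/ec", "0", "2017-01-31 18:57", "1969-12-31 16:00",
         "0", "NA", "0", "0", "-1", "-1",
         "drwxr-xr-x", "manoj", "supergroup"],
        ["/ec/EmptyECFile.txt", "3", "2017-01-31 18:57", "2017-01-31 18:57",
         "134217728", "STRIPED", "0", "0", "0", "0",
         "-rw-r--r--", "manoj", "supergroup"],
    ]
)

print(striped_paths(sample))
```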





















> OIV tool should make an EC file explicit
> ----------------------------------------
>
>                 Key: HDFS-10983
>                 URL: https://issues.apache.org/jira/browse/HDFS-10983
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10983.01.patch
>
>
> The OIV tool's webhdfs interface does not print whether a file is striped or not.
> Also, it prints the file's EC policy ID as the replication factor, which is 
> inconsistent with the output of a typical webhdfs call to the cluster, which 
> always shows a replication factor of 0 for EC files.
> Not just webhdfs: the delimited output does not print whether a file is 
> striped either.


