[
https://issues.apache.org/jira/browse/HDFS-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847993#comment-15847993
]
Manoj Govindassamy edited comment on HDFS-10983 at 2/3/17 1:09 AM:
-------------------------------------------------------------------
[~andrew.wang], [~jojochuang],
Here are the proposals. Please let me know your thoughts.
1. The OIV HTTP server exposes a read-only WebHDFS API that can be queried to
print all file details.
1.a: Users can also get JSON-formatted FileStatuses via the HTTP REST API, which
can easily be extended. Here is the proposed REST API output, with
"hdfs.erasurecoding.policy" added for directories and "blockType" added for files.
{noformat}
curl -i http://127.0.0.1:5978/webhdfs/v1/ec?op=getfilestatus
{"FileStatus":
{"owner":"manoj","replication":0,"hdfs.erasurecoding.policy":"XOR-2-1-64k",
"length":0,"permission":"755","type":"DIRECTORY",
"blockSize":0,"pathSuffix":"","modificationTime":1485921930732,
"childrenNum":2,"accessTime":0,"group":"supergroup","fileId":16406}
}
curl -i http://127.0.0.1:5978/webhdfs/v1/ec/file.txt?op=getfilestatus
{"FileStatus":
{"owner":"manoj","replication":3,"blockType":"STRIPED",
"length":0,"permission":"644","type":"FILE","blockSize":134217728,
"pathSuffix":"","modificationTime":1485921930729,"childrenNum":0,
"accessTime":1485921930710,"group":"supergroup","fileId":16407}
}
{noformat}
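To illustrate how a client might consume the two proposed fields, here is a minimal Python sketch. It assumes the response bodies look exactly like the proposal above; {{ec_details}} is a hypothetical helper, not part of any Hadoop client, and both JSON keys are only proposed at this point.

```python
import json

# Response bodies copied from the proposed output above.
DIR_BODY = """{"FileStatus":
{"owner":"manoj","replication":0,"hdfs.erasurecoding.policy":"XOR-2-1-64k",
"length":0,"permission":"755","type":"DIRECTORY",
"blockSize":0,"pathSuffix":"","modificationTime":1485921930732,
"childrenNum":2,"accessTime":0,"group":"supergroup","fileId":16406}
}"""

FILE_BODY = """{"FileStatus":
{"owner":"manoj","replication":3,"blockType":"STRIPED",
"length":0,"permission":"644","type":"FILE","blockSize":134217728,
"pathSuffix":"","modificationTime":1485921930729,"childrenNum":0,
"accessTime":1485921930710,"group":"supergroup","fileId":16407}
}"""


def ec_details(body):
    """Extract the EC-related fields from a FileStatus response (hypothetical helper)."""
    status = json.loads(body)["FileStatus"]
    return {
        "type": status["type"],
        # Both keys are only proposed here, so treat them as optional:
        # .get() yields None when the server does not send them.
        "ec_policy": status.get("hdfs.erasurecoding.policy"),
        "block_type": status.get("blockType"),
    }


print(ec_details(DIR_BODY))
print(ec_details(FILE_BODY))
```

Reading the keys with {{.get()}} keeps such a client working against an OIV server that predates the change.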
1.b: However, when queried over the shell, webhdfs returns only a {{FileStatus}},
which doesn't carry any EC-related details. So I am not sure whether we can make
the following command print extra details for EC files/directories.
{noformat}
hdfs dfs -ls webhdfs://127.0.0.1:5978/
<output same as before>
{noformat}
2. The {{OIV XML processor}} already supports EC. Please review the output
below.
{noformat}
<inode>
  <id>16406</id>
  <type>DIRECTORY</type>
  <name>ec</name>
  <mtime>1485918336816</mtime>
  <permission>manoj:supergroup:0755</permission>
  <xattrs>
    <xattr>
      <ns>SYSTEM</ns>
      <name>hdfs.erasurecoding.policy</name> <=======
      <val>XOR-2-1-64k</val>
    </xattr>
  </xattrs>
  <nsquota>-1</nsquota>
  <dsquota>-1</dsquota>
</inode>
<inode>
  <id>16407</id>
  <type>FILE</type>
  <name>EmptyECFile.txt</name>
  <replication>3</replication>
  <mtime>1485918336813</mtime>
  <atime>1485918336796</atime>
  <preferredBlockSize>134217728</preferredBlockSize>
  <permission>manoj:supergroup:0644</permission>
  <storagePolicyId>0</storagePolicyId>
  <blockType>
    <name>STRIPED</name> <=======
  </blockType>
</inode>
{noformat}
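As a rough illustration of how the EC details could be pulled out of this XML, here is a small Python sketch using the standard library. The sample is the two inodes shown above (without the highlight arrows), wrapped in an {{<INodeSection>}} root so that it parses as a single document; that wrapper and the {{ec_summary}} helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

EC_XATTR = "hdfs.erasurecoding.policy"

# The two inodes from the output above; the <INodeSection> wrapper is an
# assumption so the fragment has a single root element.
SAMPLE = """<INodeSection>
<inode><id>16406</id><type>DIRECTORY</type><name>ec</name>
  <mtime>1485918336816</mtime>
  <permission>manoj:supergroup:0755</permission>
  <xattrs><xattr><ns>SYSTEM</ns>
    <name>hdfs.erasurecoding.policy</name><val>XOR-2-1-64k</val>
  </xattr></xattrs>
  <nsquota>-1</nsquota><dsquota>-1</dsquota></inode>
<inode><id>16407</id><type>FILE</type><name>EmptyECFile.txt</name>
  <replication>3</replication><mtime>1485918336813</mtime>
  <atime>1485918336796</atime>
  <preferredBlockSize>134217728</preferredBlockSize>
  <permission>manoj:supergroup:0644</permission>
  <storagePolicyId>0</storagePolicyId>
  <blockType><name>STRIPED</name></blockType></inode>
</INodeSection>"""


def ec_summary(xml_text):
    """Return (name, type, ec_policy, block_type) per inode (hypothetical helper)."""
    rows = []
    for inode in ET.fromstring(xml_text).iter("inode"):
        policy = None
        # The EC policy of a directory is stored as a SYSTEM xattr.
        for xattr in inode.iterfind("xattrs/xattr"):
            if xattr.findtext("name") == EC_XATTR:
                policy = xattr.findtext("val")
        rows.append((
            inode.findtext("name"),            # direct child only
            inode.findtext("type"),
            policy,
            inode.findtext("blockType/name"),  # None when absent
        ))
    return rows


for row in ec_summary(SAMPLE):
    print(row)
```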
3. The {{OIV Delimited processor}} doesn't support EC. Here is the proposal
for a new header ("BlockType") with the values "CONTIGUOUS"/"STRIPED".
{noformat}
Path                 Replication  ModificationTime  AccessTime        PreferredBlockSize  BlockType   BlocksCount  FileSize  NSQUOTA              DSQUOTA  Permission  UserName  GroupName
/                    0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         9223372036854775807  -1       drwxr-xr-x  manoj     supergroup
/dir0                0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup
/dir0/file0          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file1          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file2          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/dir0/file3          3            2017-01-31 18:57  2017-01-31 18:57  134217728           CONTIGUOUS  1            1         0                    0        -rw-r--r--  manoj     supergroup
/emptydir            0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup
/ec                  0            2017-01-31 18:57  1969-12-31 16:00  0                   NA          0            0         -1                   -1       drwxr-xr-x  manoj     supergroup
/ec/EmptyECFile.txt  3            2017-01-31 18:57  2017-01-31 18:57  134217728           STRIPED     0            0         0                    0        -rw-r--r--  manoj     supergroup
/ec/SmallECFile.txt  3            2017-01-31 18:57  2017-01-31 18:57  134217728           STRIPED     1            0         0                    0        -rw-r--r--  manoj     supergroup
{noformat}
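With the proposed column in place, finding striped files in the Delimited output becomes a simple filter. Here is a minimal Python sketch, assuming the fields are tab-separated and the header row is present as shown above; {{striped_paths}} is a hypothetical helper and the sample contains only a few of the rows from the output.

```python
import csv
import io

HEADER = [
    "Path", "Replication", "ModificationTime", "AccessTime",
    "PreferredBlockSize", "BlockType", "BlocksCount", "FileSize",
    "NSQUOTA", "DSQUOTA", "Permission", "UserName", "GroupName",
]

# A few rows from the proposed output above.
ROWS = [
    ["/dir0/file0", "3", "2017-01-31 18:57", "2017-01-31 18:57", "134217728",
     "CONTIGUOUS", "1", "1", "0", "0", "-rw-r--r--", "manoj", "supergroup"],
    ["/ec", "0", "2017-01-31 18:57", "1969-12-31 16:00", "0",
     "NA", "0", "0", "-1", "-1", "drwxr-xr-x", "manoj", "supergroup"],
    ["/ec/EmptyECFile.txt", "3", "2017-01-31 18:57", "2017-01-31 18:57", "134217728",
     "STRIPED", "0", "0", "0", "0", "-rw-r--r--", "manoj", "supergroup"],
    ["/ec/SmallECFile.txt", "3", "2017-01-31 18:57", "2017-01-31 18:57", "134217728",
     "STRIPED", "1", "0", "0", "0", "-rw-r--r--", "manoj", "supergroup"],
]

# Assumption: fields are separated by tabs.
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


def striped_paths(text, delimiter="\t"):
    """Return the paths whose BlockType column is STRIPED (hypothetical helper)."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE
    )
    return [row["Path"] for row in reader if row["BlockType"] == "STRIPED"]


print(striped_paths(SAMPLE))
```

Directories carry "NA" in the new column, so they drop out of the filter without special handling.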
> OIV tool should make an EC file explicit
> ----------------------------------------
>
> Key: HDFS-10983
> URL: https://issues.apache.org/jira/browse/HDFS-10983
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: 3.0.0-alpha1
> Reporter: Wei-Chiu Chuang
> Assignee: Manoj Govindassamy
> Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-10983.01.patch
>
>
> The OIV tool's webhdfs interface does not indicate whether a file is striped.
> Also, it prints the file's EC policy ID as the replication factor, which is
> inconsistent with the output of a typical webhdfs call to the cluster, which
> always shows a replication factor of 0 for EC files.
> The same applies beyond webhdfs: the delimited output does not indicate
> whether a file is striped either.