[ https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723936#comment-15723936 ]
Andrew Wang commented on HDFS-11156:
------------------------------------
Sorry I didn't get a chance to review this earlier, but I have a possible
compatibility concern. As I described earlier, webhdfs compatibility is
important so we can move data from an older to a newer cluster with distcp.
distcp is also often run in "pull" mode, with a new client on the new cluster
reading from the old cluster. See these tables, which recommend running distcp
on the destination cluster:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-cfb69f75-d06f-46a2-862f-efeba959b152.1.html
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_admin_distcp_data_cluster_migrate.html
Since the new client has no fallback to GET_BLOCK_LOCATIONS,
getFileBlockLocations won't work in the new-client/old-cluster case.
[~cheersyang], [~liuml07] what do you think? I'm wondering if we should revert
this so we can handle the fallback.
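
To make the fallback concrete, here is a rough sketch of the idea, not the
actual WebHdfsFileSystem code. Transport, UnsupportedOpException and
fetchBlockLocations are hypothetical names invented for this illustration, and
how an old server really reports an unknown op is an assumption here:

```java
public class BlockLocationFallback {

    /** Hypothetical error raised when the server does not recognize an op. */
    static class UnsupportedOpException extends RuntimeException {
        UnsupportedOpException(String op) {
            super("Unsupported op: " + op);
        }
    }

    /** Hypothetical transport: sends one WebHDFS op and returns the raw response body. */
    interface Transport {
        String call(String op);
    }

    static final String NEW_OP = "GETFILEBLOCKLOCATIONS";
    static final String OLD_OP = "GET_BLOCK_LOCATIONS";

    /**
     * Try the new op first; if the server rejects it (assumed to surface as
     * UnsupportedOpException), retry with the old op.
     */
    static String fetchBlockLocations(Transport transport) {
        try {
            return transport.call(NEW_OP);
        } catch (UnsupportedOpException e) {
            // Old cluster: fall back to the op it already understands.
            return transport.call(OLD_OP);
        }
    }

    public static void main(String[] args) {
        // Simulated old cluster: only knows GET_BLOCK_LOCATIONS.
        Transport oldCluster = op -> {
            if (NEW_OP.equals(op)) {
                throw new UnsupportedOpException(op);
            }
            return "LocatedBlocks response (placeholder)";
        };
        // Simulated new cluster: accepts the new op.
        Transport newCluster = op -> "new-format response (placeholder)";

        System.out.println(fetchBlockLocations(oldCluster));
        System.out.println(fetchBlockLocations(newCluster));
    }
}
```

In the fallback path the client would still need to convert the LocatedBlocks
response into BlockLocation[] itself, since the old op returns the old format.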
> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> ----------------------------------------------------
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: webhdfs
> Affects Versions: 2.7.3
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-11156.01.patch, HDFS-11156.02.patch,
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch,
> HDFS-11156.06.patch
>
>
> The following webhdfs REST API call
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
> "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
> }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the
> *FileSystem* API,
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be
> fixed. Marked as an incompatible change because this will change the output
> of the GET_BLOCK_LOCATIONS API.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)