[ https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253955#comment-16253955 ]

Chris Douglas commented on HDFS-12681:
--------------------------------------

bq. now it can't distinguish whether it needs an RPC call, so we need to directly call fs.getFileBlockLocations?
v06 of the patch (not v05, sorry, mixed them up) would not make an RPC if the 
{{FileStatus}} included locations:
{noformat}
diff --git hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
index a8a5cfa..617cbf4 100644
--- hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
+++ hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
@@ -237,6 +236,12 @@ String getPathName(Path file) {
     if (file == null) {
       return null;
     }
+    if (file instanceof LocatedFileStatus) {
+      BlockLocation[] loc = ((LocatedFileStatus)file).getBlockLocations();
+      if (loc != null) {
+        return loc;
+      }
+    }
     return getFileBlockLocations(file.getPath(), start, len);
   }
 {noformat}

This changes the semantics for HDFS (i.e., it won't refresh locations), and the 
corresponding change to MapReduce:
{noformat}
diff --git hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
index 3e0ea25..0f0a45b 100644
--- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
+++ hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
@@ -344,11 +344,7 @@ protected FileSplit makeSplit(Path file, long start, long length,
       if (length != 0) {
         FileSystem fs = path.getFileSystem(job);
         BlockLocation[] blkLocations;
-        if (file instanceof LocatedFileStatus) {
-          blkLocations = ((LocatedFileStatus) file).getBlockLocations();
-        } else {
-          blkLocations = fs.getFileBlockLocations(file, 0, length);
-        }
+        blkLocations = fs.getFileBlockLocations(file, 0, length);
{noformat}

That change alone would have added additional RPC traffic for non-HDFS {{FileSystem}} 
implementations that rely on the type to determine whether they need locations.
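
A non-HDFS {{FileSystem}} could avoid that extra traffic by applying the same short-circuit as the {{DistributedFileSystem}} patch above inside its own {{getFileBlockLocations(FileStatus, long, long)}} override. A minimal sketch (the subclass name is hypothetical; only the {{FileStatus}}/{{BlockLocation}}/{{FileSystem}} types are from the Hadoop API):
{noformat}
// Hypothetical FileSystem subclass; mirrors the DistributedFileSystem change
// above so a status that already carries locations does not trigger a lookup.
import java.io.IOException;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;

public abstract class ExampleFileSystem extends FileSystem {
  @Override
  public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len)
      throws IOException {
    if (file instanceof LocatedFileStatus) {
      BlockLocation[] loc = ((LocatedFileStatus) file).getBlockLocations();
      if (loc != null) {
        return loc;              // locations already attached; skip the lookup
      }
    }
    return super.getFileBlockLocations(file, start, len);
  }
}
{noformat}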

{{makeQualified\[Located\]}} are internal methods that allow HDFS to lazily 
bind {{FileStatus}} fields (improving space efficiency and avoiding some 
conversions). Clients shouldn't need to call them.

We _hope_ that clients would request locations in the first RPC call, rather 
than asking for a {{FileStatus}} and then requesting its block locations.
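
For example, listing with {{FileSystem#listLocatedStatus}} (or {{listFiles}}) returns {{LocatedFileStatus}} entries whose block locations are filled in by the listing call itself, so no per-file {{getFileBlockLocations}} call is needed. A minimal sketch (the path argument is illustrative):
{noformat}
// Illustrative sketch: one listing call returns statuses with block
// locations attached, instead of getFileStatus followed by
// getFileBlockLocations per file.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListWithLocations {
  public static void main(String[] args) throws Exception {
    Path dir = new Path(args[0]);                  // e.g. a directory of input files
    FileSystem fs = dir.getFileSystem(new Configuration());

    RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(dir);
    while (it.hasNext()) {
      LocatedFileStatus stat = it.next();
      for (BlockLocation loc : stat.getBlockLocations()) {
        System.out.println(stat.getPath() + " -> " + loc);
      }
    }
  }
}
{noformat}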

> Fold HdfsLocatedFileStatus into HdfsFileStatus
> ----------------------------------------------
>
>                 Key: HDFS-12681
>                 URL: https://issues.apache.org/jira/browse/HDFS-12681
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>            Priority: Minor
>             Fix For: 3.1.0
>
>         Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, 
> HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, 
> HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, 
> HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch
>
>
> {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of 
> {{LocatedFileStatus}}. Conversion requires copying common fields and shedding 
> unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to 
> extend {{LocatedFileStatus}}.


