[ 
https://issues.apache.org/jira/browse/HADOOP-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494575
 ] 

Sameer Paranjpye commented on HADOOP-894:
-----------------------------------------

Adding 'start' and 'length' parameters to the Namenode's 'open' RPC doesn't seem
to add a lot of value. It won't be used unless we expose it through
fs.FileSystem or dfs.DistributedFileSystem, and adding an 'open and seek' kind
of call just seems like API bloat.

On the other hand, having the locations of the first few blocks of a file is
useful in many cases, in particular when a client is working with small files
or wants to read the file's header before seeking (as MR tasks processing
sequence files do). Why not just have open default to returning the first few
block locations?
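
A rough sketch of that idea, for illustration only. Every name here
(OpenPrefetchSketch, ToyNamenode, OpenResult, Block, getBlocks) and the
prefetch count of 3 are assumptions made up for this example, not the actual
HDFS classes or the contents of the attached patches. open() takes no range
arguments and returns only the leading block locations; a separate call
fetches later ranges on demand.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are the real HDFS API. */
public class OpenPrefetchSketch {

    /** Stand-in for a located block: byte offset in the file plus hosts. */
    static final class Block {
        final long offset;
        final String[] hosts;

        Block(long offset, String[] hosts) {
            this.offset = offset;
            this.hosts = hosts;
        }
    }

    /** What open() would return: file metadata plus only the leading blocks. */
    static final class OpenResult {
        final long fileLength;
        final long blockSize;
        final List<Block> firstBlocks;

        OpenResult(long fileLength, long blockSize, List<Block> firstBlocks) {
            this.fileLength = fileLength;
            this.blockSize = blockSize;
            this.firstBlocks = firstBlocks;
        }
    }

    /** Toy namenode holding the full block map of a single file. */
    static final class ToyNamenode {
        /** Assumed value for "the first few" blocks. */
        static final int DEFAULT_PREFETCH = 3;

        private final long fileLength;
        private final long blockSize;
        private final List<Block> blocks = new ArrayList<>();

        ToyNamenode(long fileLength, long blockSize) {
            this.fileLength = fileLength;
            this.blockSize = blockSize;
            for (long off = 0; off < fileLength; off += blockSize) {
                String host = "datanode-" + (off / blockSize);
                blocks.add(new Block(off, new String[] {host}));
            }
        }

        /** open() without range parameters: returns the first few blocks. */
        OpenResult open() {
            int n = Math.min(DEFAULT_PREFETCH, blocks.size());
            return new OpenResult(fileLength, blockSize,
                                  new ArrayList<>(blocks.subList(0, n)));
        }

        /** Fetch a later range of blocks by index, clamped to the file. */
        List<Block> getBlocks(int firstIndex, int count) {
            int from = Math.min(Math.max(firstIndex, 0), blocks.size());
            long end = (long) from + Math.max(count, 0);
            int to = (int) Math.min(end, (long) blocks.size());
            return new ArrayList<>(blocks.subList(from, to));
        }
    }

    public static void main(String[] args) {
        // A 10-block file: 640 bytes with a 64-byte block size.
        ToyNamenode nn = new ToyNamenode(640, 64);
        OpenResult r = nn.open();
        System.out.println("blocks returned by open: " + r.firstBlocks.size());
        // A reader that seeks past the prefetched range asks for more.
        System.out.println("blocks fetched later: " + nn.getBlocks(8, 5).size());
    }
}
```

A small file or a header read is served entirely from the open() result, so
the extra round trip only happens for readers that seek further in.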


> dfs client protocol should allow asking for parts of the block map
> ------------------------------------------------------------------
>
>                 Key: HADOOP-894
>                 URL: https://issues.apache.org/jira/browse/HADOOP-894
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Konstantin Shvachko
>         Attachments: partialBlockList.patch, partialBlockList2.patch
>
>
> I think that the HDFS client protocol should change like:
> /** The meta-data about a file that was opened. */
> class OpenFileInfo {
>   /** the info for the first block */
>   public LocatedBlockInfo getBlockInfo();
>   public long getBlockSize();
>   public long getLength();
> }
> interface ClientProtocol extends VersionedProtocol {
>   public OpenFileInfo open(String name) throws IOException;
>   /** get block info for any range of blocks */
>   public LocatedBlockInfo[] getBlockInfo(String name, int blockOffset,
>                                          int blockLength) throws IOException;
> }
> so that the client can decide how much block info to request and when. 
> Currently, when the file is opened or an error occurs, the entire block list 
> is requested and sent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
