[ 
https://issues.apache.org/jira/browse/HDFS-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

amit sehga updated HDFS-8555:
-----------------------------
    Summary: Random read support on HDFS files using Indexed Namenode feature  
(was: Random access support on HDFS files using Indexed Namenode feature)

> Random read support on HDFS files using Indexed Namenode feature
> ----------------------------------------------------------------
>
>                 Key: HDFS-8555
>                 URL: https://issues.apache.org/jira/browse/HDFS-8555
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: HDFS, hdfs-client, namenode
>    Affects Versions: 2.5.2
>         Environment: Linux
>            Reporter: amit sehga
>            Assignee: amit sehga
>             Fix For: 3.0.0
>
>   Original Estimate: 720h
>  Remaining Estimate: 720h
>
> Currently the Namenode does not provide support for random reads. Many 
> tools built on top of HDFS address the use case of exploratory BI and 
> provide SQL over HDFS, so the need of the hour is to reduce the number of 
> blocks read for a random read. 
> E.g. extracting, say, 10 lines' worth of information out of a 100 GB file 
> should read only those blocks which can potentially contain those 10 lines.
> This can be achieved by adding a per-block tagging feature to the Namenode: 
> each block written to HDFS will have tags associated with it, stored in an index.
> When accessed via the indexing feature, the Namenode will use this index 
> natively to reduce the number of blocks returned to the client.
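
The per-block tagging idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class BlockTagIndex and its methods addBlock and candidateBlocks are hypothetical names, not part of HDFS or any Namenode API, and the issue does not specify what a tag is or how the index would be stored. The sketch treats a tag as an opaque string and keeps the index in memory.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch only: this class does not exist in HDFS.
// It models an index from tag to the IDs of blocks carrying that tag.
final class BlockTagIndex {
    // tag -> IDs of blocks that may contain data for that tag
    private final Map<String, Set<Long>> blocksByTag = new HashMap<>();

    /** Records that a block was written with the given tags (assumed to be plain strings). */
    void addBlock(long blockId, String... tags) {
        for (String tag : tags) {
            blocksByTag.computeIfAbsent(tag, t -> new TreeSet<>()).add(blockId);
        }
    }

    /** Returns only the blocks that can potentially hold data for the tag; empty if none. */
    Set<Long> candidateBlocks(String tag) {
        Set<Long> ids = blocksByTag.getOrDefault(tag, Collections.emptySet());
        return Collections.unmodifiableSet(ids);
    }
}
```

In this sketch, a client asking for one tag would receive only the matching block IDs instead of every block of the file, which is the reduction in blocks read that the proposal is after. How tags would be derived from the data and kept consistent on the Namenode is left open by the issue and is not modelled here.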



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)