amit sehga created HDFS-8555:
--------------------------------

             Summary: Random access support on HDFS files using Indexed 
Namenode feature
                 Key: HDFS-8555
                 URL: https://issues.apache.org/jira/browse/HDFS-8555
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: HDFS, hdfs-client, namenode
    Affects Versions: 2.5.2
         Environment: Linux
            Reporter: amit sehga
            Assignee: amit sehga
             Fix For: 3.0.0


Currently the Namenode does not provide support for random reads. Many tools
built on top of HDFS address exploratory BI and provide SQL over HDFS, so there
is a pressing need to reduce the number of blocks read for a random read.
E.g. extracting, say, 10 lines' worth of information out of 100 GB files should
read only those blocks that can potentially contain those 10 lines.
This can be achieved by adding a per-block tagging feature in the Namenode: each
block written to HDFS will have tags associated with it, stored in an index.
When accessed via the indexing feature, the Namenode will use this native index
to reduce the number of blocks returned to the client.
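
To make the idea concrete, here is a minimal sketch of a per-block tag index,
written in Python for brevity. It is not Namenode code and does not use any
HDFS API: the class BlockTagIndex, its methods, the block IDs, the file path
and the date tags are all hypothetical names chosen for illustration. It only
shows how a lookup by tag could return fewer blocks than a plain listing.

```python
from dataclasses import dataclass, field


@dataclass
class BlockTagIndex:
    """Hypothetical in-memory index: file path -> blocks in file order, each with tags."""

    # Maps a path to a list of (block_id, tags) pairs, kept in write order.
    _blocks: dict = field(default_factory=dict)

    def add_block(self, path, block_id, tags):
        """Record a newly written block together with the tags attached to it."""
        self._blocks.setdefault(path, []).append((block_id, frozenset(tags)))

    def all_blocks(self, path):
        """Plain lookup: every block of the file, as an untagged read would need."""
        return [block_id for block_id, _ in self._blocks.get(path, [])]

    def blocks_for_tag(self, path, tag):
        """Indexed lookup: only the blocks whose tags include the requested tag."""
        return [
            block_id
            for block_id, tags in self._blocks.get(path, [])
            if tag in tags
        ]


# Assumed example data: three blocks of one file, tagged by the dates they contain.
index = BlockTagIndex()
index.add_block("/logs/events.csv", "blk_1", {"2015-06-01"})
index.add_block("/logs/events.csv", "blk_2", {"2015-06-01", "2015-06-02"})
index.add_block("/logs/events.csv", "blk_3", {"2015-06-03"})

print(index.all_blocks("/logs/events.csv"))
print(index.blocks_for_tag("/logs/events.csv", "2015-06-02"))
```

In this sketch a client asking for the tag "2015-06-02" would be handed one
block instead of three. How tags are chosen, persisted and kept consistent with
block reports is left open here, as it is in the proposal above.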



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
