Extend SequenceFile to provide MapFile function by storing index at the end of 
the file
---------------------------------------------------------------------------------------

                 Key: HADOOP-603
                 URL: http://issues.apache.org/jira/browse/HADOOP-603
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Jim Kellerman


MapFile increases the load on the name node because it creates two files (a 
data file and an index file) to provide an indexed file format. If SequenceFile 
were extended to store the index at the end of the file, only half of the files 
currently created for a map/reduce operation would be needed, reducing the load 
on the name node.
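
As a rough illustration of the idea (not the SequenceFile or MapFile on-disk 
format, and not a proposed patch), the sketch below writes sorted records, 
then a sparse index, then a fixed-size trailer holding the offset of the 
index, all in a single stream. The record layout, the names write_indexed, 
read_index and lookup, and the interval parameter are all assumptions made 
for this example.

```python
import bisect
import io
import struct

# Hypothetical trailer: 8-byte big-endian offset of the index block.
TRAILER = struct.Struct(">Q")


def write_indexed(stream, records, interval=2):
    """Write sorted (key, value) byte pairs, then the index, then the trailer."""
    index = []
    for i, (key, value) in enumerate(records):
        if i % interval == 0:
            # Sparse index: remember where every `interval`-th record starts.
            index.append((key, stream.tell()))
        stream.write(struct.pack(">II", len(key), len(value)))
        stream.write(key)
        stream.write(value)
    index_offset = stream.tell()
    stream.write(struct.pack(">I", len(index)))
    for key, offset in index:
        stream.write(struct.pack(">IQ", len(key), offset))
        stream.write(key)
    # The trailer is the last thing in the file, so a reader can find the index.
    stream.write(TRAILER.pack(index_offset))


def read_index(stream):
    """Return (index entries, offset where the data section ends)."""
    stream.seek(-TRAILER.size, io.SEEK_END)
    (index_offset,) = TRAILER.unpack(stream.read(TRAILER.size))
    stream.seek(index_offset)
    (count,) = struct.unpack(">I", stream.read(4))
    index = []
    for _ in range(count):
        key_len, offset = struct.unpack(">IQ", stream.read(12))
        index.append((stream.read(key_len), offset))
    return index, index_offset


def lookup(stream, key):
    """Seek to the closest indexed key <= key, then scan forward."""
    index, data_end = read_index(stream)
    pos = bisect.bisect_right([k for k, _ in index], key) - 1
    if pos < 0:
        return None  # key sorts before the first record
    stream.seek(index[pos][1])
    while stream.tell() < data_end:
        key_len, value_len = struct.unpack(">II", stream.read(8))
        k = stream.read(key_len)
        v = stream.read(value_len)
        if k == key:
            return v
        if k > key:
            break  # records are sorted, so the key is absent
    return None
```

With this layout a reader needs one file instead of two, at the cost of a 
seek to the end of the file before the first lookup.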

Perhaps this is why Google implemented SSTable files in this manner. SSTable 
files are functionally identical to Hadoop MapFiles; see section 4, "Building 
Blocks", of the BigTable paper: http://labs.google.com/papers/bigtable.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
