Extend SequenceFile to provide MapFile function by storing index at the end of
the file
---------------------------------------------------------------------------------------
Key: HADOOP-603
URL: http://issues.apache.org/jira/browse/HADOOP-603
Project: Hadoop
Issue Type: Improvement
Components: dfs
Reporter: Jim Kellerman
MapFile increases the load on the name node because it creates two files (data
and index) to provide an indexed file format. If SequenceFile were extended to
store the index at the end of the file, only half as many files would need to
be created for a map/reduce operation, reducing the load on the name node.
Perhaps this is why Google implemented SSTable files in this manner. (SSTable
files are functionally identical to Hadoop MapFiles; see section 4, "Building
Blocks", of the BigTable paper: http://labs.google.com/papers/bigtable.html)
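
To make the proposal concrete, here is a minimal sketch of a single file that
holds sorted records followed by a sparse index and a fixed-size trailer
pointing at that index. It is not Hadoop's SequenceFile or MapFile format; the
byte layout and the names write_indexed, read_index, lookup and index_interval
are all hypothetical and chosen only for illustration.

```python
import bisect
import io
import struct

RECORD_HEADER = struct.Struct(">II")  # key length, value length
INDEX_HEADER = struct.Struct(">IQ")   # key length, record offset
TRAILER = struct.Struct(">QI")        # index offset, index entry count


def write_indexed(stream, records, index_interval=2):
    """Write sorted (key, value) byte pairs, then the index, then the trailer."""
    index = []
    for i, (key, value) in enumerate(records):
        if i % index_interval == 0:
            # Sparse index: remember where every Nth record starts.
            index.append((key, stream.tell()))
        stream.write(RECORD_HEADER.pack(len(key), len(value)))
        stream.write(key)
        stream.write(value)
    index_offset = stream.tell()
    for key, offset in index:
        stream.write(INDEX_HEADER.pack(len(key), offset))
        stream.write(key)
    # The trailer is the last thing in the file, so a reader can find it
    # by seeking a fixed distance back from the end.
    stream.write(TRAILER.pack(index_offset, len(index)))


def read_index(stream):
    """Return (index entries, offset where the record area ends)."""
    stream.seek(-TRAILER.size, io.SEEK_END)
    index_offset, count = TRAILER.unpack(stream.read(TRAILER.size))
    stream.seek(index_offset)
    index = []
    for _ in range(count):
        key_len, offset = INDEX_HEADER.unpack(stream.read(INDEX_HEADER.size))
        index.append((stream.read(key_len), offset))
    return index, index_offset


def lookup(stream, key):
    """Find the value for key, scanning forward from the nearest index entry."""
    index, data_end = read_index(stream)
    keys = [k for k, _ in index]
    pos = bisect.bisect_right(keys, key) - 1  # last indexed key <= key
    if pos < 0:
        return None
    stream.seek(index[pos][1])
    while stream.tell() < data_end:
        key_len, value_len = RECORD_HEADER.unpack(stream.read(RECORD_HEADER.size))
        k = stream.read(key_len)
        v = stream.read(value_len)
        if k == key:
            return v
        if k > key:
            return None  # records are sorted, so the key is absent
    return None
```

With this layout a reader needs only one file: it reads the trailer, loads the
index, and seeks into the record area, which is the property that would let a
single file stand in for the data/index pair.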