Allow simplified versioning for namenode and datanode metadata.
---------------------------------------------------------------

         Key: HADOOP-224
         URL: http://issues.apache.org/jira/browse/HADOOP-224
     Project: Hadoop
        Type: Improvement

  Components: dfs  
 Environment: All
    Reporter: Milind Bhandarkar


Currently the namenode has two types of metadata: the FSImage and FSEdits. 
FSImage contains information about Inodes, and FSEdits contains a list of 
operations that have not yet been saved to FSImage. The datanode currently does 
not have any metadata, but will have some eventually. 

The file formats used for storing these metadata will evolve over time. It is 
important for the file-system to be backward compatible. That is, the metadata 
readers need to be able to identify which version of the file-format we are 
using, and need to be able to read information therein. As we add information 
to these metadata, the complexity of the reader increases dramatically.

I propose a versioning scheme with a major and minor version number, where a 
different reader class is associated with a major number, and that class 
interprets the minor number internally. The readers essentially form a chain 
starting with the latest version. Each version-reader looks at the file and, if 
it does not recognize the version number, passes the file to the next version 
reader by calling its parse method, returning the result of that parse method up 
the chain. (In the case of the namenode, the parse result is an array of Inodes.)

This scheme has an advantage that every time a new major version is added, the 
new reader only needs to know about the reader for its immediately previous 
version, and every reader needs to know only about which major version numbers 
it can read.

The writer is not so versioned, because metadata is always written in the most 
current version format.
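
The reader chain and the single-format writer described above could be sketched 
roughly as follows. This is only an illustration of the proposal, not existing 
DFS code: the names VersionedReader, ReaderV1, ReaderV2 and MetadataFile, the 
header layout (two ints), the per-entry fields, and the use of String paths as a 
stand-in for Inodes are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical base class: one subclass per major version.
abstract class VersionedReader {
    private final int major;              // major version this reader understands
    private final VersionedReader older;  // reader for the previous major version, or null

    VersionedReader(int major, VersionedReader older) {
        this.major = major;
        this.older = older;
    }

    // Parse the body if the major version matches; otherwise hand the stream
    // to the next reader in the chain and return its result unchanged.
    final String[] parse(int fileMajor, int fileMinor, DataInputStream in) throws IOException {
        if (fileMajor == major) {
            return readBody(fileMinor, in);
        }
        if (older == null) {
            throw new IOException("Unsupported metadata version " + fileMajor + "." + fileMinor);
        }
        return older.parse(fileMajor, fileMinor, in);
    }

    // Each reader interprets the minor version internally.
    protected abstract String[] readBody(int minor, DataInputStream in) throws IOException;
}

// Assumed layout for major version 1: entry count, then one UTF path per entry.
final class ReaderV1 extends VersionedReader {
    ReaderV1() { super(1, null); }

    @Override
    protected String[] readBody(int minor, DataInputStream in) throws IOException {
        String[] inodes = new String[in.readInt()];
        for (int i = 0; i < inodes.length; i++) {
            inodes[i] = in.readUTF();
        }
        return inodes;
    }
}

// Assumed layout for major version 2: as version 1, and from minor version 1
// onward each entry also carries a short (a made-up replication field).
final class ReaderV2 extends VersionedReader {
    ReaderV2() { super(2, new ReaderV1()); }  // knows only its immediate predecessor

    @Override
    protected String[] readBody(int minor, DataInputStream in) throws IOException {
        String[] inodes = new String[in.readInt()];
        for (int i = 0; i < inodes.length; i++) {
            inodes[i] = in.readUTF();
            if (minor >= 1) {
                in.readShort();  // consumed but ignored in this sketch
            }
        }
        return inodes;
    }
}

final class MetadataFile {
    static final int CURRENT_MAJOR = 2;
    static final int CURRENT_MINOR = 1;

    // Reading always starts at the head of the chain (the latest reader).
    static String[] read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        int major = in.readInt();
        int minor = in.readInt();
        return new ReaderV2().parse(major, minor, in);
    }

    // The writer is not versioned: it always emits the most current format.
    static byte[] write(String[] inodes) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(CURRENT_MAJOR);
        out.writeInt(CURRENT_MINOR);
        out.writeInt(inodes.length);
        for (String path : inodes) {
            out.writeUTF(path);
            out.writeShort(3);  // placeholder value for the made-up field
        }
        out.flush();
        return buf.toByteArray();
    }
}
```

Under these assumptions, adding a major version 3 would mean writing a ReaderV3 
whose constructor passes new ReaderV2() as its predecessor and pointing 
MetadataFile.read at it; ReaderV1 and ReaderV2 would stay untouched.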

One more change that is needed for simplified versioning is that the 
"struct-surping" of dfs.Block needs to be removed. Block's contents will change 
in later versions, and older versions should still be able to readFields 
properly. This is more general than Block of course, and in general only basic 
datatypes should be used as Writables in DFS metadata.
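
One possible reading of the point about basic datatypes is sketched below: the 
metadata reader pulls a block's fields off the stream as primitives, keyed by 
the layout version, instead of delegating to the in-memory class's own 
readFields. BlockInfo, BlockCodec, the generation field and the layout version 
numbers are hypothetical names and values chosen for the example, not the 
actual dfs.Block.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical in-memory block; its fields may grow in later releases.
final class BlockInfo {
    final long id;
    final long numBytes;
    final long generation;  // made-up field, assumed to be added in layout version 2

    BlockInfo(long id, long numBytes, long generation) {
        this.id = id;
        this.numBytes = numBytes;
        this.generation = generation;
    }
}

final class BlockCodec {
    // Always writes the newest layout, using only primitive fields.
    static void write(DataOutputStream out, BlockInfo b) throws IOException {
        out.writeLong(b.id);
        out.writeLong(b.numBytes);
        out.writeLong(b.generation);
    }

    // Reads whichever layout the file was written in. Older files simply
    // lack the newer field, so a default value is used for it.
    static BlockInfo read(int layoutVersion, DataInputStream in) throws IOException {
        long id = in.readLong();
        long numBytes = in.readLong();
        long generation = layoutVersion >= 2 ? in.readLong() : 0L;
        return new BlockInfo(id, numBytes, generation);
    }
}
```

Because the on-disk layout is spelled out in terms of longs rather than in terms 
of whatever the class currently serializes, a later change to the class would 
not silently change how old files are interpreted.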

For edits, the reader should return an array of <opcode, ArrayWritable> pairs. 
This will also remove the limitation of two operands for every opcode, and will 
be more extensible.
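
A minimal sketch of such an edits reader follows. To keep it self-contained it 
uses a plain String[] as a stand-in for ArrayWritable, and the record layout 
(one opcode byte, an int operand count, then UTF-encoded operands) as well as 
the names EditRecord and EditsReader are assumptions made for illustration, not 
the existing FSEdits format.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// One <opcode, operands> pair; String[] stands in for ArrayWritable here.
final class EditRecord {
    final byte opcode;
    final String[] operands;

    EditRecord(byte opcode, String[] operands) {
        this.opcode = opcode;
        this.operands = operands;
    }
}

final class EditsReader {
    // Reads records until end of stream and returns them in file order.
    static EditRecord[] readAll(DataInputStream in) throws IOException {
        List<EditRecord> records = new ArrayList<EditRecord>();
        while (true) {
            int opcode = in.read();  // -1 signals end of stream
            if (opcode < 0) {
                break;
            }
            // The operand count is stored per record, so an opcode is not
            // limited to a fixed number of operands.
            String[] operands = new String[in.readInt()];
            for (int i = 0; i < operands.length; i++) {
                operands[i] = in.readUTF();
            }
            records.add(new EditRecord((byte) opcode, operands));
        }
        return records.toArray(new EditRecord[0]);
    }
}
```

With the operand count carried in each record, a future opcode taking three or 
more operands would need no change to the reader itself, only to the code that 
applies the returned records.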

Even with this new versioning scheme, the last Reader in the reader-chain would 
recognize the current format, thus maintaining full backward compatibility.

