[
http://issues.apache.org/jira/browse/HADOOP-224?page=comments#action_12412072 ]
Milind Bhandarkar commented on HADOOP-224:
------------------------------------------
I am referring to exactly this kind of code:
  if (version > 3) {
    newField = in.readXXX();
  } else {
    newField = YYY;
  }
As more and more such code gets added to the load and save methods, they
become complicated to follow. This code currently lives in the FSDirectory
class, tightly coupling the directory abstraction to the format used to store
its data in the FSImage.
Instead, we could move this into a separate class, say FSImageReader, and, as
we make major changes to the file format, introduce new classes such as
FSImageReaderV2, FSImageReaderV3, etc. A member in FSDirectory will hold a
reference to the current file format reader, whose "parse" method FSDirectory
will call. If FSImageReaderV3 is the current version and it is given an image
that it cannot read, it will call FSImageReaderV2's parse method, and so on.
Since the readers will not have any real state, the parse methods will be
static, to avoid the overhead of chaining.
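A minimal sketch of the chaining described above, under stated assumptions. Only
the names FSImageReaderV2 and FSImageReaderV3 and the static "parse" method come
from this comment. ParsedImage, ImageLoader, the encode helper, and the file
layout (an int version, then a UTF path, then an int field added in version 3)
are hypothetical and only illustrate the delegation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical parse result; the issue describes the real one as an array of inodes.
final class ParsedImage {
    final int version;
    final String path;
    final int newField;

    ParsedImage(int version, String path, int newField) {
        this.version = version;
        this.path = path;
        this.newField = newField;
    }
}

// Last reader in this sketch's chain: understands only the assumed version 2 layout.
final class FSImageReaderV2 {
    static final int VERSION = 2;
    static final int DEFAULT_NEW_FIELD = 0; // stands in for "YYY" above

    static ParsedImage parse(int version, DataInputStream in) throws IOException {
        if (version != VERSION) {
            throw new IOException("Unsupported image version: " + version);
        }
        return new ParsedImage(version, in.readUTF(), DEFAULT_NEW_FIELD);
    }
}

// Current reader: handles version 3 itself and delegates anything else to V2.
final class FSImageReaderV3 {
    static final int VERSION = 3;

    static ParsedImage parse(int version, DataInputStream in) throws IOException {
        if (version != VERSION) {
            return FSImageReaderV2.parse(version, in);
        }
        String path = in.readUTF();
        int newField = in.readInt(); // field assumed to be added in version 3
        return new ParsedImage(version, path, newField);
    }
}

// Stand-in for the caller (FSDirectory in the comment above).
final class ImageLoader {
    static ParsedImage load(byte[] image) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
            int version = in.readInt(); // assumed: the version is the first int in the file
            return FSImageReaderV3.parse(version, in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for trying the sketch: writes an image in the assumed layout.
    static byte[] encode(int version, String path, int newField) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeUTF(path);
            if (version >= 3) {
                out.writeInt(newField);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With these definitions, ImageLoader.load(ImageLoader.encode(2, "/user/b", 0))
enters FSImageReaderV3.parse first and is actually parsed by FSImageReaderV2,
which fills in the default for the field that version 2 does not store. The
version-specific branches then live in one reader per format instead of
accumulating in a single load method.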
So, as a first step, I could separate loadFSImage and loadFSEdits into two
different classes.
Does this make sense?
> Allow simplified versioning for namenode and datanode metadata.
> ---------------------------------------------------------------
>
> Key: HADOOP-224
> URL: http://issues.apache.org/jira/browse/HADOOP-224
> Project: Hadoop
> Type: Improvement
> Components: dfs
> Environment: All
> Reporter: Milind Bhandarkar
>
> Currently the namenode has two types of metadata: the FSImage and FSEdits.
> FSImage contains information about Inodes, and FSEdits contains a list of
> operations that were not saved to FSImage. The datanode currently does not
> have any metadata, but will have it some day.
> The file formats used for storing these metadata will evolve over time. It is
> important for the file-system to be backward compatible. That is, the
> metadata readers need to be able to identify which version of the file-format
> we are using, and need to be able to read information therein. As we add
> information to these metadata, the complexity of the reader increases
> dramatically.
> I propose a versioning scheme with a major and minor version number, where a
> different reader class is associated with a major number, and that class
> interprets the minor number internally. The readers essentially form a chain
> starting with the latest version. Each version-reader looks at the file and
> if it does not recognize the version number, passes it to the version reader
> next to it by calling the parse method, returning the results of the parse
> method up the chain (in the case of the namenode, the parse result is an array
> of Inodes).
> This scheme has an advantage that every time a new major version is added,
> the new reader only needs to know about the reader for its immediately
> previous version, and every reader needs to know only about which major
> version numbers it can read.
> The writer is not so versioned, because metadata is always written in the
> most current version format.
> One more change that is needed for simplified versioning is that the
> "struct-surping" of dfs.Block needs to be removed. Block's contents will
> change in later versions, and older versions should still be able to
> readFields properly. This is more general than Block of course, and in
> general only basic datatypes should be used as Writables in DFS metadata.
> For edits, the reader should return an array of <opcode, ArrayWritable> pairs.
> This will also remove the limitation of two operands for every opcode, and
> will be more extensible.
> Even with this new versioning scheme, the last reader in the reader chain
> would recognize the current format, thus maintaining full backward
> compatibility.