Hey, I was curious whether anyone has proposed a good solution, or a future JIRA, for handling LayoutVersion/FSEditLog and opcode differences between versions, so that upgrades and downgrades are better supported in the future.
I noticed that this has already caused problems in at least a few cases (https://issues.apache.org/jira/browse/HDFS-1842), saw other people complaining about the security branch vs. later releases, and then found the really big discussion at https://issues.apache.org/jira/browse/HDFS-1822. That discussion seemed to focus on the validity and truth of trunk, which I'll try to leave alone.

As far as I can tell, different versions will be rolled. Ideally it's all in trunk, but what if it isn't? Yes, burning them is one option, but what if some normal person just wants to downgrade for some reason and doesn't have their rollback copy?

Has anyone thought of something like a mapper/reader that reads a log in one defined format and then maps the relevant fields to the current version? It would probably involve defining the schema (op_code + fields) for both the version you support upgrading from and the current version you want to convert to, then mapping the relevant fields and discarding the others, with a warning where that matters, e.g. dropping security settings due to a downgrade. Obviously the LayoutVersion.supports feature comes close to this, but things are starting to look fragile.

Then these types of things might go away from FSEditLogOp.java?:

    if (logVersion <= -11) {
      this.permissions = PermissionStatus.read(in);
    } else {
      this.permissions = null;
    }

I'm not an expert on the ins and outs, and it's probably a big refactor, but I was curious whether anyone had thought about this. It might ease the conversion process in general, and specific upgrades would be supported as defined in one place, as opposed to rejected by additional patches that force them not to be allowed. If the schema were documented in all releases, conversion would be simpler, or so the idea goes. Any takers, or has this idea already been kicked around?
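To make the mapper idea a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: EditLogSchemaMapper, define and convert are not HDFS classes or methods, the field names ("path", "timestamp", "permissions") are placeholders, and the layout versions -10/-11 are only borrowed from the snippet above. It assumes the op has already been decoded into a field map, so it says nothing about the on-disk format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names exist in HDFS.
class EditLogSchemaMapper {
    // layout version -> opcode -> ordered field names written by that version
    private final Map<Integer, Map<String, Set<String>>> schemas = new LinkedHashMap<>();

    // Declare the fields one layout version stores for one opcode.
    void define(int layoutVersion, String opCode, String... fields) {
        schemas.computeIfAbsent(layoutVersion, v -> new LinkedHashMap<>())
               .put(opCode, new LinkedHashSet<>(Arrays.asList(fields)));
    }

    // Look up a schema, failing loudly if the conversion is not defined.
    private Set<String> fieldsOf(int layoutVersion, String opCode) {
        Map<String, Set<String>> ops = schemas.get(layoutVersion);
        if (ops == null || !ops.containsKey(opCode)) {
            throw new IllegalArgumentException(
                "No schema for " + opCode + " at layout version " + layoutVersion);
        }
        return ops.get(opCode);
    }

    // Map one decoded op from the source schema to the target schema.
    // Fields the target lacks are dropped with a warning; fields the source
    // lacks are set to null (mirroring the permissions = null branch above).
    Map<String, Object> convert(String opCode, Map<String, Object> op,
                                int fromVersion, int toVersion, List<String> warnings) {
        Set<String> source = fieldsOf(fromVersion, opCode);
        Set<String> target = fieldsOf(toVersion, opCode);
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : target) {
            out.put(field, source.contains(field) ? op.get(field) : null);
        }
        for (String field : source) {
            if (!target.contains(field)) {
                warnings.add("Dropping " + opCode + "." + field
                    + " (not in layout version " + toVersion + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        EditLogSchemaMapper mapper = new EditLogSchemaMapper();
        mapper.define(-10, "OP_MKDIR", "path", "timestamp");
        mapper.define(-11, "OP_MKDIR", "path", "timestamp", "permissions");

        Map<String, Object> op = new LinkedHashMap<>();
        op.put("path", "/user/luke");
        op.put("timestamp", 1300000000000L);
        op.put("permissions", "luke:users:rwxr-xr-x");

        // Downgrade: permissions cannot be represented, so warn and drop them.
        List<String> warnings = new ArrayList<>();
        System.out.println(mapper.convert("OP_MKDIR", op, -11, -10, warnings));
        System.out.println(warnings);
    }
}
```

The point is just that the per-version differences live in the define calls, in one place, instead of in version checks scattered through the reader. A conversion with no schema on either side fails up front rather than partway through a log.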
If for nothing else, people learning about Hadoop have to reformat their data when they're just testing different versions, which is a shame for me :)

Luke