[ https://issues.apache.org/jira/browse/MAPREDUCE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269355#comment-13269355 ]
Jim Donofrio commented on MAPREDUCE-2001:
-----------------------------------------

I just added a simple static method, setMetadata; the metadata it sets is then passed to the SequenceFile.Writer constructor. Users would call SequenceFileOutputFormat.setMetadata in their mapper's or reducer's configure or setup method, passing the appropriate metadata object that they create.

I like your idea better, though: we add a predefined constant key to the JobConf and store the key/value pairs under it as comma-separated pairs. Then we make SequenceFileOutputFormat JobConfigurable, so that ReflectionUtils.newInstance will call configure on it and load the metadata. The user can then set metadata in their driver class, or in the mapper or reducer, and doesn't have to subclass SequenceFileOutputFormat. I think we should avoid making users subclass SequenceFileOutputFormat.

Thoughts?

> Enhancement to SequenceFileOutputFormat to allow user to set MetaData
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2001
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2001
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.20.2
>            Reporter: David Rosenstrauch
>            Priority: Minor
>         Attachments: MAPREDUCE-2001.patch
>
>
> The org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat class
> currently does not provide a way for the user to pass in a MetaData object to
> be written to the SequenceFile.
> Currently the only way for a developer to implement this functionality appears
> to be to create a subclass which overrides the SequenceFileOutputFormat's
> getRecordWriter() method, which is a bit of a kludge.
> This seems to be a common enough request to warrant a fix of some sort.
> (It's already been brought up twice in the past year:
> http://www.mail-archive.com/common-user@hadoop.apache.org/msg02198.html and
> http://www.mail-archive.com/mapreduce-user@hadoop.apache.org/msg00904.html)
> A couple of possible solutions:
> 1) Provide a static method SequenceFileOutputFormat.setMetaData(Job, MetaData)
> 2) Provide a (non-static) setMetaData() method on the
> SequenceFileOutputFormat class. The user would create a subclass of
> SequenceFileOutputFormat which, say, implements Configurable. Then in the
> setConf() method, the user could create the MetaData object (using data from
> the Configuration), and then call setMetaData. The SequenceFileOutputFormat
> would then use this MetaData object when creating the SequenceFile. (Note
> that the user would have to create a subclass of SequenceFileOutputFormat to
> make this solution work.)
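
For illustration, here is a rough sketch of the "comma-separated pairs in the JobConf" idea from the comment above. It is not taken from the attached patch. The class name MetadataConfCodec, the property name in METADATA_KEY, and the choice of '=' as the key/value separator are all assumptions made up for this example, and the code uses only the JDK so it can be run without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Sketch only: turns metadata into a single comma-separated string of
 * key=value pairs (suitable for storing under one configuration property)
 * and parses it back. Names here are hypothetical, not Hadoop API.
 */
final class MetadataConfCodec {

    // Hypothetical property name; not an existing Hadoop constant.
    static final String METADATA_KEY = "example.sequencefile.output.metadata";

    /** Joins the entries as "k1=v1,k2=v2", preserving iteration order. */
    static String encode(Map<String, String> metadata) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            requireNoDelimiters(e.getKey());
            requireNoDelimiters(e.getValue());
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Parses a string produced by encode(); null or empty yields an empty map. */
    static Map<String, String> decode(String encoded) {
        Map<String, String> metadata = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return metadata;
        }
        for (String pair : encoded.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            metadata.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return metadata;
    }

    // This sketch does no escaping, so the delimiters are simply disallowed.
    private static void requireNoDelimiters(String s) {
        if (s.indexOf(',') >= 0 || s.indexOf('=') >= 0) {
            throw new IllegalArgumentException("',' and '=' are not allowed: " + s);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("source", "clickstream");
        metadata.put("version", "2");
        String encoded = encode(metadata);
        System.out.println(METADATA_KEY + " = " + encoded);
        System.out.println(decode(encoded));
    }
}
```

In the scheme discussed above, the driver would presumably store the encoded string in the job configuration under the chosen key, and the output format's configure method would decode it and build the metadata object handed to the SequenceFile writer. A real implementation would also need an escaping rule if keys or values may contain the delimiter characters.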