Enhancement to SequenceFileOutputFormat to allow user to set MetaData
---------------------------------------------------------------------

                 Key: MAPREDUCE-2001
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2001
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
    Affects Versions: 0.20.2
            Reporter: David Rosenstrauch
            Priority: Minor


The org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat class 
currently does not provide a way for the user to pass in a MetaData object to 
be written to the SequenceFile.

Currently the only way for a developer to implement this functionality appears 
to be to create a subclass which overrides SequenceFileOutputFormat's 
getRecordWriter() method, which is a bit of a kludge.

This seems to be a common enough request to warrant a fix of some sort.  (It's 
already been brought up twice in the past year:  
http://www.mail-archive.com/[email protected]/msg02198.html and 
http://www.mail-archive.com/[email protected]/msg00904.html)


A couple of possible solutions:

1) Provide a static method SequenceFileOutputFormat.setMetaData(Job, MetaData).

2) Provide a (non-static) setMetaData() method on the SequenceFileOutputFormat 
class.  The user would create a subclass of SequenceFileOutputFormat which, 
say, implements Configurable.  Then in the setConf() method, the user could 
create the MetaData object (using data from the Configuration), and then call 
setMetaData.  The SequenceFileOutputFormat would then use this MetaData object 
when creating the SequenceFile.  (Note that the user would have to create a 
subclass of SequenceFileOutputFormat to make this solution work.)
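
A rough sketch of what option 1 would probably involve. The output format is 
instantiated on the task side, so a static setter called at job-setup time 
would presumably have to stash the metadata in the job's Configuration and 
read it back in getRecordWriter(). The sketch models only that round trip in 
plain Java: a Map<String, String> stands in for the Configuration and for the 
MetaData entries, and the class name, method bodies and property prefix are 
all made up for illustration, not existing Hadoop API.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Plain-Java model of option 1. A Map<String, String> stands in for the
 * Hadoop Configuration and for the metadata entries, so this compiles
 * without Hadoop on the classpath. All names here are hypothetical.
 */
public class MetaDataConfSketch {
    // Hypothetical property prefix; not an existing Hadoop configuration key.
    public static final String PREFIX = "sketch.seqfile.metadata.";

    /** Submit side: what a static setMetaData(Job, MetaData) might do. */
    public static void setMetaData(Map<String, String> conf,
                                   Map<String, String> metaData) {
        // Copy every metadata entry into the configuration under the prefix.
        for (Map.Entry<String, String> e : metaData.entrySet()) {
            conf.put(PREFIX + e.getKey(), e.getValue());
        }
    }

    /** Task side: what getRecordWriter() might do before opening the file. */
    public static TreeMap<String, String> getMetaData(Map<String, String> conf) {
        TreeMap<String, String> metaData = new TreeMap<>();
        // Pick out only the prefixed properties and strip the prefix again.
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                metaData.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return metaData;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("unrelated.property", "left alone");

        Map<String, String> metaData = new TreeMap<>();
        metaData.put("schema.version", "3");
        metaData.put("created.by", "nightly-export");

        setMetaData(conf, metaData);
        // Prints whether the task side recovers exactly what was set.
        System.out.println(getMetaData(conf).equals(metaData));
    }
}
```

In a real patch the recovered entries would be handed to the SequenceFile 
writer when the file is created. Option 2 avoids the encoding step but, as 
noted above, still requires the user to write a subclass.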

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
