Provide a way to open and read a side file using an existing InputFormat
------------------------------------------------------------------------

                 Key: MAPREDUCE-1130
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1130
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Pradeep Kamath


The Pig subproject needs to open a side file in order to implement map-side 
joins. In some cases the entire file must be read as a side file; in others, 
the file must be read from a particular split through the last split. To do 
this with existing InputFormats, the Pig code would have to mimic Hadoop: call 
InputFormat.getSplits(), then for each split call 
InputFormat.createRecordReader() and RecordReader.initialize(), and then call 
RecordReader.nextKeyValue() repeatedly until the end of the split is reached 
before moving on to the next split. It would be good if Hadoop provided utility 
methods for this, to read a file either in its entirety or from a given split 
to the end.
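
To make the request concrete, here is a rough sketch of the loop such a 
utility might wrap. It is not Hadoop code: Split, Reader and Format are 
hypothetical, simplified stand-ins for InputSplit, RecordReader and 
InputFormat (the real methods also take JobContext/TaskAttemptContext 
arguments, which are left out here), and readFrom, readAll and ListFormat are 
names invented for illustration only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

public class SideFileReaderSketch {

    /** Hypothetical stand-in for Hadoop's InputSplit. */
    interface Split {}

    /** Hypothetical stand-in for the new-API RecordReader (no context args). */
    interface Reader<K, V> extends AutoCloseable {
        void initialize(Split split) throws IOException;
        boolean nextKeyValue() throws IOException;
        K getCurrentKey();
        V getCurrentValue();
        @Override void close() throws IOException;
    }

    /** Hypothetical stand-in for InputFormat (no JobContext argument). */
    interface Format<K, V> {
        List<Split> getSplits() throws IOException;
        Reader<K, V> createRecordReader(Split split) throws IOException;
    }

    /** Reads every record from split index firstSplit through the last split. */
    static <K, V> long readFrom(Format<K, V> format, int firstSplit,
                                BiConsumer<? super K, ? super V> sink) throws IOException {
        List<Split> splits = format.getSplits();
        long count = 0;
        for (int i = firstSplit; i < splits.size(); i++) {
            Split split = splits.get(i);
            // One reader per split, closed before moving to the next split.
            try (Reader<K, V> reader = format.createRecordReader(split)) {
                reader.initialize(split);
                while (reader.nextKeyValue()) {
                    sink.accept(reader.getCurrentKey(), reader.getCurrentValue());
                    count++;
                }
            }
        }
        return count;
    }

    /** Reads the whole file: the same loop, starting at the first split. */
    static <K, V> long readAll(Format<K, V> format,
                               BiConsumer<? super K, ? super V> sink) throws IOException {
        return readFrom(format, 0, sink);
    }

    /** In-memory split used only to exercise the loop above. */
    static final class ListSplit implements Split {
        final int offset;          // index of the first line in this split
        final List<String> lines;  // lines belonging to this split
        ListSplit(int offset, List<String> lines) { this.offset = offset; this.lines = lines; }
    }

    /** In-memory format: cuts a list of lines into splits of perSplit lines. */
    static final class ListFormat implements Format<Integer, String> {
        private final List<String> lines;
        private final int perSplit;
        ListFormat(List<String> lines, int perSplit) { this.lines = lines; this.perSplit = perSplit; }

        @Override public List<Split> getSplits() {
            List<Split> out = new ArrayList<>();
            for (int start = 0; start < lines.size(); start += perSplit) {
                int end = Math.min(start + perSplit, lines.size());
                out.add(new ListSplit(start, lines.subList(start, end)));
            }
            return out;
        }

        @Override public Reader<Integer, String> createRecordReader(Split split) {
            return new Reader<Integer, String>() {
                private ListSplit s;
                private int pos = -1;
                @Override public void initialize(Split sp) { s = (ListSplit) sp; }
                @Override public boolean nextKeyValue() { return ++pos < s.lines.size(); }
                @Override public Integer getCurrentKey() { return s.offset + pos; }
                @Override public String getCurrentValue() { return s.lines.get(pos); }
                @Override public void close() {}
            };
        }
    }
}
```

With a real InputFormat, the same two entry points would cover both cases 
described above: readAll for reading the entire side file, and readFrom for 
reading from a particular split to the last one.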
