Mark,

The InputSplit is essentially a metadata class: use it to get the path, offset, and length of the split. Ideally, the RecordReader implementation returned by your InputFormat would wrap two RecordReaders, each instantiated from the same InputSplit metadata. The InputSplit object serves no purpose beyond that, so there should be no need to clone or copy it; just extract the information you require from the FileSplit.
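To illustrate the point about the split being only metadata, here is a minimal sketch that uses plain JDK classes rather than the Hadoop API. `SplitInfo`, `SplitLineReader`, and `TwoPassDemo` are hypothetical names standing in for FileSplit, a RecordReader, and the wrapper. Real split handling (skipping a partial first line, reading past the end to finish the last line) is deliberately left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for FileSplit: immutable metadata, no file data. */
final class SplitInfo {
    final Path path;
    final long start;
    final long length;

    SplitInfo(Path path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }
}

/** Hypothetical stand-in for a RecordReader: owns its own handle and position. */
final class SplitLineReader implements AutoCloseable {
    private final RandomAccessFile in;
    private final long end;

    SplitLineReader(SplitInfo split) throws IOException {
        in = new RandomAccessFile(split.path.toFile(), "r");
        in.seek(split.start);               // each reader starts at the split offset
        end = split.start + split.length;
    }

    /** Returns the next line, or null once the split is exhausted. */
    String next() throws IOException {
        if (in.getFilePointer() >= end) {
            return null;
        }
        return in.readLine();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

final class TwoPassDemo {
    /** Reads the same split twice using two independent readers. */
    static List<String> readTwice(SplitInfo split) throws IOException {
        List<String> seen = new ArrayList<>();
        try (SplitLineReader first = new SplitLineReader(split);
             SplitLineReader second = new SplitLineReader(split)) {
            for (String line; (line = first.next()) != null; ) {
                seen.add("1:" + line);      // first pass
            }
            for (String line; (line = second.next()) != null; ) {
                seen.add("2:" + line);      // second pass, from the start again
            }
        }
        return seen;
    }
}
```

Both readers share one small `SplitInfo` object, and each keeps its own file handle and position, which is why the second pass starts again from the split's offset. In Hadoop you would presumably build each underlying reader the same way, from the FileSplit's getPath(), getStart(), and getLength().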

On Wed, Jun 8, 2011 at 10:08 PM, Mark question <[email protected]> wrote:
> I have a question though for Harsh case... I wrote my custom inputFormat
> which will create an array of recordReaders and give them to the MapRunner.
>
> Will that mean multiple copies of the inputSplit are all in memory? or will
> there be one copy pointed by all of them .. as if they were pointers ?
>
> Thanks,
> Mark
>
> On Wed, Jun 8, 2011 at 9:13 AM, Mark question <[email protected]> wrote:
>
>> Thanks for the replies, but input doesn't have 'clone' I don't know why ...
>> so I'll have to write my custom inputFormat ... I was hoping for an easier
>> way though.
>>
>> Thank you,
>> Mark
>>
>>
>> On Wed, Jun 8, 2011 at 1:58 AM, Harsh J <[email protected]> wrote:
>>
>>> Or if that does not work for any reason (haven't tried it really), try
>>> writing your own InputFormat wrapper where in you can have direct
>>> access to the InputSplit object to do what you want to (open two
>>> record readers, and manage them separately).
>>>
>>> On Wed, Jun 8, 2011 at 1:48 PM, Stefan Wienert <[email protected]> wrote:
>>> > Try input.clone()...
>>> >
>>> > 2011/6/8 Mark question <[email protected]>:
>>> >> Hi,
>>> >>
>>> >> I'm trying to read the inputSplit over and over using following function
>>> >> in MapperRunner:
>>> >>
>>> >> @Override
>>> >> public void run(RecordReader input, OutputCollector output, Reporter
>>> >> reporter) throws IOException {
>>> >>
>>> >>     RecordReader copyInput = input;
>>> >>
>>> >>     //First read
>>> >>     while(input.next(key,value));
>>> >>
>>> >>     //Second read
>>> >>     while(copyInput.next(key,value));
>>> >> }
>>> >>
>>> >> It can clearly be seen that this won't work because both RecordReaders are
>>> >> actually the same. I'm trying to find a way for the second reader to start
>>> >> reading the split again from beginning ... How can I do that?
>>> >>
>>> >> Thanks,
>>> >> Mark
>>> >>
>>> >
>>> > --
>>> > Stefan Wienert
>>> >
>>> > http://www.wienert.cc
>>> > [email protected]
>>> >
>>> > Telefon: +495251-2026838
>>> > Mobil: +49176-40170270
>>> >
>>>
>>> --
>>> Harsh J
>>>
>>
>

--
Harsh J
