Hi,

It is hard to pick out specific lines of a text file by their global line number. Remember that the file is split according to its size (on byte boundaries), not by lines. So it is possible to keep track of line numbers inside a split, but globally for the whole file, assuming it is split among map tasks, I don't think it is possible: a task reading a later split has no way of knowing how many lines came before its first byte. I am new to Hadoop, but that is my take on it.
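To make the byte-boundary point concrete, here is a rough sketch in plain Java. It uses no Hadoop classes, and the class and method names are made up for illustration. It assumes a simplified rule where a line belongs to the split that contains its first byte, which is similar in spirit to how lines straddling a split boundary get handled, but it is not Hadoop's actual reader code. Each split can count its own lines locally; a global line number only becomes available once the counts of all earlier splits are known, which in practice would mean an extra counting pass over the file.

```java
import java.nio.charset.StandardCharsets;

public class SplitLineNumbers {

    // Counts the lines whose first byte falls inside the byte range [start, end).
    // This is all a single split can know on its own: a local count.
    static int countLinesStartingIn(byte[] data, int start, int end) {
        int count = 0;
        int limit = Math.min(end, data.length);
        for (int pos = start; pos < limit; pos++) {
            // A line starts at byte 0 or right after a newline.
            boolean lineStart = (pos == 0) || (data[pos - 1] == '\n');
            if (lineStart) {
                count++;
            }
        }
        return count;
    }

    // Given fixed-size byte splits, returns the 1-based global line number of
    // the first line owned by each split. This needs the line counts of ALL
    // earlier splits, i.e. information from outside the split itself.
    static long[] firstLineNumberPerSplit(byte[] data, int splitSize) {
        int numSplits = (data.length + splitSize - 1) / splitSize;
        long[] first = new long[numSplits];
        long running = 1;
        for (int i = 0; i < numSplits; i++) {
            first[i] = running;
            int start = i * splitSize;
            int end = Math.min(start + splitSize, data.length);
            running += countLinesStartingIn(data, start, end);
        }
        return first;
    }

    public static void main(String[] args) {
        // Five lines of different lengths, cut into 8-byte "splits".
        byte[] data = "a\nbb\nccc\ndddd\neeeee\n".getBytes(StandardCharsets.UTF_8);
        long[] first = firstLineNumberPerSplit(data, 8);
        for (int i = 0; i < first.length; i++) {
            System.out.println("split " + i + " starts at global line " + first[i]);
        }
    }
}
```

Note that a split may own no lines at all (the last one in this toy input), which is another reason the byte offset alone says nothing about line numbers.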
Alexandra

On Wed, May 18, 2011 at 2:41 PM, bnonymous <[email protected]> wrote:
>
> Hello,
>
> I'm trying to pick up certain lines of a text file. (say 1st, 110th line of
> a file with 10^10 lines). I need a InputFormat which gives the Mapper line
> number as the key.
>
> I tried to implement RecordReader, but I can't get line information from
> InputSplit.
>
> Any solution to this???
>
> Thanks in advance!!!!!!!
> --
> View this message in context:
> http://old.nabble.com/current-line-number-as-key--tp31649694p31649694.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
