Re: Suggestion for InputSplit and InputFormat - Split every line.

2012-03-16 Thread Deepak Nettem
Anil, thanks for your suggestion. The NLineInputFormat code actually helped. In case anybody has the same problem, here's a custom OneLineInputFormat (it splits the file so that each split contains exactly one line) you can use: public class OneLineInputFormat extends FileInputFormat { @Ove

RE: Suggestion for InputSplit and InputFormat - Split every line.

2012-03-15 Thread Vanessa van Gelder
gupta Sent: Friday, March 16, 2012 3:39 AM To: common-user@hadoop.apache.org Subject: Re: Suggestion for InputSplit and InputFormat - Split every line. Have a look at the NLineInputFormat class in Hadoop. It is built to split the input on the basis of the number of lines. On Thu, Mar 15, 2012 at 6:13 PM

Re: Suggestion for InputSplit and InputFormat - Split every line.

2012-03-15 Thread anil gupta
Have a look at the NLineInputFormat class in Hadoop. It is built to split the input on the basis of the number of lines. On Thu, Mar 15, 2012 at 6:13 PM, Deepak Nettem wrote: > Hi, > > I have this use case - I need to spawn as many mappers as the number of > lines in a file in HDFS. This file isn't big (
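
For anyone reading the archive later: the idea behind splitting on a line count can be sketched in plain Java, without any Hadoop dependency. This is only an illustration of the technique (walk the bytes, cut a split after every N newlines, record each split as a start offset plus a length); it is not the NLineInputFormat source and not the OneLineInputFormat posted in this thread. The names LineSplitSketch, Split and splitEveryNLines are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: shows how a file's bytes can be
// carved into (start, length) ranges that each hold a fixed number of lines.
final class LineSplitSketch {

    // Byte range [start, start + length) covering one group of lines.
    static final class Split {
        final long start;
        final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    // Returns one Split per group of linesPerSplit lines. Lines are assumed
    // to end in '\n'; a final line without a newline still gets a split.
    static List<Split> splitEveryNLines(byte[] data, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        List<Split> splits = new ArrayList<>();
        long begin = 0;          // offset where the current split starts
        int linesInCurrent = 0;  // complete lines seen since 'begin'
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                linesInCurrent++;
                if (linesInCurrent == linesPerSplit) {
                    // Close the split just after this newline.
                    splits.add(new Split(begin, i + 1 - begin));
                    begin = i + 1;
                    linesInCurrent = 0;
                }
            }
        }
        // Leftover lines (fewer than linesPerSplit, or an unterminated line).
        if (begin < data.length) {
            splits.add(new Split(begin, data.length - begin));
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[] data = "a\nbb\nccc\n".getBytes(StandardCharsets.UTF_8);
        for (Split s : splitEveryNLines(data, 1)) {
            System.out.println("start=" + s.start + " length=" + s.length);
        }
    }
}
```

With linesPerSplit set to 1, every line becomes its own split, which corresponds to the one-mapper-per-line behaviour asked for in the original question. In a real job the ranges would be wrapped in Hadoop's own split objects rather than this toy Split class.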