Anil,
Thanks for your suggestion. The NLineInputFormat code actually helped.
In case anybody has the same problem, here's a custom OneLineInputFormat
(that splits the file such that each split contains only one line) you can
use:
public class OneLineInputFormat extends FileInputFormat
{
@Ove
gupta
Sent: Friday, March 16, 2012 3:39 AM
To: common-user@hadoop.apache.org
Subject: Re: Suggestion for InputSplit and InputFormat - Split every
line.
Have a look at the NLineInputFormat class in Hadoop. It is built to split the
input on the basis of the number of lines.
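
To make the idea of splitting on line count concrete, here is a rough
plain-Java sketch of the boundary computation such an input format needs.
It is not Hadoop's implementation and not the code posted in this thread:
the class LineSplitSketch, the Split type (a stand-in for a Hadoop
FileSplit byte range) and the computeSplits method are all hypothetical
names made up for illustration. With linesPerSplit set to 1, every line
gets its own split, and therefore its own mapper.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineSplitSketch {

    /** Hypothetical stand-in for a Hadoop FileSplit: a byte range in one file. */
    static final class Split {
        final long start;
        final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Groups the lines of data into splits of at most linesPerSplit lines. */
    static List<Split> computeSplits(byte[] data, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        List<Split> splits = new ArrayList<>();
        long begin = 0; // byte offset where the current split starts
        int lines = 0;  // complete lines seen in the current split
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                lines++;
                if (lines == linesPerSplit) {
                    // Close the split just after the newline.
                    splits.add(new Split(begin, i + 1 - begin));
                    begin = i + 1;
                    lines = 0;
                }
            }
        }
        // Leftover bytes: a partial group or a last line without a newline.
        if (begin < data.length) {
            splits.add(new Split(begin, data.length - begin));
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.US_ASCII);
        // One line per split; prints 0+6, 6+5 and 11+6 on separate lines.
        for (Split s : computeSplits(data, 1)) {
            System.out.println(s.start + "+" + s.length);
        }
    }
}
```

In a real job each of these byte ranges would be wrapped in a split object
that also carries the file path, and the record reader would read only the
lines inside its range.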
On Thu, Mar 15, 2012 at 6:13 PM, Deepak Nettem wrote:
> Hi,
>
> I have this use case - I need to spawn as many mappers as the number of
> lines in a file in HDFS. This file isn't big (