Re: NLine Input Format

Rahul Tenany Wed, 19 Nov 2008 20:14:44 -0800

Hi Amareshwari,

I am setting the input format to NLineInputFormat in my ToolRunner.run()
method, and in the same method I am setting the
mapred.line.input.format.linespermap property. Will that not work? How can I
override LineRecordReader so that it returns N lines as the value?


Thanks
Rahul
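
The grouping asked about above can be sketched outside Hadoop first. The
class below is a hypothetical helper (NLineGrouper, nextRecord and readAll are
names invented for this illustration, not Hadoop APIs). It only shows the
core loop: keep reading single lines, as LineRecordReader does, and join up
to N of them into one value. In a real job this logic would presumably live in
the next() method of a custom RecordReader that wraps LineRecordReader; that
wiring is an assumption and is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: turns a stream of single lines
// into records of up to n lines each.
class NLineGrouper {
    private final BufferedReader in;
    private final int n;

    NLineGrouper(BufferedReader in, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.in = in;
        this.n = n;
    }

    // Returns the next record (up to n lines joined by '\n'),
    // or null once the input is exhausted.
    String nextRecord() throws IOException {
        StringBuilder value = new StringBuilder();
        int count = 0;
        String line;
        while (count < n && (line = in.readLine()) != null) {
            if (count > 0) {
                value.append('\n');
            }
            value.append(line);
            count++;
        }
        return count == 0 ? null : value.toString();
    }

    // Convenience wrapper: groups all lines of an in-memory string.
    static List<String> readAll(String text, int n) {
        NLineGrouper grouper =
            new NLineGrouper(new BufferedReader(new StringReader(text)), n);
        List<String> records = new ArrayList<>();
        try {
            String record;
            while ((record = grouper.nextRecord()) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Five input lines grouped two at a time; the last record is short.
        for (String record : readAll("5\n3\n8\n1\n9", 2)) {
            System.out.println("[" + record.replace("\n", ", ") + "]");
        }
    }
}
```

Each returned record can then be split on '\n' and parsed into whatever
comparable type the tree holds, so the grouping step does not need to know
about integers.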

On Mon, Nov 17, 2008 at 9:43 AM, Amareshwari Sriramadasu <
[EMAIL PROTECTED]> wrote:

> Hi Rahul,
>
> How did you set the configuration "mapred.line.input.format.linespermap"
> and your input format? You have to set them in hadoop-site.xml or pass them
> through -D option to the job.
> NLineInputFormat will split N lines of input as one split. So, each map
> gets N lines.
> But the RecordReader is still LineRecordReader, which reads one line at
> a time, so the Key is the offset in the file and the Value is the line.
> If you want N lines as Key, you may have to override LineRecordReader.
>
> Thanks
> Amareshwari
>
>
> Rahul Tenany wrote:
>
>> Hi,
>>
>> I am writing a binary search tree on Hadoop, and for that I need to use
>> NLineInputFormat. I'll read n lines at a time, convert the numbers in
>> each line from string to int, and then insert them into the binary tree.
>> Once the binary tree is built, I'll search for elements in it. But even
>> if I set the input format to NLineInputFormat and set
>> mapred.line.input.format.linespermap to 10, I am able to read only 1
>> line at a time. Any idea where I am going wrong? How can I find out
>> whether NLineInputFormat is working or not?
>>
>> I want my program to work for any object that is comparable, not just
>> integers, so is there any way I can read N objects at a time?
>>
>> I am completely stuck. Any help will be appreciated.
>>
>> Thanks
>> Rahul
>>
>>
>>
>
>
