Re: num of reducer

2012-02-17 Thread Thamizhannal Paramasivam
It worked for me. Thanks a lot, Bejoy. Thanks, Thamizh On Fri, Feb 17, 2012 at 3:08 PM, Bejoy Ks wrote: > Hi Tamizh > MultiFileInputFormat / CombineFileInputFormat is typically used > where the input files are relatively small (typically less than a block > size). When you use these, there i

Re: num of reducer

2012-02-17 Thread Bejoy Ks
Hi Tamizh, MultiFileInputFormat / CombineFileInputFormat is typically used where the input files are relatively small (typically smaller than one block). When you use these, there is some loss of data locality, because the splits a mapper processes won't all be on the same node. TextInputFo

Re: num of reducer

2012-02-16 Thread Thamizhannal Paramasivam
Thank you so much, Joey and Bejoy, for your suggestions. The job's input path has 1300-1400 text files, each 100-200MB. I thought TextInputFormat spawns a single mapper per file, and MultiFileInputFormat spawns fewer mappers (fewer than 1300-1400), each of which processes many input files. Which input f

Re: num of reducer

2012-02-16 Thread Joey Echeverria
Is your data size 100-200MB *total*? If so, then this is the expected behavior for MultiFileInputFormat. As Bejoy says, you can switch to TextInputFormat to get one mapper per block (min one mapper per file). -Joey On Thu, Feb 16, 2012 at 11:03 AM, Thamizhannal Paramasivam < thamizhanna...@gmail
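A rough sketch of the "one mapper per block, min one mapper per file" rule mentioned above, for estimating how many map tasks TextInputFormat would produce. This is only back-of-the-envelope arithmetic, not Hadoop's actual split computation (which also takes minimum split size and other settings into account). The 64 MB block size, the function name estimate_map_tasks, and the 1300 files of 150 MB are assumptions chosen to resemble the numbers in this thread.

```python
MB = 1024 * 1024


def estimate_map_tasks(file_sizes, block_size=64 * MB):
    """Approximate mapper count: one per block, at least one per file.

    block_size defaults to 64 MB, which is an assumption; use the
    block size actually configured on the cluster.
    """
    total = 0
    for size in file_sizes:
        # Ceiling division with integers: blocks needed to hold the file.
        blocks = (size + block_size - 1) // block_size
        # Even an empty or tiny file is counted as one map task.
        total += max(1, blocks)
    return total


# Hypothetical input resembling the thread: 1300 files of 150 MB each.
files = [150 * MB] * 1300
print(estimate_map_tasks(files))
```

Under these assumptions each 150 MB file spans three 64 MB blocks, so the estimate is about three mappers per file, well above the one-mapper-per-file figure.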

Re: num of reducer

2012-02-16 Thread bejoy . hadoop
2012 21:33:11 To: Reply-To: mapreduce-user@hadoop.apache.org Subject: Re: num of reducer Here is the input format for the mapper. Input Format: MultiFileInputFormat MapperOutputKey: Text MapperOutputValue: CustomWritable I am not in a position to upgrade from hadoop-0.19.2, for certain reasons. I

Re: num of reducer

2012-02-16 Thread Thamizhannal Paramasivam
Here is the input format for the mapper. Input Format: MultiFileInputFormat MapperOutputKey: Text MapperOutputValue: CustomWritable I am not in a position to upgrade from hadoop-0.19.2, for certain reasons. I checked the number of mappers on the JobTracker. Thanks, Thamizh On Thu, Feb 16, 2012 at 6

Re: num of reducer

2012-02-16 Thread Joey Echeverria
Hi Tamil, I'd recommend upgrading to a newer release, as 0.19.2 is very old. As for your question, most input formats should set the number of mappers correctly. What input format are you using? Where did you see the number of tasks it assigned to the job? -Joey On Thu, Feb 16, 2012 at 1:40 AM, Tham