Re: HCatInputFormat combine splits

Ankit Bhatnagar Thu, 14 May 2015 11:11:05 -0700

Try these:
mapred.max.split.size=
mapred.min.split.size=
mapreduce.input.fileinputformat.split.maxsize=
mapreduce.input.fileinputformat.split.minsize=
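
For context on how these min/max settings interact, here is a rough sketch of the split-size rule used by Hadoop's stock FileInputFormat, which is max(minSize, min(maxSize, blockSize)). It is a simplified model, not the real implementation: the function names are made up for illustration, it ignores the slop allowance on a file's last split and non-splittable files, and it assumes that the input format behind HCatInputFormat plans splits file by file in the same way, which may not hold for every storage format.

```python
MiB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Model of FileInputFormat's rule: max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_sizes, block_size, min_size, max_size):
    """Count splits when every non-empty file is split independently.

    Hypothetical helper: files are never merged, so each file yields at
    least one split no matter how large the target split size is.
    """
    split_size = compute_split_size(block_size, min_size, max_size)
    total = 0
    for size in file_sizes:
        # Ceiling division: a file smaller than split_size still gets one split.
        total += -(-size // split_size)
    return total


# 1000 small files of 1 MiB each, 128 MiB blocks, max split size of 64 MiB.
small_files = [1 * MiB] * 1000
print(count_splits(small_files, 128 * MiB, 1, 64 * MiB))

# A single 256 MiB file under the same settings.
print(count_splits([256 * MiB], 128 * MiB, 1, 64 * MiB))
```

Under this model, a maximum split size only divides large files further. It never merges small files, so with many small files the split count stays at one per file whatever the maximum is set to.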




On Thursday, May 14, 2015 11:04 AM, Pradeep Gollakota <pradeep...@gmail.com> wrote:
   

Setting the following property has had no effect:
mapreduce.input.fileinputformat.split.maxsize = 67108864
I'm still getting 1 Mapper per file.
On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar <ank...@yahoo-inc.com> wrote:

You can explicitly set the split size.


On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota <pradeep...@gmail.com> wrote:
   

Hi All,
I'm writing an MR job to read data using HCatInputFormat... however, the job is 
generating too many splits. I don't have this problem when running queries in 
Hive since it combines splits by default.
Is there an equivalent in MR so that I'm not generating thousands of mappers?
Thanks,
Pradeep
