Re: HCatInputFormat combine splits

2015-05-14 Thread Pradeep Gollakota
Still no effect. I set minsize to 32M and maxsize to 64M.

On Thu, May 14, 2015 at 11:07 AM, Ankit Bhatnagar wrote:
> try these
> mapred.max.split.size=
> mapred.min.split.size=
>
> mapreduce.input.fileinputformat.split.maxsize=
> mapreduce.input.fileinputformat.split.minsize=
>
> On Thurs

Re: HCatInputFormat combine splits

2015-05-14 Thread Ankit Bhatnagar
try these
mapred.max.split.size=
mapred.min.split.size=

mapreduce.input.fileinputformat.split.maxsize=
mapreduce.input.fileinputformat.split.minsize=

On Thursday, May 14, 2015 11:04 AM, Pradeep Gollakota wrote:
The following property has had no effect. mapreduce.input.filei

Re: HCatInputFormat combine splits

2015-05-14 Thread Pradeep Gollakota
The following property has had no effect:

mapreduce.input.fileinputformat.split.maxsize = 67108864

I'm still getting 1 Mapper per file.

On Thu, May 14, 2015 at 10:27 AM, Ankit Bhatnagar wrote:
> you can explicitly set the split size
>
> On Wednesday, May 13, 2015 11:37 PM, Pradeep Go
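
One possible reading of the "1 Mapper per file" result above: in Hadoop's FileInputFormat, the min/max split size settings only decide how a single file is cut up, and a split never spans two files. The sketch below is a simplified, standalone model of that arithmetic (split size = max(minSize, min(maxSize, blockSize)), with a 10% slack on the last chunk). It does not call Hadoop; the class name SplitModel, the 1000 files of 4 MB each, and the 128 MB block size are hypothetical values chosen for illustration. Whether the format underneath HCatInputFormat follows exactly this logic is an assumption.

```java
import java.util.Arrays;

// Simplified model of per-file split counting, patterned on FileInputFormat.
// Illustration only: names and input sizes are hypothetical.
public class SplitModel {
    // The last chunk of a file may be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Effective split size given the block size and the configured bounds.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits produced for one file; never fewer than one.
    static long splitsForFile(long length, long splitSize) {
        if (length == 0) {
            return 1; // modelled as a single empty split
        }
        long splits = 0;
        long remaining = length;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // trailing chunk
        }
        return splits;
    }

    // Total splits across all files; each file is handled independently.
    static long countSplits(long[] fileSizes, long blockSize, long minSize, long maxSize) {
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long total = 0;
        for (long length : fileSizes) {
            total += splitsForFile(length, splitSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] files = new long[1000];   // hypothetical: 1000 small files
        Arrays.fill(files, 4 * mb);      // hypothetical: 4 MB each

        // maxsize = 64M only (67108864 bytes, as in the message above)
        System.out.println(countSplits(files, 128 * mb, 1, 64 * mb));
        // minsize = 32M and maxsize = 64M
        System.out.println(countSplits(files, 128 * mb, 32 * mb, 64 * mb));
    }
}
```

Under this model both calls report one split per file, because raising or lowering the bounds cannot merge files. Packing several small files into one split would need a combining input format (Hadoop ships CombineFileInputFormat for plain files); whether and how that can be wired into HCatInputFormat is not established in this thread.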

Re: HCatInputFormat combine splits

2015-05-14 Thread Ankit Bhatnagar
you can explicitly set the split size

On Wednesday, May 13, 2015 11:37 PM, Pradeep Gollakota wrote:
Hi All, I'm writing an MR job to read data using HCatInputFormat... however, the job is generating too many splits. I don't have this problem when running queries in Hive since it c

HCatInputFormat combine splits

2015-05-13 Thread Pradeep Gollakota
Hi All,

I'm writing an MR job to read data using HCatInputFormat... however, the job is generating too many splits. I don't have this problem when running queries in Hive since it combines splits by default. Is there an equivalent in MR so that I'm not generating thousands of mappers?

Thanks,
Pr