Thanks All!

On 11 Jul 2012 19:07, "Bejoy KS" <bejoy.had...@gmail.com> wrote:
> Hi Manoj
>
> Block size is at the HDFS storage level, whereas split size is the amount
> of data processed by each mapper while running a MapReduce job (one split
> is the data processed by one mapper). One or more HDFS blocks can
> contribute to a split. Splits are determined by the InputFormat as well as
> the min and max split size properties.
>
> As Arun mentioned, use CombineFileInputFormat and adjust the min and max
> split size properties to control/limit the number of mappers.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
> ------------------------------
> *From:* Manoj Babu <manoj...@gmail.com>
> *Date:* Wed, 11 Jul 2012 18:17:41 +0530
> *To:* <mapreduce-user@hadoop.apache.org>
> *Reply-To:* mapreduce-user@hadoop.apache.org
> *Subject:* Re: Mapper basic question
>
> Hi Tariq / Arun,
>
> The no. of blocks (splits) = *total file size / HDFS block size ×
> replication value*
> The no. of splits is again nothing but the blocks here.
>
> Other than increasing the block size (input split size), is it possible
> to limit the no. of mappers?
>
> Cheers!
> Manoj.
>
> On Wed, Jul 11, 2012 at 6:06 PM, Arun C Murthy <a...@hortonworks.com> wrote:
>
>> Take a look at CombineFileInputFormat - this will create 'meta splits'
>> which include multiple small splits, thus reducing the number of maps
>> that are run.
>>
>> Arun
>>
>> On Jul 11, 2012, at 5:29 AM, Manoj Babu wrote:
>>
>> Hi,
>>
>> The number of mappers depends on the number of blocks. Is it possible to
>> limit the number of mappers without increasing the HDFS block size?
>>
>> Thanks in advance.
>>
>> Cheers!
>> Manoj.
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
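
For reference, a minimal sketch of the CombineFileInputFormat approach Arun
and Bejoy describe. It assumes a Hadoop 2.x (MRv2) release, where
CombineTextInputFormat is shipped as a ready-made concrete subclass; on
older releases you would subclass CombineFileInputFormat yourself and
supply a RecordReader. The 256 MB cap and the class name CombineSplitsJob
are illustrative values, not from the thread.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CombineSplitsJob {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "combine-splits-example");
    job.setJarByClass(CombineSplitsJob.class);

    // Pack multiple HDFS blocks / small files into each split,
    // so fewer mappers run (one mapper still handles one split).
    job.setInputFormatClass(CombineTextInputFormat.class);

    // Cap each combined split at 256 MB (illustrative value). This sets
    // mapreduce.input.fileinputformat.split.maxsize, which
    // CombineFileInputFormat honours when packing blocks into splits.
    CombineTextInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);

    CombineTextInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // ... set mapper, reducer and key/value classes as usual ...

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

On older releases the equivalent property is mapred.max.split.size. Note
also that plain FileInputFormat computes its split size as
max(minSize, min(maxSize, blockSize)), so raising the min split size above
the block size is the other way to get splits (and hence mappers) that
span more than one block.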