Thanks, Viji. I am a little confused: when the data is this small, why would there be 2 tasks? A minimum of 2 would be used if it were needed, but in this case it is not needed because the data is so small, so why would 2 map tasks execute? Since the input results in 1 block with 5 lines of data in it, I am assuming this results in 5 map computations, 1 per line, all of them in 1 process/node since I am using a pseudo-distributed VM. Where is the second task coming from? The 5 map computations, one on each line, make up 1 task. Is this right? Please help. Thanks
________________________________
From: Viji R <[email protected]>
To: [email protected]; Sai Sai <[email protected]>
Sent: Thursday, 26 September 2013 5:09 PM
Subject: Re: 2 Map tasks running for a small input file

Hi,

The default number of map tasks is 2. You can set mapred.map.tasks to 1 to avoid this.

Regards,
Viji

On Thu, Sep 26, 2013 at 4:28 PM, Sai Sai <[email protected]> wrote:
> Hi
> Here is the input file for the wordcount job:
> ******************
> Hi This is a simple test.
> Hi Hadoop how r u.
> Hello Hello.
> Hi Hi.
> Hadoop Hadoop Welcome.
> ******************
>
> After running the wordcount successfully,
> here is the counters info:
>
> ***************
> Job Counters SLOTS_MILLIS_MAPS 0 0 8,386
> Launched reduce tasks 0 0 1
> Total time spent by all reduces waiting after reserving slots (ms) 0 0 0
> Total time spent by all maps waiting after reserving slots (ms) 0 0 0
> Launched map tasks 0 0 2
> Data-local map tasks 0 0 2
> SLOTS_MILLIS_REDUCES 0 0 9,199
> ***************
> My question: why are there 2 launched map tasks when I have only a small file?
> Per my understanding it is only 1 block
> and should be only 1 split.
> Then for each line a map computation should occur,
> but it shows 2 map tasks.
> Please let me know.
> Thanks
> Sai
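
For anyone following the thread, here is a rough sketch of how a map-task hint of 2 could turn a tiny single-block file into 2 splits. It assumes the job goes through the old org.apache.hadoop.mapred FileInputFormat, which divides the total input size by the hint to get a goal split size; jobs written against the newer org.apache.hadoop.mapreduce API size their splits differently and may not behave this way. The function names, the 88-byte file size, and the 64 MB block size are illustrative assumptions, not values taken from the job above.

```python
# Sketch of old-API style split sizing (assumed behaviour, simplified).
SPLIT_SLOP = 1.1  # a final chunk up to 10% larger than split_size is not cut again


def compute_split_size(goal_size, min_size, block_size):
    # The split size is capped by the block size and floored by the minimum size.
    return max(min_size, min(goal_size, block_size))


def count_splits(file_size, num_map_hint, block_size=64 * 1024 * 1024, min_size=1):
    """Count the splits for one non-empty file, given the map-task hint."""
    if file_size <= 0:
        raise ValueError("this sketch only covers non-empty files")
    # The hint (mapred.map.tasks in the thread) sets the goal size per split.
    goal_size = file_size // max(num_map_hint, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)

    remaining = file_size
    splits = 0
    # Cut full-size splits while clearly more than one split's worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left over becomes the last split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    assumed_file_size = 88  # hypothetical size in bytes for a 5-line file
    print(count_splits(assumed_file_size, num_map_hint=2))  # 2
    print(count_splits(assumed_file_size, num_map_hint=1))  # 1
```

Under these assumptions the hint of 2 halves the goal size to 44 bytes, so the 88-byte file is cut into 2 splits and 2 map tasks are launched even though the file occupies a single block. With a line-oriented input format, each task then calls map once per line in its own split. With the hint set to 1, the whole file stays in a single split.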
