number of map tasks on yarn

2014-04-01 Thread Libo Yu
Hi all, I am using mostly default YARN settings to run a word count example on a 3-node cluster. Here are my settings:
yarn.nodemanager.resource.memory-mb 8192
yarn.scheduler.minimum-allocation-mb 1024
yarn.scheduler.maximum-allocation-vcores 32
I would expect to see 8192/1024 * 3 = 24 map

Re: number of map tasks on yarn

2014-04-01 Thread Stanley Shi
The number of map tasks is not decided by the resources you request. It's decided by something else.
Regards, *Stanley Shi,*
On Wed, Apr 2, 2014 at 9:08 AM, Libo Yu yu_l...@hotmail.com wrote: Hi all, I pretty much use the default yarn setting to run a word count example on a 3 node cluster. Here are

Re: number of map tasks on yarn

2014-04-01 Thread Wangda Tan
More specifically, the number of map tasks for each job depends on InputFormat.getSplits(...). The number of map tasks is the same as the number of splits returned by InputFormat.getSplits(...). You can read the source code of FileInputFormat to understand this better. Regards, Wangda Tan
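To make the split arithmetic concrete, here is a rough Python model of how FileInputFormat sizes and counts splits for a single splittable file. It is a sketch from my reading of the Hadoop source, not the actual implementation: the function names are my own, and it ignores non-splittable (e.g. gzip-compressed) files, which always yield one split.

```python
# Hypothetical helper names; the formulas mirror my reading of
# FileInputFormat (new mapreduce API), not a verbatim port.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size, max_size):
    """Split size is the block size clamped into [min_size, max_size]."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, split_size):
    """Number of splits (and therefore map tasks) for one splittable file."""
    if file_length == 0:
        return 1  # assumption: an empty file still produces one empty split
    splits = 0
    remaining = file_length
    # Carve off full-size splits while more than SPLIT_SLOP splits remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining != 0:
        splits += 1  # the tail becomes the final split
    return splits


MB = 1024 * 1024
split_size = compute_split_size(128 * MB, 1, 2**63 - 1)
print(count_splits(1024 * MB, split_size))  # a 1 GB file with 128 MB blocks
```

With a 128 MB block size and default min/max split sizes, the number of map tasks follows the input size and file count, regardless of how much memory the NodeManagers advertise.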

Re: number of map tasks on yarn

2014-04-01 Thread Mingjiang Shi
+1 for Wangda's comment. My 2 cents: There are two aspects to the problem: 1. How many map tasks a job has. 2. How many map tasks can run concurrently. For #1, see Wangda's comments. For #2, it depends on the cluster resources. In your case, the cluster will only be able to run 24 map tasks
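The figure of 24 from the original question can be reproduced with a back-of-the-envelope calculation. This is only a sketch under simplifying assumptions: every map container is exactly yarn.scheduler.minimum-allocation-mb in size, memory is the only limiting resource, and the container used by the MapReduce ApplicationMaster is ignored (so the real number of concurrent maps will usually be somewhat lower). The function name is hypothetical.

```python
def max_concurrent_containers(node_memory_mb, container_mb, num_nodes):
    """Upper bound on containers the cluster can run at the same time.

    Assumes memory is the only constraint and all containers are the same size.
    """
    per_node = node_memory_mb // container_mb  # whole containers per NodeManager
    return per_node * num_nodes


# Settings from the original question:
# yarn.nodemanager.resource.memory-mb = 8192
# yarn.scheduler.minimum-allocation-mb = 1024
# 3 nodes
print(max_concurrent_containers(8192, 1024, 3))
```

This bounds how many tasks can run at once. It does not change how many map tasks the job has in total.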