On Tue, Jun 24, 2008 at 10:31 PM, Amar Kamat <[EMAIL PROTECTED]> wrote:
> Xuan Dzung Doan wrote:
>> The level of parallelism of a job, with respect to mappers, is largely the
>> number of map tasks spawned, which is equal to the number of InputSplits.
>> But within each InputSplit, there may be many records (many input key-value
>> pairs), each is processed by one separate call to the map() method. So are
>> these calls within one single map task also executed in parallel by the
>> framework?
>
> Afaik no.

This might be a bit misunderstood. Each task node runs several map tasks at the same time, and each of these could be considered a "single map task executed in parallel". So you definitely have more than one map task running at once, even per task node. But within each map task you still get many calls to map(), one per record in that task's InputSplit.
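
To make the two levels concrete, here is a rough sketch in plain Java. It is not Hadoop code: MapTaskSketch, runMapTask, runJob and the "slots" parameter are names I made up for illustration, and a thread pool stands in for the task slots on a node. The only point is that tasks run side by side while each task loops over its own records one call at a time, as per Amar's answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the framework; none of these names are Hadoop APIs.
class MapTaskSketch {

    // Stand-in for the user's map() function: called once per record.
    static String map(String record) {
        return record.toUpperCase();
    }

    // One "map task": walks its split and calls map() for each record in turn.
    static List<String> runMapTask(List<String> split) {
        List<String> output = new ArrayList<>();
        for (String record : split) {
            output.add(map(record)); // sequential within the task
        }
        return output;
    }

    // The "job": one task per split, with up to `slots` tasks running in parallel.
    static List<List<String>> runJob(List<List<String>> splits, int slots) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> split : splits) {
                Callable<List<String>> task = () -> runMapTask(split);
                futures.add(pool.submit(task));
            }
            // Collect results in split order, waiting for each task to finish.
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d", "e"));
        System.out.println(runJob(splits, 2)); // prints [[A, B], [C, D, E]]
    }
}
```

Here the two splits are handled by two tasks at the same time, but the records "c", "d", "e" in the second split are still mapped one after the other inside that task.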
