You are confusing map and reduce tasks with MapReduce jobs. Your Pig script 
resulted in a single MapReduce job. The number of map tasks was 2 because of 
the input size (roughly one map task per input split); it has little to do 
with the actual operators you used.
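
For example (just a sketch using standard Hadoop/Pig properties, nothing specific
to your script, and on some Pig versions you may have to pass these as -D options
on the pig command line instead of SET), you can change the map count for the very
same job by forcing smaller input splits, assuming the input is big enough to be
split at all:

SET pig.splitCombination false;     -- don't let Pig recombine small splits
SET mapred.max.split.size 1048576;  -- cap each input split at roughly 1 MB

my_raw = LOAD './houred-small' USING PigStorage('\t') AS (user, hour, query);
part1 = FILTER my_raw BY hour > 11;
part2 = FILTER my_raw BY hour < 13;
result = COGROUP part1 BY hour, part2 BY hour;
DUMP result;  -- still a single MapReduce job, just with a different map count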

There is no UNION operator in your script, so I am not sure what you are referring 
to with that.

On Mar 7, 2012, at 8:09 AM, Yongzhi Wang <[email protected]> wrote:

> Hi there,
> 
> I tried using the "explain" syntax, but the MapReduce plan sometimes
> confuses me.
> 
> I tried the following script:
> 
> my_raw = LOAD './houred-small' USING PigStorage('\t') AS (user, hour, query);
> part1 = filter my_raw by hour>11;
> part2 = filter my_raw by hour<13;
> result = cogroup part1 by hour, part2 by hour;
> dump result;
> explain result;
> 
> The job stats are shown below, indicating there are 2 map tasks and 1 reduce
> task. But I don't know how the map tasks correspond to the MapReduce plan
> shown below. It seems each map task just does one filter and rearrange, but
> in which phase is the union operation done? The shuffle phase? If that is the
> case, the two map tasks actually did different filter work. Is that possible?
> Or is my guess wrong?
> 
> So, back to the question: is there any way I can see the actual map and
> reduce tasks executed by Pig?
> 
> Job Stats (time in seconds):
> JobId   Maps   Reduces   MaxMapTime   MinMapTime   AvgMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature   Outputs
> job_201203021230_0038   2   1   3   3   3   12   12   12   my_raw,part1,part2,result   COGROUP   hdfs://master:54310/tmp/temp626037557/tmp-1661404166,
> 
> The MapReduce plan is shown below:
> #--------------------------------------------------
> # Map Reduce Plan
> #--------------------------------------------------
> MapReduce node scope-84
> Map Plan
> Union[tuple] - scope-85
> |
> |---result: Local Rearrange[tuple]{bytearray}(false) - scope-73
> |   |   |
> |   |   Project[bytearray][1] - scope-74
> |   |
> |   |---part1: Filter[bag] - scope-59
> |       |   |
> |       |   Greater Than[boolean] - scope-63
> |       |   |
> |       |   |---Cast[int] - scope-61
> |       |   |   |
> |       |   |   |---Project[bytearray][1] - scope-60
> |       |   |
> |       |   |---Constant(11) - scope-62
> |       |
> |       |---my_raw: New For Each(false,false,false)[bag] - scope-89
> |           |   |
> |           |   Project[bytearray][0] - scope-86
> |           |   |
> |           |   Project[bytearray][1] - scope-87
> |           |   |
> |           |   Project[bytearray][2] - scope-88
> |           |
> |           |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage('    ')) - scope-90
> |
> |---result: Local Rearrange[tuple]{bytearray}(false) - scope-75
>    |   |
>    |   Project[bytearray][1] - scope-76
>    |
>    |---part2: Filter[bag] - scope-66
>        |   |
>        |   Less Than[boolean] - scope-70
>        |   |
>        |   |---Cast[int] - scope-68
>        |   |   |
>        |   |   |---Project[bytearray][1] - scope-67
>        |   |
>        |   |---Constant(13) - scope-69
>        |
>        |---my_raw: New For Each(false,false,false)[bag] - scope-94
>            |   |
>            |   Project[bytearray][0] - scope-91
>            |   |
>            |   Project[bytearray][1] - scope-92
>            |   |
>            |   Project[bytearray][2] - scope-93
>            |
>            |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage('    ')) - scope-95
> --------
> Reduce Plan
> result: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-77
> |
> |---result: Package[tuple]{bytearray} - scope-72
> --------
> Global sort: false
> ----------------
> 
> Thanks!
