You are confusing map and reduce tasks with MapReduce jobs. Your Pig script resulted in a single MapReduce job. The number of map tasks was 2, based on input size -- it has little to do with the actual operators you used.
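To make the job/task distinction concrete, here is a rough sketch of how a map task count can follow from input size alone. It is only an illustration: the function name, the 64 MB split size, and the file sizes are all made up for the example, and it assumes one map task per fixed-size split with no combining of small files. It is not Pig's or Hadoop's actual split logic.

```python
import math

def estimate_map_tasks(input_sizes_bytes, split_size_bytes):
    """Rough map task count for ONE MapReduce job (hypothetical helper).

    Assumption: each input is cut into fixed-size splits and every split
    gets exactly one map task. Empty inputs contribute no tasks.
    """
    if split_size_bytes <= 0:
        raise ValueError("split_size_bytes must be positive")
    return sum(math.ceil(size / split_size_bytes)
               for size in input_sizes_bytes if size > 0)

SPLIT = 64 * 1024 * 1024   # hypothetical 64 MB split size
SMALL = 10 * 1024          # hypothetical 10 KB input

print(estimate_map_tasks([SMALL], SPLIT))              # 1
print(estimate_map_tasks([SMALL, SMALL], SPLIT))       # 2
print(estimate_map_tasks([200 * 1024 * 1024], SPLIT))  # 4
```

In every call above there is still only one job; only the number of tasks inside it changes, and nothing in the calculation looks at which operators the script used.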
There is no union operator involved so I am not sure what you are referring to with that.

On Mar 7, 2012, at 8:09 AM, Yongzhi Wang <[email protected]> wrote:

> Hi, There
>
> I tried to use the syntax "explain", but the MapReduce plan sometime
> confused me.
>
> I tried such syntax below:
>
> *my_raw = LOAD './houred-small' USING PigStorage('\t') AS (user,hour,
> query);
> part1 = filter my_raw by hour>11;
> part2 = filter my_raw by hour<13;
> result = cogroup part1 by hour, part2 by hour;
> dump result;
> explain result;*
>
> The job stats shows as blow, indicating there are 2 Map tasks and 1 reduce
> tasks. But I don't know how does the Map task is mapping to the MapReduce
> plan shown below. It seems each Map task just do one filter and rearrange,
> but on which phase the union operation is done? the shuffle phase? If in
> that case, two Map tasks actually done different filter work. Is that
> possible? Or my guess is wrong?
>
> So, back to the question: *Is there any way that I can see the actual map
> and reduce task executed in the pig?*
>
> *Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime
> MaxReduceTime MinReduceTime AvgReduceTime Alias Feature Outputs
> job_201203021230_0038 2 1 3 3 3 12
> 12 1 2 my_raw,part1,part2,result COGROUP
> hdfs://master:54310/tmp/temp6260
> 37557/tmp-1661404166,
> *
>
> The mapreduce plan shows as below:*
> #--------------------------------------------------
> # Map Reduce Plan
> #--------------------------------------------------
> MapReduce node scope-84
> Map Plan
> Union[tuple] - scope-85
> |
> |---result: Local Rearrange[tuple]{bytearray}(false) - scope-73
> |   |   |
> |   |   Project[bytearray][1] - scope-74
> |   |
> |   |---part1: Filter[bag] - scope-59
> |       |   |
> |       |   Greater Than[boolean] - scope-63
> |       |   |
> |       |   |---Cast[int] - scope-61
> |       |   |   |
> |       |   |   |---Project[bytearray][1] - scope-60
> |       |   |
> |       |   |---Constant(11) - scope-62
> |       |
> |       |---my_raw: New For Each(false,false,false)[bag] - scope-89
> |           |   |
> |           |   Project[bytearray][0] - scope-86
> |           |   |
> |           |   Project[bytearray][1] - scope-87
> |           |   |
> |           |   Project[bytearray][2] - scope-88
> |           |
> |           |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage(' ')) - scope-90
> |
> |---result: Local Rearrange[tuple]{bytearray}(false) - scope-75
>     |   |
>     |   Project[bytearray][1] - scope-76
>     |
>     |---part2: Filter[bag] - scope-66
>         |   |
>         |   Less Than[boolean] - scope-70
>         |   |
>         |   |---Cast[int] - scope-68
>         |   |   |
>         |   |   |---Project[bytearray][1] - scope-67
>         |   |
>         |   |---Constant(13) - scope-69
>         |
>         |---my_raw: New For Each(false,false,false)[bag] - scope-94
>             |   |
>             |   Project[bytearray][0] - scope-91
>             |   |
>             |   Project[bytearray][1] - scope-92
>             |   |
>             |   Project[bytearray][2] - scope-93
>             |
>             |---my_raw: Load(hdfs://master:54310/user/root/houred-small:PigStorage(' ')) - scope-95--------
> Reduce Plan
> result: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-77
> |
> |---result: Package[tuple]{bytearray} - scope-72--------
> Global sort: false
> ----------------*
>
> Thanks!
