Hi,
I have a script whose stage 1 runs n map tasks, where n is the number of input
splits (the number of gz files), but only 1 reducer. I need to understand why
my script ends up with a single reducer. When I think about how I'd write this
in Java MapReduce, I don't see why stage 1 would have a single reducer.

register /home/ayon/udfs.jar;

a = load '$input' using PigStorage() as (a:chararray, b:chararray, c:int, 
d:chararray);

g = group a by (a, b);

g = foreach g {
      x = order $1 by c;
      generate group.a, group.b, x;
      };


u = foreach g generate myUDF($2) as triplets;
describe u;
dump u;

Do you see any reason there should be 1 reducer at any stage? How do I debug 
this? Where are the generated classes and plan? 
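
In case it helps frame the question, here is a sketch of what I would try
next. It assumes that Pig's explain operator and the parallel clause on group
behave as documented; I have not verified either against this script, and the
value 10 is arbitrary.

-- print the logical, physical, and MapReduce plans for alias u
explain u;

-- request 10 reducers for the job that performs the group (10 is arbitrary)
g = group a by (a, b) parallel 10;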

-Ayon