Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The "FAQ" page has been changed by daijy.
http://wiki.apache.org/pig/FAQ?action=diff&rev1=6&rev2=7

--------------------------------------------------

  C = JOIN A by url, B by url PARALLEL 50;
  }}}
  
- Even if you do not specify the parallel clause, the framework uses a default 
number of reducers, in the order of 0.9*(number of nodes allocated by user 
-1)*n where n is the number of maximum reduce slots, for running your M/R jobs 
which result from statements such as GROUP, COGROUP, JOIN, and ORDER BY. For 
example, when allocating 3 machines you get about 0.9*2*4 = 7 reducers for 
operating on your parallel jobs. 
+ Besides the PARALLEL clause, you can also use the "set default_parallel" 
statement in a Pig script, or set the "mapred.reduce.tasks" system property, to 
specify the default parallelism. If none of these values is set, Pig uses only 
1 reducer. (In Pig 0.8, the default changes from 1 reducer to a number 
calculated by a simple heuristic, as a safeguard.)
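  
  As a minimal sketch of how these settings might be combined (the relation 
  names, input paths, field names, and reducer counts below are hypothetical and 
  chosen only for illustration):
  
  {{{
  -- Script-wide default for reduce-side operators (value is an assumption)
  SET default_parallel 20;

  -- Hypothetical inputs
  A = LOAD 'pages' AS (url:chararray, rank:int);
  B = LOAD 'visits' AS (url:chararray, user:chararray);

  -- No PARALLEL clause, so this join should pick up default_parallel
  C = JOIN A BY url, B BY url;

  -- An explicit PARALLEL clause applies to this statement only
  D = GROUP A BY url PARALLEL 50;
  }}}
  
  In this sketch the JOIN would run with the script-wide default, while the 
  GROUP would use the value given in its own PARALLEL clause.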
  
  '''Q: Can I do a numerical comparison while filtering?'''
  
