Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "FAQ" page has been changed by daijy.
http://wiki.apache.org/pig/FAQ?action=diff&rev1=6&rev2=7

--------------------------------------------------

  C = JOIN A by url, B by url PARALLEL 50;
  }}}
- Even if you do not specify the PARALLEL clause, the framework uses a default number of reducers, on the order of 0.9 * (number of nodes allocated by the user - 1) * n, where n is the maximum number of reduce slots, for running the M/R jobs that result from statements such as GROUP, COGROUP, JOIN, and ORDER BY. For example, when allocating 3 machines you get about 0.9 * 2 * 4 = 7 reducers for your parallel jobs.
+ Besides the PARALLEL clause, you can also use the "set default_parallel" statement in a Pig script, or set the "mapred.reduce.tasks" system property, to specify the default parallelism. If none of these values is set, Pig uses only 1 reducer. (In Pig 0.8, we change the default from 1 reducer to a number calculated by a simple heuristic, to make the default more foolproof.)

  '''Q: Can I do a numerical comparison while filtering?'''