Hi, Renato,
I think you are asking how we organize different operators into
map-reduce jobs. Unfortunately, there is no documentation for this yet.
Basically, we put as many operators into one map-reduce job as possible.
Co-group/Group, Join, Order, Distinct, Cross, and Stream each create a
map-reduce boundary; most other operators are placed into existing jobs.
The main logic is in MRCompiler.java.
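
As a rough illustration of the packing rule described above, here is a
minimal Python sketch. It is a toy model and not the logic in
MRCompiler.java: the function name plan_jobs, the dict layout, and the
assumption of a purely linear pipeline are all made up for illustration.
Only the list of boundary operators comes from the description above.

```python
# Operators that, per the description above, force a map-reduce boundary.
BOUNDARY_OPS = {"COGROUP", "GROUP", "JOIN", "ORDER", "DISTINCT", "CROSS", "STREAM"}


def plan_jobs(operators):
    """Greedily pack a linear operator pipeline into map-reduce jobs.

    Toy model only (hypothetical helper): each job is a dict with a
    'map' list, a 'shuffle' operator (or None), and a 'reduce' list.
    """
    jobs = [{"map": [], "shuffle": None, "reduce": []}]
    for op in operators:
        name = op.upper()
        job = jobs[-1]
        if name in BOUNDARY_OPS:
            if job["shuffle"] is not None:
                # The current job already has its shuffle, so open a new job.
                job = {"map": [], "shuffle": None, "reduce": []}
                jobs.append(job)
            job["shuffle"] = name
        elif job["shuffle"] is None:
            # No boundary seen yet in this job: the operator runs map-side.
            job["map"].append(name)
        else:
            # After the boundary: the operator is folded into the reduce side.
            job["reduce"].append(name)
    return jobs


pipeline = ["LOAD", "FILTER", "GROUP", "FOREACH", "ORDER", "STORE"]
jobs = plan_jobs(pipeline)
for i, job in enumerate(jobs, start=1):
    print(i, job)
```

In this toy model FILTER and FOREACH are folded into the job created for
GROUP, while ORDER opens a second job. The real compiler handles
non-linear plans and many special cases that this sketch ignores.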
Daniel
Renato Marroquín Mogrovejo wrote:
Anyone, please?
Renato M.
2010/8/24 Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com>
Hi Daniel,
Thanks, but that was not what I was actually looking for. What I want to
know is, for example, how the optimizer works when the bags' logical plans
are combined, or, if all commands are reduced in the end to CO-GROUP
commands, how that is handled. I know from Pig's paper that the ORDER and
LOAD commands generate new MapReduce jobs. Are there any optimizations for
the physical plans?
Thanks in advance.
Renato M.
2010/8/23 Daniel Dai <jiany...@yahoo-inc.com>
Hi, Renato,
There is a description of the optimization rules in the Pig Latin reference manual:
http://hadoop.apache.org/pig/docs/r0.7.0/piglatin_ref1.html#Optimization+Rules.
Is that enough?
Daniel
Renato Marroquín Mogrovejo wrote:
Hey everyone, I was wondering if anybody has any references or suggestions
on how to learn about Pig's optimizer besides the source code or Pig's
paper.
Thanks in advance.
Renato M.