Hi Ji Mahn,

Pig doesn't generate MapReduce jobs on the fly. Here is how it works: Pig has generic mapper and reducer classes. It compiles a query into chunks of physical plans and replays them inside the generic mapper and reducer. For example, load / group-by / store is translated into a mapper that contains the load, the Hadoop shuffle, and a reducer that contains the store. This is called an MR plan.

You can take a look at the explain <http://pig.apache.org/docs/r0.13.0/test.html#explain> output to see how Pig generates an MR plan for your queries.
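To make the "generic mapper and reducer replaying plan chunks" idea concrete, here is a toy Python sketch. It is not Pig's code: the names `generic_map`, `generic_reduce`, `shuffle`, and `run_mr_plan` are hypothetical, and the plan fragments are plain lists of functions standing in for physical operators. It only illustrates that the mapper and reducer stay fixed while the plan they replay changes per query.

```python
from collections import defaultdict

def generic_map(map_plan, records):
    """Fixed mapper: replay the map-side plan fragment on each input record."""
    for record in records:
        for op in map_plan:
            record = op(record)
        # By convention in this sketch, the last map-side operator
        # produces a (key, value) pair for the shuffle.
        yield record

def shuffle(pairs):
    """Stand-in for the Hadoop shuffle: group values by key, sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def generic_reduce(reduce_plan, grouped):
    """Fixed reducer: replay the reduce-side plan fragment on each key group."""
    for key, values in grouped:
        result = (key, values)
        for op in reduce_plan:
            result = op(result)
        yield result

def run_mr_plan(map_plan, reduce_plan, records):
    """Run one toy 'MR plan': map fragment, shuffle, reduce fragment."""
    mapped = generic_map(map_plan, records)
    return list(generic_reduce(reduce_plan, shuffle(mapped)))

# Toy load / group-by / store query: count rows per first field.
map_plan = [
    lambda line: tuple(line.split(",")),   # "load": parse a line into a tuple
    lambda row: (row[0], row[1]),          # emit the group key and a value
]
reduce_plan = [
    lambda kv: (kv[0], len(kv[1])),        # "store" side: one count per key
]

output = run_mr_plan(map_plan, reduce_plan, ["a,1", "b,2", "a,3"])
print(output)
```

A different query would pass different `map_plan` and `reduce_plan` lists to the same `generic_map` and `generic_reduce` functions, which mirrors why you will not find a query-specific mapper class in the job jar.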
Thanks,
Cheolsoo

On Thu, Jul 17, 2014 at 1:13 PM, Ji Mahn Ok <[email protected]> wrote:

> Hello,
>
> I am trying to find the class which contains the job configuration part in
> a job file produced by pig.
>
> In detail, for example, I run a PigMix query. As you know, when I run pig
> script, it produces job jar file in tmp directory. I stored that job jar
> file separately to figure out what kind of MapReduce job is really produced
> by pig. But it is hard for me to find the class which has the job
> configuration part from the job jar file. What I want to ask is this: where
> can I find the class in which the job configurations are set in that job
> jar file? Or, is there any other better way to see the real MapReduce job
> produced by pig?
>
> Thank you in advance.
>
> Best Regards,
