Re: Is there any document about the JobControlCompiler

2009-07-08 Thread Dmitriy Ryaboy
Jeff,
Chris Olston answered this a while back:

http://markmail.org/thread/xnwutstlftnyycxs

(by the way, MarkMail is awesome for searching mailing list archives. Highly
recommended.)

There are some changes that have to do with sampling and multi-store, but
that email will give you the general idea.

Also, remember you can always get the MR plan by running explain on a
relation (describe only prints the schema).
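
As a minimal sketch, here is the script from the original question with an
EXPLAIN statement in place of DUMP. EXPLAIN is the operator that prints the
plans (DESCRIBE prints only the schema). The input path is the one from the
question; nothing else is assumed beyond a working Grunt shell.

```pig
-- Script from the original question, unchanged apart from formatting.
A = LOAD '/user/zjffdu/input.txt' USING PigStorage();
B = GROUP A BY $0;
B = FOREACH B GENERATE group, COUNT($1);
B = ORDER B BY $1;

-- Print the logical, physical, and MapReduce plans for B
-- instead of executing the jobs.
EXPLAIN B;
```

The MapReduce section of the output lists one node per job. For this script
the three jobs most likely correspond to the GROUP/COUNT step, a sampling
pass that ORDER uses to pick partition boundaries, and the sort itself, but
the exact layout of the output depends on the Pig version.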

Hope this helps
-Dmitriy


On Wed, Jul 8, 2009 at 6:24 PM, zhang jianfeng zjf...@gmail.com wrote:

 Hi all,


 I found that the following script will be converted into 3 mapreduce jobs:

 A = LOAD '/user/zjffdu/input.txt' USING PigStorage();
 B = GROUP A BY $0;
 B = FOREACH B GENERATE group, COUNT($1);
 B = ORDER B BY $1;
 DUMP B;

 I am very interested in how Pig compiles the script into jobs. Reading the
 source code is one way, but if there is any documentation, that would be
 better. Does anyone know where I can find related documents? Or is there
 a JIRA issue related to this?

 Thank you in advance.



 Jeff Zhang.



Re: Is there any document about the JobControlCompiler

2009-07-08 Thread zhang jianfeng
Dmitriy,

Thank you for your help.

