Is it reasonable to archive the configuration file into hadoop18.jar

2009-07-09 Thread zhang jianfeng
Hi all,

I found that the hadoop18.jar which Pig uses contains configuration files, such as
hadoop-site.xml and hadoop-default.xml. These files are archived into
pig.jar when running ant jar.

When I use Pig embedded in Java and run the following code snippet:

PigServer pig = new PigServer(ExecType.MAPREDUCE);
pig.registerScript("scripts/Test.pig");

it loads these configuration files by default, so it is not easy for me to
use my own Hadoop configuration files. I think it would be better not to put
these configuration files into hadoop18.jar and to let users provide their
own.

What do you think? Or is there some other way to set the configuration?


Thanks

Jeff Zhang
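
One possible workaround for the question above, sketched under assumptions: read your own hadoop-site.xml style file into a java.util.Properties object and hand that to the embedded server. The parsing part below uses only the JDK. The class name HadoopSiteLoader and the host names are hypothetical. The PigServer(ExecType, Properties) constructor mentioned in the comment is assumed to be available in your Pig release, and the thread does not confirm whether explicitly passed properties take precedence over the files bundled in hadoop18.jar, so verify both against your version.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Pig or Hadoop.
class HadoopSiteLoader {

    // Copies every <property><name>..</name><value>..</value></property>
    // entry of a Hadoop-style configuration file into a Properties object.
    static Properties parse(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        Properties props = new Properties();
        NodeList entries = doc.getElementsByTagName("property");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            String name = childText(entry, "name");
            String value = childText(entry, "value");
            if (name != null && value != null) {
                props.setProperty(name, value);
            }
        }
        return props;
    }

    // Returns the trimmed text of the first child element with the given tag,
    // or null if the entry has no such child.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Inline stand-in for a user-provided hadoop-site.xml (hosts are made up).
        String site =
            "<configuration>"
          + "<property><name>fs.default.name</name>"
          + "<value>hdfs://namenode.example.com:9000</value></property>"
          + "<property><name>mapred.job.tracker</name>"
          + "<value>jobtracker.example.com:9001</value></property>"
          + "</configuration>";
        Properties props = parse(
            new ByteArrayInputStream(site.getBytes(StandardCharsets.UTF_8)));
        System.out.println(props.getProperty("fs.default.name"));
        System.out.println(props.getProperty("mapred.job.tracker"));

        // With Pig on the classpath, props could then be passed along
        // (assumption: this constructor exists in your Pig release):
        //   PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
        //   pig.registerScript("scripts/Test.pig");
    }
}
```

In real use, the ByteArrayInputStream would be replaced by a FileInputStream pointing at your own configuration file.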


Re: Is there any document about the JobControlCompiler

2009-07-08 Thread zhang jianfeng
Dmitriy ,

Thank you for your help.


On Thu, Jul 9, 2009 at 9:34 AM, Dmitriy Ryaboy dvrya...@cloudera.com wrote:

 Jeff,
 Chris Olston answered this a while back:

 http://markmail.org/thread/xnwutstlftnyycxs

 (by the way, MarkMail is awesome for searching mailing list archives.
 Highly recommended.)

 There are some changes that have to do with sampling and multi-store, but
 that email will give you the general idea.

 Also, remember you can always get the MR plan by running explain on a
 relation.

 Hope this helps
 -Dmitriy


 On Wed, Jul 8, 2009 at 6:24 PM, zhang jianfeng zjf...@gmail.com wrote:

  Hi all,
 
 
  I found that the following script is converted into 3 MapReduce jobs:
 
  A = LOAD '/user/zjffdu/input.txt' USING PigStorage();
  B = GROUP A BY $0;
  B = FOREACH B GENERATE group, COUNT($1);
  B = ORDER B BY $1;
  DUMP B;
 
  I am very interested to know how Pig compiles the script into jobs. Reading
  the source code is one way, but if there’s any documentation, that would be
  better. Does anyone know where I can find related documents? Or is there any
  JIRA item related to this?
 
  Thank you in advance.
 
 
 
  Jeff Zhang.
 



Re: [Pig Wiki] Update of HowToContribute by AlanGates

2009-04-15 Thread zhang jianfeng
Hi Alan,

Thank you for the guidelines. Where is the code for these ProposedProjects? Is
it in a different branch or in the trunk? How can I track the progress of
these ProposedProjects?

Thank you.



On Thu, Apr 16, 2009 at 7:17 AM, Apache Wiki wikidi...@apache.org wrote:

 Dear Wiki user,

 You have subscribed to a wiki page or wiki category on Pig Wiki for
 change notification.

 The following page has been changed by AlanGates:
 http://wiki.apache.org/pig/HowToContribute


 --
  * [http://www.apache.org/dev/contributors.html Apache contributor documentation]
  * [http://www.apache.org/foundation/voting.html Apache voting documentation]

 + == Picking Something to Work On ==
 + Looking for a place to start?  A great first place is to peruse the
 + [https://issues.apache.org/jira/browse/PIG JIRA] and find an issue that needs
 + resolved.  If you're looking for a bigger project, try ProposedProjects.  This
 + gives a list of projects the Pig team would like to see worked on.
 +