Is it reasonable to archive the configuration file into hadoop18.jar
Hi all,

I found that the hadoop18.jar that Pig uses contains configuration files, such as hadoop-site.xml and hadoop-default.xml. These files are archived into pig.jar when running ant jar. When I use Pig embedded in Java and run the following code snippet:

    PigServer pig = new PigServer(ExecType.MAPREDUCE);
    pig.registerScript("scripts/Test.pig");

it loads these configuration files by default. That makes it hard to use my own Hadoop configuration files.

So I think it would be better not to put these configuration files into hadoop18.jar, and to let users provide their own configuration files. What do you think? Or is there some other way to set the configuration?

Thanks,
Jeff Zhang
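One way to explore the question above is to read a user-supplied Hadoop-style XML file into a java.util.Properties object using only the JDK. The sketch below is illustrative only: the class name HadoopSiteProps, the host names, and the idea of handing the result to Pig are assumptions, not something confirmed in this thread. Whether the Pig release in use has a PigServer constructor that accepts Properties should be checked against its Javadoc.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Pig or Hadoop): collects the
// <property><name>..</name><value>..</value></property> pairs of a
// Hadoop-style configuration file into a Properties object.
public class HadoopSiteProps {

    public static Properties load(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Properties out = new Properties();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            // Skip entries that lack either a name or a value.
            if (names.getLength() > 0 && values.getLength() > 0) {
                out.setProperty(names.item(0).getTextContent().trim(),
                                values.item(0).getTextContent().trim());
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Example values only; the host names are placeholders.
        String xml =
              "<configuration>"
            + "<property><name>fs.default.name</name>"
            + "<value>hdfs://namenode.example:9000</value></property>"
            + "<property><name>mapred.job.tracker</name>"
            + "<value>jobtracker.example:9001</value></property>"
            + "</configuration>";
        Properties props = load(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(props.getProperty("fs.default.name"));
        System.out.println(props.getProperty("mapred.job.tracker"));
        // Assumption: if the Pig version in use provides a constructor taking
        // Properties, these values could be passed to it instead of relying
        // on the files bundled in the jar.
    }
}
```

In real use the InputStream would come from the user's own hadoop-site.xml, for example via a FileInputStream, rather than from an inline string.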
Re: Is there any document about the JobControlCompiler
Dmitriy,

Thank you for your help.

On Thu, Jul 9, 2009 at 9:34 AM, Dmitriy Ryaboy dvrya...@cloudera.com wrote:

Jeff,

Chris Olston answered this a while back: http://markmail.org/thread/xnwutstlftnyycxs (by the way, MarkMail is awesome for searching mailing list archives. Highly recommended.)

There are some changes that have to do with sampling and multi-store, but that email will give you the general idea.

Also, remember you can always get the MR plan by running describe on a relation.

Hope this helps
-Dmitriy

On Wed, Jul 8, 2009 at 6:24 PM, zhang jianfeng zjf...@gmail.com wrote:

Hi all,

I found that the following script is converted into 3 MapReduce jobs:

    A = LOAD '/user/zjffdu/input.txt' USING PigStorage();
    B = GROUP A BY $0;
    B = FOREACH B GENERATE group, COUNT($1);
    B = ORDER B BY $1;
    DUMP B;

I am very interested to know how Pig compiles the script into jobs. Reading the source code is one way, but if there is any documentation, that would be better. Does anyone know where I can find related documents? Or is there a JIRA item related to this?

Thank you in advance.

Jeff Zhang
Re: [Pig Wiki] Update of HowToContribute by AlanGates
Hi Alan,

Thank you for the guidelines. Where is the code for these ProposedProjects? Is it in different branches or in the trunk? And how can I track the progress of these ProposedProjects?

Thank you.

On Thu, Apr 16, 2009 at 7:17 AM, Apache Wiki wikidi...@apache.org wrote:

Dear Wiki user,

You have subscribed to a wiki page or wiki category on Pig Wiki for change notification.

The following page has been changed by AlanGates:
http://wiki.apache.org/pig/HowToContribute

--
  * [http://www.apache.org/dev/contributors.html Apache contributor documentation]
  * [http://www.apache.org/foundation/voting.html Apache voting documentation]

+ == Picking Something to Work On ==
+ Looking for a place to start? A great first place is to peruse the
+ [https://issues.apache.org/jira/browse/PIG JIRA] and find an issue that needs
+ resolved. If you're looking for a bigger project, try ProposedProjects. This
+ gives a list of projects the Pig team would like to see worked on.
+