Good idea. :)

Here is the output for the Cloudera CDH3b3 distribution, in case someone else needs it:

home:/usr/lib/pig/bin/.. conf:/usr/lib/pig/bin/../conf

Thanks for helping me out!

Anze

On Sunday 17 October 2010, Gerrit Jansen van Vuuren wrote:
> Glad it worked for you :)
>
> I use the standard apache pig distributions.
> There are several places that environment variables can be changed and set,
> and I have no idea which one cloudera uses but here is a list:
>
> /etc/profile.d/<any file> (we have hadoop.sh, pig.sh and java.sh here that
> set the home variables and are managed by puppet)
> /etc/bash.bashrc (not a good idea to set it here)
> $HOME/.bashrc (quick for users that don't have root permission, but not
> for production)
> $PIG_HOME/conf/pig-env.sh (standard in all hadoop related projects, gets
> sourced by $PIG_HOME/bin/pig)
>
> To see what variables your pig is picking up you can manually insert the
> line
> echo "home:$PIG_HOME conf:$PIG_CONF_DIR"
> into the $PIG_HOME/bin/pig file just before it calls java.
>
> Cheers,
> Gerrit
>
> -----Original Message-----
> From: Anze [mailto:[email protected]]
> Sent: Sunday, October 17, 2010 7:49 AM
> To: [email protected]
> Subject: Re: accessing remote cluster with Pig
>
>
> Gerrit, thank you for your answer! It has pointed me in the right
> direction.
>
> It looks like Pig (at least mine) ignores PIG_HOME. But with your help I
> was able to debug a bit further:
> -----
> $ find / -name 'pig.properties'
> /etc/pig/conf.dist/pig.properties
> /etc/pig/conf/pig.properties
> /usr/lib/pig/example-confs/conf.default/pig.properties
> /usr/lib/pig/conf/pig.properties
> -----
>
> I have changed /usr/lib/pig/conf/pig.properties and bingo - this is what my
> Pig uses.
>
> So while Cloudera packaging makes /etc/pig/conf/pig.properties (the "Debian
> way"), it is not used at all. And it probably ignores the environment vars
> too.
>
> Thanks again!
> :)
>
> Anze
>
> On Sunday 17 October 2010, Gerrit Jansen van Vuuren wrote:
> > Hi,
> >
> > Pig configuration is in the file: $PIG_HOME/conf/pig.properties
> >
> > The two parameters that tell pig where to find the namenode and job
> > tracker are:
> >
> > E.g. (assuming you're using the default ports)
> >
> > ----[ $PIG_HOME/conf/pig.properties ]---------------
> >
> > fs.default.name=hdfs://<namenode url>:8020/
> > mapred.job.tracker=<jobtracker url>:8021
> >
> > --------------
> >
> > Having these properties you don't need to specify pig -x mapreduce, just
> > pig is enough.
> >
> >
> > Cheers,
> >
> > Gerrit
> >
> > -----Original Message-----
> > From: Anze [mailto:[email protected]]
> > Sent: Saturday, October 16, 2010 9:53 PM
> > To: [email protected]
> > Subject: accessing remote cluster with Pig
> >
> > Hi again! :)
> >
> > I am trying to run Pig on a local machine, but I want it to connect to a
> > remote cluster. I can't make it use my settings - whatever I do, I get
> > this:
> > -----
> > $ pig -x mapreduce
> > 10/10/16 22:17:43 INFO pig.Main: Logging error messages to:
> > /home/pigtest/conf/pig_1287260263699.log
> > 2010-10-16 22:17:43,896 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > Connecting to hadoop file system at: file:///
> > grunt>
> > -----
> >
> > I have copied the hadoop settings files (/etc/hadoop/conf/*) from the
> > remote cluster's namenode to /home/pigtest/conf/ and exported
> > PIG_CLASSPATH, PIGDIR, HADOOP_CLASSPATH,... I have also tried changing
> > /etc/pig/conf/pig.configuration (even wrote there some free text so it
> > would at least give me an error message) - nothing.
> > It still connects to file:/// and still doesn't display a message about
> > a jobtracker:
> > -----
> > $ export HADOOPDIR=/etc/hadoop/conf
> > $ export PIG_PATH=/etc/pig/conf
> > $ export PIG_CLASSPATH=$HADOOPDIR
> > $ export PIG_HADOOP_VERSION=0.20.2
> > $ export PIG_HOME="/usr/lib/pig"
> > $ export PIG_CONF_DIR="/etc/pig/"
> > $ export PIG_LOG_DIR="/var/log/pig"
> > $ pig -x mapreduce
> > 10/10/16 22:32:34 INFO pig.Main: Logging error messages to:
> > /home/pigtest/conf/pig_1287261154272.log
> > 2010-10-16 22:32:34,471 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > Connecting to hadoop file system at: file:///
> > grunt>
> > -----
> >
> > I am guessing I am doing something fundamentally wrong. How do I change
> > Pig's settings?
> >
> > More info: using Cloudera package hadoop-pig from CDH3b3
> > (0.7.0+16-1~lenny-cdh3b3). I would appreciate some pointers.
> >
> > Kind regards,
> >
> > Anze
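
For anyone landing on this thread later, the check that settled it (several pig.properties files on disk, only one of them actually read) can be sketched as a small shell helper. This is only a rough sketch: the function name check_pig_props is made up for illustration, and the paths in the usage comment are simply the ones `find` reported above, so adjust them to your own layout. It prints, for each candidate file, whether fs.default.name and mapred.job.tracker are set.

```shell
#!/bin/sh
# check_pig_props is a hypothetical helper, not part of Pig or CDH.
# For each file given, report the value of the two properties that tell
# Pig where the namenode and the job tracker are.
check_pig_props() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "$f: missing"
            continue
        fi
        for key in fs.default.name mapred.job.tracker; do
            # Take the last uncommented "key=value" line; compare the key
            # literally so the dots are not treated as regex wildcards.
            value=$(awk -F= -v k="$key" '
                {
                    name = $1
                    gsub(/^[ \t]+|[ \t]+$/, "", name)
                    if (name == k) v = substr($0, index($0, "=") + 1)
                }
                END { print v }' "$f")
            echo "$f: $key=${value:-<unset>}"
        done
    done
}

# Example call, using the paths found in this thread:
# check_pig_props /etc/pig/conf/pig.properties /usr/lib/pig/conf/pig.properties
```

A file that reports both keys as `<unset>` cannot be the reason Pig connects to a remote cluster. It does not prove which file Pig reads, though. For that, the echo line inserted into $PIG_HOME/bin/pig, as suggested above, is still the more direct test.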
