Good idea. :) 

Here is the output for the Cloudera CDH3b3 distribution, in case someone else
needs it:
home:/usr/lib/pig/bin/.. conf:/usr/lib/pig/bin/../conf
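
For anyone who wants to reproduce this, here is a minimal sketch of the check,
using the echo line Gerrit suggested. The function name show_pig_env is
hypothetical (it is not part of Pig), and the two values are simply the ones
printed above; in a real setup the echo sits inside $PIG_HOME/bin/pig itself,
just before the java call, rather than in a separate function.

```shell
# Hypothetical helper for illustration only; not part of the Pig launcher.
show_pig_env() {
    # The same line suggested for $PIG_HOME/bin/pig, just before it calls java.
    echo "home:$PIG_HOME conf:$PIG_CONF_DIR"
}

# Assumed values: the ones reported above for the CDH3b3 package.
PIG_HOME="/usr/lib/pig/bin/.."
PIG_CONF_DIR="$PIG_HOME/conf"

show_pig_env
```

If the printed conf directory is not the one you edited, your changes to
pig.properties will not be picked up.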

Thanks for helping me out! 

Anze


On Sunday 17 October 2010, Gerrit Jansen van Vuuren wrote:
> Glad it worked for you  :)
> 
> I use the standard apache pig distributions.
> There are several places where environment variables can be set or changed,
> and I have no idea which one Cloudera uses, but here is a list:
> 
> /etc/profile.d/<any file> (we have hadoop.sh, pig.sh and java.sh here that
> set the home variables and are managed by puppet)
> /etc/bash.bashrc  (not a good idea to set it here)
> $HOME/.bashrc  (quick for users that don't have root permission, but not
> for production)
> $PIG_HOME/conf/pig-env.sh   (standard in all hadoop related projects, gets
> sourced by $PIG_HOME/bin/pig)
> 
> To see what variables your pig is picking up you can manually insert the
> line
> echo "home:$PIG_HOME conf:$PIG_CONF_DIR" into the $PIG_HOME/bin/pig file
> just before it calls java.
> 
> Cheers,
>  Gerrit
> 
> -----Original Message-----
> From: Anze [mailto:[email protected]]
> Sent: Sunday, October 17, 2010 7:49 AM
> To: [email protected]
> Subject: Re: accessing remote cluster with Pig
> 
> 
> Gerrit, thank you for your answer! It has pointed me in the right
> direction.
> 
> 
> It looks like Pig (at least mine) ignores PIG_HOME. But with your help I
> was able to debug a bit further:
> -----
> $ find / -name 'pig.properties'
> /etc/pig/conf.dist/pig.properties
> /etc/pig/conf/pig.properties
> /usr/lib/pig/example-confs/conf.default/pig.properties
> /usr/lib/pig/conf/pig.properties
> -----
> 
> I have changed /usr/lib/pig/conf/pig.properties and bingo - this is what my
> Pig uses.
> 
> So while the Cloudera packaging creates /etc/pig/conf/pig.properties (the
> "Debian way"), that file is not used at all. And Pig probably ignores the
> environment vars too.
> 
> Thanks again! :)
> 
> Anze
> 
> On Sunday 17 October 2010, Gerrit Jansen van Vuuren wrote:
> > Hi,
> > 
> > Pig configuration is in the file: $PIG_HOME/conf/pig.properties
> > 
> > The two parameters that tell pig where to find the namenode and job
> > tracker are:
> > 
> > E.g. (assuming you're using the default ports)
> > 
> > ----[ $PIG_HOME/conf/pig.properties ]---------------
> > 
> > fs.default.name=hdfs://<namenode url>:8020/
> > mapred.job.tracker=<jobtracker url>:8021
> > 
> > --------------
> > 
> > With these properties set you don't need to specify pig -x mapreduce;
> > just pig is enough.
> > 
> > 
> > Cheers,
> > 
> >  Gerrit
> > 
> > -----Original Message-----
> > From: Anze [mailto:[email protected]]
> > Sent: Saturday, October 16, 2010 9:53 PM
> > To: [email protected]
> > Subject: accessing remote cluster with Pig
> > 
> > Hi again! :)
> > 
> > I am trying to run Pig on a local machine, but I want it to connect to a
> > remote cluster. I can't make it use my settings - whatever I do, I get
> > this:
> > -----
> > $ pig -x mapreduce
> > 10/10/16 22:17:43 INFO pig.Main: Logging error messages to:
> > /home/pigtest/conf/pig_1287260263699.log
> > 2010-10-16 22:17:43,896 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > Connecting to hadoop file system at: file:///
> > grunt>
> > -----
> > 
> > I have copied the hadoop settings files (/etc/hadoop/conf/*) from the
> > remote cluster's namenode to /home/pigtest/conf/ and exported
> > PIG_CLASSPATH, PIGDIR, HADOOP_CLASSPATH,... I have also tried changing
> > /etc/pig/conf/pig.configuration (I even wrote some free text there so it
> > would at least give me an error message) - nothing. It still connects to
> > file:/// and it still doesn't display a message about a jobtracker:
> > -----
> > $ export HADOOPDIR=/etc/hadoop/conf
> > $ export PIG_PATH=/etc/pig/conf
> > $ export PIG_CLASSPATH=$HADOOPDIR
> > $ export PIG_HADOOP_VERSION=0.20.2
> > $ export PIG_HOME="/usr/lib/pig"
> > $ export PIG_CONF_DIR="/etc/pig/"
> > $ export PIG_LOG_DIR="/var/log/pig"
> > $ pig -x mapreduce
> > 10/10/16 22:32:34 INFO pig.Main: Logging error messages to:
> > /home/pigtest/conf/pig_1287261154272.log
> > 2010-10-16 22:32:34,471 [main] INFO
> > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > Connecting to hadoop file system at: file:///
> > grunt>
> > -----
> > 
> > I am guessing I am doing something fundamentally wrong. How do I change
> > Pig's settings?
> > 
> > More info: using Cloudera package hadoop-pig from CDH3b3
> > (0.7.0+16-1~lenny-cdh3b3). I would appreciate some pointers.
> > 
> > Kind regards,
> > 
> > Anze
