Gerrit, thank you for your answer! It has pointed me in the right direction.

It looks like Pig (at least mine) ignores PIG_HOME. But with your help I was 
able to debug a bit further:
-----
$ find / -name 'pig.properties'
/etc/pig/conf.dist/pig.properties
/etc/pig/conf/pig.properties
/usr/lib/pig/example-confs/conf.default/pig.properties
/usr/lib/pig/conf/pig.properties
-----

I have changed /usr/lib/pig/conf/pig.properties and bingo - this is what my 
Pig uses.

So while the Cloudera packaging installs /etc/pig/conf/pig.properties (the
"Debian way"), my Pig does not read that file at all. It probably ignores the
environment variables too.
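
For anyone hitting the same problem, here is a rough sketch of how one might
check which of the pig.properties copies actually carry the two settings Gerrit
mentioned. The function name check_pig_properties is made up for illustration.
It only reports what each file contains; it does not tell you which file Pig
loads. For that I still had to edit a file and watch what Pig did.

```shell
#!/bin/sh
# check_pig_properties is a hypothetical helper, not part of Pig or CDH.
# For each file given, print the value of the two cluster settings,
# or "<not set>" if the key is absent or commented out.
check_pig_properties() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "$f: missing"
            continue
        fi
        for key in fs.default.name mapred.job.tracker; do
            # Compare the trimmed text before the first "=" with the key;
            # if the key appears more than once, the last value is kept.
            value=$(awk -F= -v k="$key" '
                { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
                name == k { v = substr($0, index($0, "=") + 1) }
                END { print v }' "$f")
            echo "$f: $key=${value:-<not set>}"
        done
    done
}

# The two locations from the find output that look like live configs.
check_pig_properties /etc/pig/conf/pig.properties /usr/lib/pig/conf/pig.properties
```

If both settings show up only in a file Pig does not read, that would explain
why it keeps connecting to file:///.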

Thanks again! :)

Anze



On Sunday 17 October 2010, Gerrit Jansen van Vuuren wrote:
> Hi,
> 
> Pig configuration is in the file: $PIG_HOME/conf/pig.properties
> 
> The two parameters that tell pig where to find the namenode and job tracker
> are:
> 
> E.g. (assuming you're using the default ports)
> 
> ----[ $PIG_HOME/conf/pig.properties ]---------------
> 
> fs.default.name=hdfs://<namenode url>:8020/
> mapred.job.tracker=<jobtracker url>:8021
> 
> --------------
> 
> With these properties set, you don't need to specify pig -x mapreduce; just
> pig is enough.
> 
> 
> Cheers,
>  Gerrit
> 
> -----Original Message-----
> From: Anze [mailto:[email protected]]
> Sent: Saturday, October 16, 2010 9:53 PM
> To: [email protected]
> Subject: accessing remote cluster with Pig
> 
> Hi again! :)
> 
> I am trying to run Pig on a local machine, but I want it to connect to a
> remote cluster. I can't make it use my settings - whatever I do, I get
> this:
> -----
> $ pig -x mapreduce
> 10/10/16 22:17:43 INFO pig.Main: Logging error messages to: /home/pigtest/conf/pig_1287260263699.log
> 2010-10-16 22:17:43,896 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> grunt>
> -----
> 
> I have copied the hadoop settings files (/etc/hadoop/conf/*) from the remote
> cluster's namenode to /home/pigtest/conf/ and exported PIG_CLASSPATH, PIGDIR,
> HADOOP_CLASSPATH,... I have also tried changing
> /etc/pig/conf/pig.configuration (even wrote some free text there so it would
> at least give me an error message) - nothing. It still connects to file:///
> and it still doesn't display a message about a jobtracker:
> -----
> $ export HADOOPDIR=/etc/hadoop/conf
> $ export PIG_PATH=/etc/pig/conf
> $ export PIG_CLASSPATH=$HADOOPDIR
> $ export PIG_HADOOP_VERSION=0.20.2
> $ export PIG_HOME="/usr/lib/pig"
> $ export PIG_CONF_DIR="/etc/pig/"
> $ export PIG_LOG_DIR="/var/log/pig"
> $ pig -x mapreduce
> 10/10/16 22:32:34 INFO pig.Main: Logging error messages to: /home/pigtest/conf/pig_1287261154272.log
> 2010-10-16 22:32:34,471 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> grunt>
> -----
> 
> I am guessing I am doing something fundamentally wrong. How do I change
> Pig's settings?
> 
> More info: using Cloudera package hadoop-pig from CDH3b3
> (0.7.0+16-1~lenny-cdh3b3). I would appreciate some pointers.
> 
> Kind regards,
> 
> Anze
