Your Mac needs the Hadoop configuration files (e.g. hdfs-site.xml, mapred-site.xml, core-site.xml, depending on the Hadoop version) available somewhere on Pig's classpath. It may be enough to simply copy them directly from one of the remote machines.
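For example, a minimal sketch (the hostname and remote config path below are placeholders, adjust them for your cluster):

```shell
# Make a local directory to hold the cluster's Hadoop config files.
mkdir -p ~/hadoop-conf

# Copy the *-site.xml files from the remote machine, e.g. with scp
# (user@remote-hadoop-host and /etc/hadoop/conf are assumptions):
#   scp user@remote-hadoop-host:/etc/hadoop/conf/*-site.xml ~/hadoop-conf/

# Point Pig at that directory; anything in PIG_CLASSPATH is added
# to the classpath by the bin/pig launcher script.
export PIG_CLASSPATH=~/hadoop-conf
```

After that, running `pig` with no arguments should pick up the cluster configuration instead of failing with ERROR 4010.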
--jacob
@thedatachef

On Tue, 2011-02-22 at 17:12 +0530, rashmi behera wrote:
> Hi,
>
> I am new to the HBase/Hadoop concept. The scenario is as follows:
>
> 1) Our Hadoop is installed on a remote system. Data is loaded into
> HBase through the HBase writer.
>
> 2) I am trying to install Pig on my local Mac OS X (version 10.6.5) so
> that I can fetch data from that remote system. I downloaded the latest
> Pig release from http://pig.apache.org/releases.html (17 December,
> 2010: release 0.8.0 available).
>
> I did the following:
>
> supp:~ rashmi$ export PATH=/Users/rashmi/Desktop/pig-0.8.0/bin:$PATH
> supp:~ rashmi$ pig -help
> Error: JAVA_HOME is not set.
> supp:~ rashmi$ export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
>
> When I ran pig -help I got the following output:
>
> supp:~ rashmi$ pig -help
>
> Apache Pig version 0.8.0 (r1043805)
> compiled Dec 08 2010, 17:26:09
>
> USAGE: Pig [options] [-] : Run interactively in grunt shell.
>        Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
>        Pig [options] [-f[ile]] file : Run cmds found in file.
> options include:
>     -4, -log4jconf - Log4j configuration file, overrides log conf
>     -b, -brief - Brief logging (no timestamps)
>     -c, -check - Syntax check
>     -d, -debug - Debug level, INFO is default
>     -e, -execute - Commands to execute (within quotes)
>     -f, -file - Path to the script to execute
>     -h, -help - Display this message. You can specify topic to get help
>         for that topic. properties is the only topic currently
>         supported: -h properties.
>     -i, -version - Display version information
>     -l, -logfile - Path to client side log file; default is current
>         working directory.
>     -m, -param_file - Path to the parameter file
>     -p, -param - Key value pair of the form param=val
>     -r, -dryrun - Produces script with substituted parameters. Script
>         is not executed.
>     -t, -optimizer_off - Turn optimizations off. The following values
>         are supported:
>         SplitFilter - Split filter conditions
>         MergeFilter - Merge filter conditions
>         PushUpFilter - Filter as early as possible
>         PushDownForeachFlatten - Join or explode as late as possible
>         ColumnMapKeyPrune - Remove unused data
>         LimitOptimizer - Limit as early as possible
>         AddForEach - Add ForEach to remove unneeded columns
>         MergeForEach - Merge adjacent ForEach
>         LogicalExpressionSimplifier - Combine multiple expressions
>         All - Disable all optimizations
>         All optimizations are enabled by default. Optimization values
>         are case insensitive.
>     -v, -verbose - Print all error messages to screen
>     -w, -warning - Turn warning logging on; also turns warning
>         aggregation off
>     -x, -exectype - Set execution mode: local|mapreduce, default is
>         mapreduce.
>     -F, -stop_on_failure - Aborts execution on the first failed job;
>         default is off
>     -M, -no_multiquery - Turn multiquery optimization off; default is on
>     -P, -propertyFile - Path to property file
>
> When I ran the pig command I got the following error:
>
> supp:~ rashmi$ pig
> 2011-02-22 12:48:26,319 [main] INFO  org.apache.pig.Main - Logging error
> messages to: /Users/rashmi/pig_1298359106317.log
> 2011-02-22 12:48:26,474 [main] ERROR org.apache.pig.Main - ERROR 4010:
> Cannot find hadoop configurations in classpath (neither hadoop-site.xml
> nor core-site.xml was found in the classpath). If you plan to use local
> mode, please put -x local option in command line
> Details at logfile: /Users/rashmi/pig_1298359106317.log
>
> My question is:
>
> 1) What do I need to do so that I can connect to the remote Hadoop
> system and fetch data? I read the documentation for this but couldn't
> get a clear idea, maybe because I am not a Java developer. Could you
> please explain what changes I need to make in my case? I would be
> highly grateful for this.
