Hi there.

I am well under way with comparing Pig, Hive, JAQL, etc.

The DataGenerator is proving a valuable tool for me. Thanks for that.

I have one query. I am able to use it in local mode, no problem, and some
experiments are complete.

However, I cannot seem to use it in MapReduce mode on the cluster. These are
the contents of my "generateData" file:
------------------
export pigjar=$HOME/installation/pig/pig-0.5.0/pig-0.5.0-core.jar
export zipfjar=$HOME/installation/pig/pig-0.5.0/sdsuLibJKD14.jar
export datagenjar=$HOME/rs46/installation/DataGenerator/dist/MyPig.jar
export conf_file=/usr/lib/hadoop/conf/hadoop-site.xml
export HADOOP_CLASSPATH=$pigjar:$zipfjar:$datagenjar
/usr/lib/hadoop/bin/hadoop jar $datagenjar \
  org.apache.pig.test.utils.datagen.DataGenerator -conf $conf_file -m 1 \
  -rows 10000000 -f words.dat s:8:50:z:0
------------------

The error I receive when running it with the "-m 1" option (in cluster
mode) is:
Caused by: java.lang.ClassNotFoundException: sdsu.algorithms.data.Zipf

So in local mode it successfully picks up the jar file sdsuLibJKD14.jar,
but when running in cluster mode, the class from that jar is not found on
the classpath?
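
A check along the following lines would confirm that the jar itself contains
the missing class, which would suggest the problem is with the classpath seen
by the tasks on the cluster rather than with the jar. This is only a sketch:
has_class is a hypothetical helper name, and the usage line assumes the JDK
"jar" tool is on the PATH and that $zipfjar is set as in the script above.

```shell
# Hypothetical helper: reads a jar listing (one entry per line) on stdin and
# succeeds if the given fully qualified class name appears as a .class entry.
has_class() {
  # Turn e.g. sdsu.algorithms.data.Zipf into sdsu/algorithms/data/Zipf.class
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  # -q: no output, exit status only; -F: match the entry as a fixed string
  grep -qF "$entry"
}

# Usage (assumes $zipfjar is exported as in generateData):
#   jar tf "$zipfjar" | has_class sdsu.algorithms.data.Zipf && echo "class is in the jar"
```

If the class is listed, the jar is fine on the client side and the question
is only how it reaches the map tasks.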


Thanks.

Rob Stewart
