Thanks for the clarification, Dmitriy. I will try that out when I get the chance.

Felix

On Fri, Dec 10, 2010 at 8:31 PM, Dmitriy Ryaboy <[email protected]> wrote:

> The CDH3 distribution has security patched in; my understanding is that this
> changes the protocol, and both your server and client libraries must be
> compatible.
> You don't need the Cloudera version of pig, I think, but you do need their
> version of the Hadoop jars on both sides -- so you can't take the fat
> pig.jar, but must use the pig-nohadoop.jar version, and put the Cloudera
> Hadoop jars on your classpath.
>
> -D
>
> On Fri, Dec 10, 2010 at 11:46 AM, felix gao <[email protected]> wrote:
>
> > I just fixed the problem.
> > I am using CDH3b2. Apparently Cloudera have their own pig distribution.
> > There are some major patches going on for their version of pig 0.7:
> > 0011-PIG-1452-to-remove-hadoop20.jar-from-lib-and-use-had.patch
> > 0012-CLOUDERA-BUILD.-Build-pig-against-CDH3b3-snapshot.patch
> >
> > Now I am really confused about which version to use from now on.
> >
> > Thanks for the help.
> >
> > Felix
> >
> > On Fri, Dec 10, 2010 at 11:30 AM, Daniel Dai <[email protected]> wrote:
> >
> > > hadoop20.jar is more than hadoop-core.jar, it includes all hadoop classes
> > > and dependent libraries. Where did you get hadoop? Is that from CDH? Which
> > > version is it?
> > >
> > > Daniel
> > >
> > > felix gao wrote:
> > >
> > >> Daniel,
> > >>
> > >> Here is what I did. The jar is already built by Cloudera, so I did
> > >> mv hadoop-core-0.20.2+737.jar hadoop20.jar to pig's lib dir
> > >>
> > >> then I did
> > >> java -Dfs.default.name=hdfs://localhost:8020
> > >> -Dmapred.job.tracker=localhost:8021 -jar pig-0.7.0-core.jar
> > >> 10/12/10 14:21:42 INFO pig.Main: Logging error messages to:
> > >> /home/felix/pig-0.7.0/pig_1292008902688.log
> > >> 2010-12-10 14:21:43,014 [main] INFO
> > >> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > >> Connecting to hadoop file system at: hdfs://localhost:8020
> > >> 2010-12-10 14:21:43,275 [main] ERROR org.apache.pig.Main - ERROR 2999:
> > >> Unexpected internal error. Failed to create DataStorage
> > >>
> > >> It seems that still doesn't fix my problem.
> > >>
> > >> Felix
> > >>
> > >> On Fri, Dec 10, 2010 at 11:10 AM, Daniel Dai <[email protected]> wrote:
> > >>
> > >>> I haven't used the Cloudera distribution before. Pig bundles the Apache
> > >>> hadoop 0.20.2 client library. If Cloudera made some changes to hadoop,
> > >>> that could be an issue.
> > >>>
> > >>> One thing you can try is to build hadoop20.jar yourself
> > >>> (http://behemoth.strlen.net/~alex/hadoop20-pig-howto.txt) and
> > >>> put it in lib (replacing the original hadoop20.jar).
> > >>>
> > >>> Daniel
> > >>>
> > >>> felix gao wrote:
> > >>>
> > >>>> Daniel,
> > >>>>
> > >>>> No, I am using 0.20.2 from Cloudera.
> > >>>> Here are all the jars under pig's lib
> > >>>> $ ls ~/pig-0.7.0/lib
> > >>>> automaton.jar hadoop-LICENSE.txt hadoop-lzo.jar hadoop18.jar hadoop20.jar hbase-0.20.0-test.jar hbase-0.20.0.jar jdiff zookeeper-hbase-1329.jar
> > >>>>
> > >>>> $ ls $HADOOP_HOME
> > >>>> CHANGES.txt build.xml hadoop-0.20.2+737-ant.jar hadoop-ant-0.20.2+737.jar hadoop-examples.jar ivy webapps
> > >>>> LICENSE.txt cloudera hadoop-0.20.2+737-core.jar hadoop-ant.jar hadoop-test-0.20.2+737.jar ivy.xml
> > >>>> NOTICE.txt conf hadoop-0.20.2+737-examples.jar hadoop-core-0.20.2+737.jar hadoop-test.jar lib
> > >>>> README.txt contrib hadoop-0.20.2+737-test.jar hadoop-core.jar hadoop-tools-0.20.2+737.jar logs
> > >>>> bin example-confs hadoop-0.20.2+737-tools.jar hadoop-examples-0.20.2+737.jar hadoop-tools.jar pids
> > >>>>
> > >>>> $ ls $HADOOP_HOME/lib
> > >>>> aspectjrt-1.6.5.jar commons-logging-api-1.0.4.jar jackson-mapper-asl-1.5.2.jar junit-4.5.jar servlet-api-2.5-6.1.14.jar
> > >>>> aspectjtools-1.6.5.jar commons-net-1.4.1.jar jasper-compiler-5.5.12.jar kfs-0.2.2.jar slf4j-api-1.4.3.jar
> > >>>> commons-cli-1.2.jar core-3.1.1.jar jasper-runtime-5.5.12.jar kfs-0.2.LICENSE.txt slf4j-log4j12-1.4.3.jar
> > >>>> commons-codec-1.4.jar hadoop-fairscheduler-0.20.2+737.jar jdiff log4j-1.2.15.jar xmlenc-0.52.jar
> > >>>> commons-daemon-1.0.1.jar hadoop-lzo-0.4.6.jar jets3t-0.6.1.jar mockito-all-1.8.2.jar
> > >>>> commons-el-1.0.jar hsqldb-1.8.0.10.LICENSE.txt jetty-6.1.14.jar mysql-connector-java-5.0.8-bin.jar
> > >>>> commons-httpclient-3.0.1.jar hsqldb-1.8.0.10.jar jetty-util-6.1.14.jar native
> > >>>> commons-logging-1.0.4.jar jackson-core-asl-1.5.2.jar jsp-2.1 oro-2.0.8.jar
> > >>>>
> > >>>> Please tell me how to get this working with pig.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Felix
> > >>>>
> > >>>> On Fri, Dec 10, 2010 at 12:20 AM, Daniel Dai <[email protected]> wrote:
> > >>>>
> > >>>>> Looks like hadoop client jar does not match the version of server side.
> > >>>>> Are you using hadoop 0.20.2 from Apache?
> > >>>>>
> > >>>>> Daniel
> > >>>>>
> > >>>>> -----Original Message----- From: felix gao
> > >>>>> Sent: Thursday, December 09, 2010 5:48 PM
> > >>>>> To: [email protected]
> > >>>>> Subject: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to
> > >>>>> create DataStorage
> > >>>>>
> > >>>>> I keep seeing the "Failed to create DataStorage" error when I try to run pig
> > >>>>>
> > >>>>> $ java -cp pig-0.7.0-core.jar:$HADOOP_CONF_DIR org.apache.pig.Main -x mapreduce
> > >>>>> 10/12/09 20:35:31 INFO pig.Main: Logging error messages to:
> > >>>>> /home/testpig/pig-0.7.0/pig_1291944931735.log
> > >>>>> 2010-12-09 20:35:31,997 [main] INFO
> > >>>>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> > >>>>> Connecting to hadoop file system at: hdfs://localhost:8020
> > >>>>> 2010-12-09 20:35:32,333 [main] ERROR org.apache.pig.Main - ERROR 2999:
> > >>>>> Unexpected internal error. Failed to create DataStorage
> > >>>>>
> > >>>>> $ cat pig_1291944931735.log
> > >>>>> Error before Pig is launched
> > >>>>> ----------------------------
> > >>>>> ERROR 2999: Unexpected internal error. Failed to create DataStorage
> > >>>>>
> > >>>>> java.lang.RuntimeException: Failed to create DataStorage
> > >>>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
> > >>>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
> > >>>>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:216)
> > >>>>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:126)
> > >>>>>     at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
> > >>>>>     at org.apache.pig.PigServer.<init>(PigServer.java:184)
> > >>>>>     at org.apache.pig.PigServer.<init>(PigServer.java:173)
> > >>>>>     at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
> > >>>>>     at org.apache.pig.Main.main(Main.java:354)
> > >>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
> > >>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
> > >>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
> > >>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> > >>>>>     at $Proxy0.getProtocolVersion(Unknown Source)
> > >>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
> > >>>>>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
> > >>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
> > >>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
> > >>>>>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
> > >>>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
> > >>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> > >>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
> > >>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
> > >>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
> > >>>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
> > >>>>>     ... 8 more
> > >>>>> Caused by: java.io.EOFException
> > >>>>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > >>>>>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
> > >>>>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
> > >>>>>
> > >>>>> If I run java -cp pig-0.7.0-core.jar org.apache.pig.Main -x mapreduce
> > >>>>> instead, I can at least see the grunt shell.
> > >>>>>
> > >>>>> However, when using hadoop commands
> > >>>>> $ hadoop fs -ls
> > >>>>> Found 1 items
> > >>>>> -rw-r--r-- 1 testpig supergroup 454557 2010-12-09 19:31 /user/testpig/access_log.2010-08-30-23-01.lzo
> > >>>>>
> > >>>>> everything seems to be fine connecting to hdfs.
> > >>>>>
> > >>>>> My environment has the following settings
> > >>>>> PIG_HOME=/home/testpig/pig-0.7.0
> > >>>>> HADOOP_HOME=/usr/lib/hadoop-0.20 (cloudera distribution)
> > >>>>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20/conf
> > >>>>> JAVA_HOME=/usr/java/default
> > >>>>>
> > >>>>> pig-env.sh has the following settings
> > >>>>> export PIG_OPTS="$PIG_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64"
> > >>>>> export PIG_CLASSPATH=$PIG_CLASSPATH:/home/testpig/hadoop-lzo.jar:/home/testpig/elephant-bird.jar:/home/testpig/elephant-bird/lib/*
> > >>>>> export PIG_HADOOP_VERSION=20
> > >>>>>
> > >>>>> What is going on there?
> > >>>>>
> > >>>>> Thanks a lot.
> > >>>>>
> > >>>>> Felix
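
A minimal sketch of the setup Dmitriy describes above (a Hadoop-free Pig jar plus the Cloudera Hadoop jars on the classpath), written as a small shell helper. Everything here is an assumption for illustration: the function name build_pig_classpath, its argument order, and the jar layout, which follows the $HADOOP_HOME listing quoted in this thread. pig-nohadoop.jar is simply the name used above; the actual file name may differ in your build.

```shell
#!/bin/sh
# Sketch only: assemble a classpath from a Pig jar that does not bundle
# Hadoop, the Hadoop conf dir, and the Hadoop jars installed with the cluster.
# build_pig_classpath is a hypothetical helper, not part of Pig or CDH.
build_pig_classpath() {
    pig_jar=$1        # e.g. the pig-nohadoop.jar mentioned in the thread
    hadoop_home=$2    # e.g. /usr/lib/hadoop-0.20
    hadoop_conf=$3    # e.g. /usr/lib/hadoop-0.20/conf

    classpath="$pig_jar:$hadoop_conf"
    # Add the distribution's versioned core jar, then its dependencies.
    for jar in "$hadoop_home"/hadoop-core-*.jar "$hadoop_home"/lib/*.jar; do
        # A glob that matched nothing stays a literal pattern; skip it.
        if [ -f "$jar" ]; then
            classpath="$classpath:$jar"
        fi
    done
    printf '%s\n' "$classpath"
}

# Intended use (not executed here):
#   java -cp "$(build_pig_classpath pig-nohadoop.jar "$HADOOP_HOME" "$HADOOP_CONF_DIR")" \
#        org.apache.pig.Main -x mapreduce
```

The jars are taken straight from the installed distribution instead of a bundled hadoop20.jar, so the client side should stay in step with whatever Hadoop version the cluster runs.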
