Daniel,

Here is what I did: the jar is already built by Cloudera, so I did
mv hadoop-core-0.20.2+737.jar hadoop20.jar into Pig's lib dir.
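For anyone following along in the archives, that swap can be sketched as a shell snippet. This demo uses empty stand-in files in a scratch directory so it is safe to run anywhere; on the real box the paths would be ~/pig-0.7.0/lib and $HADOOP_HOME, per the listings later in this thread:

```shell
# Sketch of the jar swap: replace Pig's bundled Apache client jar
# (hadoop20.jar) with the CDH-built hadoop-core jar.
# Stand-in files in a scratch dir so this is safe to execute anywhere;
# on a real install use ~/pig-0.7.0/lib and /usr/lib/hadoop-0.20.
SCRATCH=$(mktemp -d)
PIG_LIB=$SCRATCH/pig-0.7.0/lib
HADOOP_HOME=$SCRATCH/hadoop-0.20
mkdir -p "$PIG_LIB" "$HADOOP_HOME"
touch "$PIG_LIB/hadoop20.jar"                    # stands in for Pig's bundled Apache jar
touch "$HADOOP_HOME/hadoop-core-0.20.2+737.jar"  # stands in for the CDH-built jar

mv "$PIG_LIB/hadoop20.jar" "$PIG_LIB/hadoop20.jar.apache"  # keep a backup of the original
cp "$HADOOP_HOME/hadoop-core-0.20.2+737.jar" "$PIG_LIB/hadoop20.jar"
ls "$PIG_LIB"
```

After the copy, Pig's lib dir holds the CDH client classes under the hadoop20.jar name that Pig's launcher expects, plus the renamed Apache original as a backup.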
Then I did:

java -Dfs.default.name=hdfs://localhost:8020 -Dmapred.job.tracker=localhost:8021 -jar pig-0.7.0-core.jar
10/12/10 14:21:42 INFO pig.Main: Logging error messages to: /home/felix/pig-0.7.0/pig_1292008902688.log
2010-12-10 14:21:43,014 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
2010-12-10 14:21:43,275 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage

It seems that still doesn't fix my problem.

Felix

On Fri, Dec 10, 2010 at 11:10 AM, Daniel Dai <[email protected]> wrote:

> I didn't use the Cloudera distribution before. Pig bundles the Apache
> hadoop 0.20.2 client library. If Cloudera made some changes to hadoop,
> that could be an issue.
>
> One thing you can try is to build hadoop20.jar yourself
> (http://behemoth.strlen.net/~alex/hadoop20-pig-howto.txt) and put it in
> lib (replacing the original hadoop20.jar).
>
> Daniel
>
> felix gao wrote:
>
>> Daniel,
>>
>> No, I am using 0.20.2 from Cloudera.
>> Here are all the jars under Pig's lib:
>>
>> $ ls ~/pig-0.7.0/lib
>> automaton.jar  hadoop-LICENSE.txt  hadoop-lzo.jar  hadoop18.jar
>> hadoop20.jar  hbase-0.20.0-test.jar  hbase-0.20.0.jar  jdiff
>> zookeeper-hbase-1329.jar
>>
>> $ ls $HADOOP_HOME
>> CHANGES.txt  build.xml  hadoop-0.20.2+737-ant.jar
>> hadoop-ant-0.20.2+737.jar  hadoop-examples.jar  ivy  webapps
>> LICENSE.txt  cloudera  hadoop-0.20.2+737-core.jar  hadoop-ant.jar
>> hadoop-test-0.20.2+737.jar  ivy.xml
>> NOTICE.txt  conf  hadoop-0.20.2+737-examples.jar
>> hadoop-core-0.20.2+737.jar  hadoop-test.jar  lib
>> README.txt  contrib  hadoop-0.20.2+737-test.jar  hadoop-core.jar
>> hadoop-tools-0.20.2+737.jar  logs
>> bin  example-confs  hadoop-0.20.2+737-tools.jar
>> hadoop-examples-0.20.2+737.jar  hadoop-tools.jar  pids
>>
>> $ ls $HADOOP_HOME/lib
>> aspectjrt-1.6.5.jar  commons-logging-api-1.0.4.jar
>> jackson-mapper-asl-1.5.2.jar  junit-4.5.jar  servlet-api-2.5-6.1.14.jar
>> aspectjtools-1.6.5.jar  commons-net-1.4.1.jar
>> jasper-compiler-5.5.12.jar  kfs-0.2.2.jar  slf4j-api-1.4.3.jar
>> commons-cli-1.2.jar  core-3.1.1.jar  jasper-runtime-5.5.12.jar
>> kfs-0.2.LICENSE.txt  slf4j-log4j12-1.4.3.jar
>> commons-codec-1.4.jar  hadoop-fairscheduler-0.20.2+737.jar  jdiff
>> log4j-1.2.15.jar  xmlenc-0.52.jar
>> commons-daemon-1.0.1.jar  hadoop-lzo-0.4.6.jar  jets3t-0.6.1.jar
>> mockito-all-1.8.2.jar
>> commons-el-1.0.jar  hsqldb-1.8.0.10.LICENSE.txt  jetty-6.1.14.jar
>> mysql-connector-java-5.0.8-bin.jar
>> commons-httpclient-3.0.1.jar  hsqldb-1.8.0.10.jar
>> jetty-util-6.1.14.jar  native
>> commons-logging-1.0.4.jar  jackson-core-asl-1.5.2.jar  jsp-2.1
>> oro-2.0.8.jar
>>
>> Please tell me how to get this working with Pig.
>>
>> Thanks,
>>
>> Felix
>>
>> On Fri, Dec 10, 2010 at 12:20 AM, Daniel Dai <[email protected]> wrote:
>>
>>> Looks like the hadoop client jar does not match the version of the
>>> server side. Are you using hadoop 0.20.2 from Apache?
>>>
>>> Daniel
>>>
>>> -----Original Message-----
>>> From: felix gao
>>> Sent: Thursday, December 09, 2010 5:48 PM
>>> To: [email protected]
>>> Subject: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
>>>
>>> I keep seeing a "Failed to create DataStorage" error when trying to run Pig:
>>>
>>> $ java -cp pig-0.7.0-core.jar:$HADOOP_CONF_DIR org.apache.pig.Main -x mapreduce
>>> 10/12/09 20:35:31 INFO pig.Main: Logging error messages to: /home/testpig/pig-0.7.0/pig_1291944931735.log
>>> 2010-12-09 20:35:31,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
>>> 2010-12-09 20:35:32,333 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>
>>> $ cat pig_1291944931735.log
>>> Error before Pig is launched
>>> ----------------------------
>>> ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>
>>> java.lang.RuntimeException: Failed to create DataStorage
>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
>>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:216)
>>>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:126)
>>>     at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
>>>     at org.apache.pig.PigServer.<init>(PigServer.java:184)
>>>     at org.apache.pig.PigServer.<init>(PigServer.java:173)
>>>     at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
>>>     at org.apache.pig.Main.main(Main.java:354)
>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>     at $Proxy0.getProtocolVersion(Unknown Source)
>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>>>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
>>>     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
>>>     ... 8 more
>>> Caused by: java.io.EOFException
>>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>>>
>>> If I run java -cp pig-0.7.0-core.jar org.apache.pig.Main -x mapreduce, I can at least see the grunt shell.
>>>
>>> However, when using hadoop commands:
>>>
>>> $ hadoop fs -ls
>>> Found 1 items
>>> -rw-r--r--   1 testpig supergroup     454557 2010-12-09 19:31 /user/testpig/access_log.2010-08-30-23-01.lzo
>>>
>>> everything seems to be fine connecting to hdfs.
>>>
>>> My environment has the following settings:
>>>
>>> PIG_HOME=/home/testpig/pig-0.7.0
>>> HADOOP_HOME=/usr/lib/hadoop-0.20  (Cloudera distribution)
>>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20/conf
>>> JAVA_HOME=/usr/java/default
>>>
>>> pig-env.sh has the following settings:
>>>
>>> export PIG_OPTS="$PIG_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64"
>>> export PIG_CLASSPATH=$PIG_CLASSPATH:/home/testpig/hadoop-lzo.jar:/home/testpig/elephant-bird.jar:/home/testpig/elephant-bird/lib/*
>>> export PIG_HADOOP_VERSION=20
>>>
>>> What is going on there?
>>>
>>> Thanks a lot.
>>>
>>> Felix
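A closing note for the archives: the java.io.EOFException under getProtocolVersion in the trace above is the classic symptom of a Hadoop RPC client/server version mismatch, which matches Daniel's diagnosis. One workaround worth trying (a sketch only, not verified against CDH) is to launch Pig with the cluster's own client jar and conf dir ahead of everything else on the classpath, so the RPC wire format matches the server. The paths below are the ones reported in this thread; the echo is just a sanity check of the ordering:

```shell
# Build a classpath that puts the cluster's own Hadoop client classes and
# configuration first, so the client speaks the server's RPC version.
# Paths are the ones from this thread (CDH under /usr/lib/hadoop-0.20,
# Pig under /home/testpig/pig-0.7.0); adjust for your install.
HADOOP_HOME=/usr/lib/hadoop-0.20
CP=$HADOOP_HOME/hadoop-core.jar                    # CDH client classes first
CP=$CP:$HADOOP_HOME/conf                           # cluster config (fs.default.name etc.)
CP=$CP:/home/testpig/pig-0.7.0/pig-0.7.0-core.jar  # Pig itself
for j in "$HADOOP_HOME"/lib/*.jar; do CP=$CP:$j; done   # CDH's dependency jars
echo "$CP"
# Then launch Pig with it:
# java -cp "$CP" org.apache.pig.Main -x mapreduce
```

The ordering matters because the JVM resolves classes from the first jar on the classpath that provides them; with hadoop-core.jar first, the Apache 0.20.2 classes inside any Pig jar never get loaded.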
