This has to be a classpath issue, with a different Hadoop version being
picked up. I can easily reproduce the same error on my end (copied at the
end of this email).

Regarding your other question: running Pig from the command line will let
you run UDFs and provides most of the functionality you would have
otherwise. Note that complex looping sometimes requires Java, Python, or
another Pig-supported language. If you don't have such a requirement, the
command line will work just fine (it is also used far more commonly than
Pig's Java APIs).

In any case, I would recommend checking the classpath for your issue.
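To make the classpath check concrete, here is a minimal sketch of how the classpath could be assembled when launching Pig via a plain `java` invocation against a 0.20.2 cluster. The directory paths and default locations are assumptions based on this thread, not a definitive recipe; adjust them to your install.

```shell
#!/bin/sh
# Sketch only (paths are assumptions): pig-withouthadoop.jar must be paired
# with the cluster's own Hadoop jars (0.20.2-cdh3u3 in this thread).
# Otherwise Pig picks up the Hadoop 1.0 classes bundled in pig.jar and the
# RPC handshake fails with the EOFException shown in the trace below.
HADOOP_HOME=${HADOOP_HOME:-/usr/lib/hadoop}          # assumed install dir
PIG_HOME=${PIG_HOME:-$HOME/Downloads/pig-0.11.0-src} # assumed Pig checkout

# Pig without its bundled Hadoop, then the cluster's Hadoop core jar,
# then the conf dir so fs.default.name etc. are resolved.
CLASSPATH="$PIG_HOME/pig-withouthadoop.jar"
CLASSPATH="$CLASSPATH:$HADOOP_HOME/hadoop-core-0.20.2-cdh3u3.jar"
CLASSPATH="$CLASSPATH:$HADOOP_CONF_DIR"

echo "$CLASSPATH"
# The launch would then look like:
#   java -cp "target/mybigjar-1.1-SNAPSHOT-jar-with-dependencies.jar:$CLASSPATH" \
#       org.apache.pig.Main
```

The key point is ordering and exclusivity: exactly one `hadoop-core` jar should be reachable, and it must match the cluster's version.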

Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Failed to create DataStorage

java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:205)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:118)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:208)
    at org.apache.pig.PigServer.<init>(PigServer.java:246)
    at org.apache.pig.PigServer.<init>(PigServer.java:231)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:47)
    at org.apache.pig.Main.run(Main.java:487)
    at org.apache.pig.Main.main(Main.java:111)
Caused by: java.io.IOException: Call to x.y.z.net/XX.XX.XX.XX:54310 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
    ... 9 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
================================================================================


On Thu, Mar 21, 2013 at 2:15 PM, Ryan Compton <[email protected]> wrote:

> java -cp
> target/mybigjar-1.1-SNAPSHOT-jar-with-dependencies.jar:$HADOOP_CONF_DIR:/home/rfcompton/Downloads/pig-0.11.0-src/pig-withouthadoop.jar
> org.apache.pig.Main
>
> Still can't create DataStorage. How far will I be able to get with pig
> if I restrict myself to the command line? UDFs will still work, right?
>
> On Thu, Mar 21, 2013 at 1:54 PM, Prashant Kommireddi
> <[email protected]> wrote:
> > Apologies, just read your email again to realize your Java program is
> > what's not working. Your command-line setup seems fine.
> >
> > Can you try running your program with
> > /home/rfcompton/Downloads/pig-0.11.0-withouthadoop.jar instead of
> > /home/rfcompton/Downloads/pig-0.11.0-src/pig.jar and make sure 0.20.2
> > hadoop is on the classpath.
> >
> >
> > On Thu, Mar 21, 2013 at 1:36 PM, Ryan Compton <[email protected]
> >wrote:
> >
> >> -bash-3.2$ pig -secretDebugCmd
> >> Find hadoop at /usr/bin/hadoop
> >> dry run:
> >> HADOOP_CLASSPATH:
> >>
> >>
> /data/osi/cm-gui/hbase-global-clientconfig/:/home/rfcompton/Downloads/pig-0.11.0-src/bin/../conf:/usr/java/jdk1.6.0_26/lib/tools.jar:/data/osi/cm-gui/global-clientconfig/hadoop-conf/:/home/rfcompton/Downloads/pig-0.11.0-src/bin/../build/ivy/lib/Pig/jython-standalone-2.5.2.jar:/home/isl/rfcompton/Downloads/pig-0.11.0-src/bin/../build/ivy/lib/Pig/jruby-complete-1.6.7.jar:/home/rfcompton/Downloads/pig-0.11.0-src/bin/../pig-withouthadoop.jar:
> >> HADOOP_OPTS: -Xmx1000m
> >> -Dpig.log.dir=/home/rfcompton/Downloads/pig-0.11.0-src/bin/../logs
> >> -Dpig.log.file=pig.log
> >> -Dpig.home.dir=/home/rfcompton/Downloads/pig-0.11.0-src/bin/..
> >> /usr/bin/hadoop jar
> >> /home/rfcompton/Downloads/pig-0.11.0-src/bin/../pig-withouthadoop.jar
> >>
> >> On Thu, Mar 21, 2013 at 1:34 PM, Prashant Kommireddi
> >> <[email protected]> wrote:
> >> > What is the output of "pig -secretDebugCmd"
> >> >
> >> > On Thu, Mar 21, 2013 at 1:29 PM, Ryan Compton <[email protected]
> >> >wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> Hmm, I've got that much:
> >> >>
> >> >> -bash-3.2$ ls $HADOOP_HOME | grep cdh3u3
> >> >> hadoop-0.20.2-cdh3u3-ant.jar
> >> >> hadoop-0.20.2-cdh3u3-core.jar
> >> >> hadoop-0.20.2-cdh3u3-examples.jar
> >> >> hadoop-0.20.2-cdh3u3-test.jar
> >> >> hadoop-0.20.2-cdh3u3-tools.jar
> >> >> hadoop-ant-0.20.2-cdh3u3.jar
> >> >> hadoop-core-0.20.2-cdh3u3.jar
> >> >> hadoop-examples-0.20.2-cdh3u3.jar
> >> >> hadoop-test-0.20.2-cdh3u3.jar
> >> >> hadoop-tools-0.20.2-cdh3u3.jar
> >> >>
> >> >> but I still have the same problem. More info:
> >> http://pastebin.com/MfUHwu0X
> >> >>
> >> >> On Thu, Mar 21, 2013 at 1:16 PM, Prashant Kommireddi
> >> >> <[email protected]> wrote:
> >> >> > Hi Ryan,
> >> >> >
> >> >> > Seems like you are trying to connect to a hadoop cluster running
> >> 0.20.2.
> >> >> > Pig by default uses hadoop 1.0 unless you specify otherwise.
> >> >> >
> >> >> > You should point HADOOP_HOME to your hadoop installation dir
> >> >> > (0.20.2-cdh3u3) from where you are launching pig
> >> >> >
> >> >> > export HADOOP_HOME=<path_to_0.20.2-cdh3u3>
> >> >> >
> >> >> >
> >> >> > On Thu, Mar 21, 2013 at 12:59 PM, Ryan Compton <
> >> [email protected]
> >> >> >wrote:
> >> >> >
> >> >> >> I can start a grunt shell just fine:
> >> >> >>
> >> >> >> -bash-3.2$ pwd
> >> >> >> /home/rfcompton/Downloads/pig-0.11.0-src
> >> >> >> -bash-3.2$ ./bin/pig
> >> >> >> 2013-03-21 12:55:00,048 [main] INFO  org.apache.pig.Main - Apache
> Pig
> >> >> >> version 0.11.1-SNAPSHOT (rexported) compiled Mar 21 2013, 12:49:21
> >> >> >> 2013-03-21 12:55:00,049 [main] INFO  org.apache.pig.Main - Logging
> >> >> >> error messages to:
> >> >> >> /home/rfcompton/Downloads/pig-0.11.0-src/pig_1363895700046.log
> >> >> >> 2013-03-21 12:55:00,076 [main] INFO
>  org.apache.pig.impl.util.Utils -
> >> >> >> Default bootup file /home/rfcompton/.pigbootup not found
> >> >> >> 2013-03-21 12:55:00,330 [main] INFO
> >> >> >> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> >> >> >> Connecting to hadoop file system at: hdfs://master:8020/
> >> >> >> 2013-03-21 12:55:00,581 [main] INFO
> >> >> >> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> >> >> >> Connecting to map-reduce job tracker at: node4:8021
> >> >> >> grunt>
> >> >> >>
> >> >> >> But I can't run a java program:
> >> >> >>
> >> >> >> -bash-3.2$ java -cp
> >> >> >>
> >> >> >>
> >> >>
> >>
> target/mybigjar-1.1-SNAPSHOT-jar-with-dependencies.jar:$HADOOP_CONF_DIR:/home/rfcompton/Downloads/pig-0.11.0-src/pig.jar
> >> >> >> org.apache.pig.Main
> >> >> >> 2013-03-21 12:55:58,806 [main] INFO  org.apache.pig.Main - Apache
> Pig
> >> >> >> version 0.11.1-SNAPSHOT (rexported) compiled Mar 21 2013, 12:49:21
> >> >> >> 2013-03-21 12:55:58,806 [main] INFO  org.apache.pig.Main - Logging
> >> >> >> error messages to: /home/rfcompton/pig_1363895758801.log
> >> >> >> 2013-03-21 12:55:58,829 [main] INFO
>  org.apache.pig.impl.util.Utils -
> >> >> >> Default bootup file /home/rfcompton/.pigbootup not found
> >> >> >> 2013-03-21 12:55:59,086 [main] INFO
> >> >> >> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> >> >> >> Connecting to hadoop file system at: hdfs://master:8020/
> >> >> >> 2013-03-21 12:56:03,844 [main] ERROR org.apache.pig.Main - ERROR
> >> 2999:
> >> >> >> Unexpected internal error. Failed to create DataStorage
> >> >> >> Details at logfile:
> >> >> >> /home/rfcompton/myshrep/OSI/Code/geocoderV2/pig_1363895758801.log
> >> >> >> -bash-3.2$
> >> >> >>
> >> >> >> So I did this:
> >> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AWhatshallIdoifIsaw%22FailedtocreateDataStorage%22%3F
> >> >> >> and it didn't help.
> >> >> >>
> >> >> >> If it matters,
> >> >> >>
> >> >> >> -bash-3.2$ hadoop version
> >> >> >> Hadoop 0.20.2-cdh3u3
> >> >> >>
> >> >>
> >>
>