I wasn't using HBase, but I ran into the same problem. To get around it, I had to create a pig-nohadoop.jar, pass the cluster's hadoop*.jar in on the classpath, and register antlr in Pig. I think it is a Pig/Hadoop compatibility error because I got the same error — but just to be sure, can you run a normal Hadoop job that does not use HBase, to isolate variables?
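For reference, my workaround looked roughly like the sketch below. This is only an outline of what I did, not a recipe: the jar names, versions, and paths are from my environment and will almost certainly differ on yours (the CDH paths in particular are assumptions).

```shell
# Build a Pig jar that does not bundle Hadoop classes, so the client
# uses the cluster's own Hadoop (in Pig 0.8 the ant target is
# "jar-withouthadoop"; the resulting jar name may vary by version).
cd /path/to/pig-src
ant jar-withouthadoop

# Put the cluster's Hadoop conf dir and jars on Pig's classpath so the
# client RPC version matches the NameNode (hypothetical CDH3 paths).
export HADOOP_HOME=/usr/lib/hadoop
export PIG_CLASSPATH="$HADOOP_HOME/conf:$HADOOP_HOME/hadoop-core-0.20.2-cdh3u0.jar"

# Launch grunt with the no-hadoop Pig jar; antlr (Pig's parser
# dependency, normally bundled) must also be on the classpath.
java -cp "pig-withouthadoop.jar:/path/to/antlr-2.7.7.jar:$PIG_CLASSPATH" \
    org.apache.pig.Main
```

The point of the exercise is that the Hadoop IPC client and server must be the exact same version; bundling a different hadoop-core inside the Pig jar is what produces the EOFException on the RPC call.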
2011/5/25 Dmitriy Ryaboy <[email protected]>

> Use Pig 0.8.1
>
> D
>
> On Wed, May 25, 2011 at 2:03 PM, Jameson Lopp <[email protected]> wrote:
> > Our production environment has undergone software upgrades and now I'm
> > working with:
> >
> > Hadoop 0.20.2-cdh3u0
> > Apache Pig version 0.8.0-cdh3u0
> > HBase 0.90.1-cdh3u0
> >
> > My research indicates that these all OUGHT to play together nicely... I
> > would kill for someone to publish a compatibility grid for the misc
> > versions.
> >
> > Anyway, I'm trying to load from HBase:
> >
> > visitors = LOAD 'hbase://track' USING
> >     org.apache.pig.backend.hadoop.hbase.HBaseStorage(
> >         'open:browser open:ip open:os open:createdDate', '-caching 1000')
> >     as (browser:chararray, ipAddress:chararray, os:chararray,
> >         createdDate:chararray);
> >
> > And I'm receiving the following error, which searching around suggests is
> > indicative of compatibility issues between Pig and Hadoop:
> >
> > ERROR 2999: Unexpected internal error. Failed to create DataStorage
> >
> > java.lang.RuntimeException: Failed to create DataStorage
> >     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
> >     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
> >     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:196)
> >     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:116)
> >     at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
> >     at org.apache.pig.PigServer.<init>(PigServer.java:243)
> >     at org.apache.pig.PigServer.<init>(PigServer.java:228)
> >     at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:46)
> >     at org.apache.pig.Main.run(Main.java:545)
> >     at org.apache.pig.Main.main(Main.java:108)
> > Caused by: java.io.IOException: Call to hadoop001/10.0.0.51:8020 failed on local exception: java.io.EOFException
> >     at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
> >     at org.apache.hadoop.ipc.Client.call(Client.java:743)
> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> >     at $Proxy0.getProtocolVersion(Unknown Source)
> >     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
> >     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
> >     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
> >     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
> >     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
> >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
> >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
> >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
> >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
> >     at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
> >     ... 9 more
> > Caused by: java.io.EOFException
> >     at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
> >     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
> >
> > Am I actually running incompatible versions? Should I bug the Cloudera
> > folks?
> > --
> > Jameson Lopp
> > Software Engineer
> > Bronto Software, Inc.
