Our production environment has undergone software upgrades and now I'm working
with:
Hadoop 0.20.2-cdh3u0
Apache Pig version 0.8.0-cdh3u0
HBase 0.90.1-cdh3u0
My research indicates that these all OUGHT to play together nicely... I would kill for someone to
publish a compatibility grid for the various versions.
Anyway, I'm trying to load from HBase:
visitors = LOAD 'hbase://track'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'open:browser open:ip open:os open:createdDate', '-caching 1000')
    AS (browser:chararray, ipAddress:chararray, os:chararray, createdDate:chararray);
And I'm receiving the following error which, from what I can find searching around, seems to be
indicative of a compatibility issue between Pig and Hadoop:
ERROR 2999: Unexpected internal error. Failed to create DataStorage
java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:196)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:116)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
    at org.apache.pig.PigServer.<init>(PigServer.java:243)
    at org.apache.pig.PigServer.<init>(PigServer.java:228)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:46)
    at org.apache.pig.Main.run(Main.java:545)
    at org.apache.pig.Main.main(Main.java:108)
Caused by: java.io.IOException: Call to hadoop001/10.0.0.51:8020 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
    at org.apache.hadoop.ipc.Client.call(Client.java:743)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
    ... 9 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
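One thing I notice in the trace: the EOFException is thrown while the DFSClient is negotiating the
RPC protocol version with the NameNode (hadoop001:8020), so HBaseStorage never even gets involved.
I'd expect a trivial HDFS-only script like the sketch below to fail the same way (the path
/tmp/anything.txt is just a placeholder for any small file already sitting on HDFS):

    -- touches only HDFS, no HBase involved
    raw = LOAD '/tmp/anything.txt' USING PigStorage() AS (line:chararray);
    DUMP raw;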
Am I actually running incompatible versions? Should I bug the Cloudera folks?
--
Jameson Lopp
Software Engineer
Bronto Software, Inc.