Thanks for the info.
I have not yet verified with the Hadoop list, but it looks like the CDH3b4
0.20.2 hadoop-core.jar is incompatible with the hadoop-core.jar that the Pig
build script pulls in via Ivy. I was able to solve my problem by building Pig
without Hadoop (ant jar-withouthadoop) and then manually including the
'correct' hadoop-core.jar on the classpath. This is a bug, but I don't know
enough about the community to say whose; perhaps Cloudera's?
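For anyone hitting the same thing, a sketch of that workaround, assuming a standard CDH3 package layout (the jar path and conf directory below are examples, not something from my exact setup):

```shell
# Sketch of the workaround described above; paths are examples only.
# 1. Build Pig without bundling a hadoop-core.jar into pig.jar:
ant jar-withouthadoop
# 2. Put the cluster's own hadoop-core.jar (and its conf dir) on Pig's
#    classpath via PIG_CLASSPATH, which bin/pig prepends at launch:
export PIG_CLASSPATH=/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3b4.jar:/etc/hadoop/conf
bin/pig
```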
I would like to point out one bug I found in the Pig build.xml.
The main jar target (buildJar) has the following dependencies:
<zipfileset src="${ivy.lib.dir}/hadoop-core-${hadoop-core.version}.jar" />
<zipfileset src="${lib.dir}/${automaton.jarfile}" />
<zipfileset src="${ivy.lib.dir}/junit-${junit.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jsch-${jsch.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jline-${jline.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-mapper-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-core-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/joda-time-${joda-time.version}.jar" />
<zipfileset src="${ivy.lib.dir}/${guava.jar}" />
<zipgroupfileset dir="${ivy.lib.dir}" includes="commons*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="log4j*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="jsp-api*.jar"/>
Yet in the 0.8.0 tag, the non-hadoop target (jar-withouthadoop) has:
<zipfileset src="${ivy.lib.dir}/junit-${junit.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jsch-${jsch.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jline-${jline.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-mapper-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/jackson-core-asl-${jackson.version}.jar" />
<zipfileset src="${ivy.lib.dir}/joda-time-${joda-time.version}.jar" />
<zipfileset src="${lib.dir}/${automaton.jarfile}" />
Should it not be the same, with the exception of the first line (hadoop-core)?
Among other things, the withouthadoop jar is missing the logging dependencies.
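If so, a fix might be to mirror the remaining buildJar entries in the jar-withouthadoop target; a sketch, assuming the same property names used in the buildJar target above:

```xml
<!-- Sketch only: entries from buildJar that jar-withouthadoop appears to be
     missing (everything except hadoop-core itself). -->
<zipfileset src="${ivy.lib.dir}/${guava.jar}" />
<zipgroupfileset dir="${ivy.lib.dir}" includes="commons*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="log4j*.jar"/>
<zipgroupfileset dir="${ivy.lib.dir}" includes="jsp-api*.jar"/>
```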
Dan
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Josh
Devins
Sent: March-22-11 4:59
To: [email protected]
Subject: Re: Incorrect header or version mismatch
Hey Dan
This usually means that you have mismatched Hadoop jar versions somewhere. I
encountered a similar problem with Oozie trying to talk to HDFS. Maybe try
posting to the Hadoop user list as well. In general, you should just need
the same hadoop-core.jar as on your cluster when you run Pig. From Pig, all
you should need is pig.jar (and piggybank, etc.); the pre-compiled jar
should suffice.
Cheers,
Josh
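[A quick way to act on Josh's advice is to compare the version suffix of the cluster's hadoop-core.jar with the one Ivy pulled into Pig's build. A minimal sketch; the two jar filenames below are illustrative examples, not taken from either machine:

```shell
# Hypothetical jar names: the cluster's jar vs. the one Ivy fetched for Pig.
cluster_jar="hadoop-core-0.20.2-cdh3b4.jar"
pig_ivy_jar="hadoop-core-0.20.2.jar"

# Strip the "hadoop-core-" prefix and ".jar" suffix to isolate the versions.
v1="${cluster_jar#hadoop-core-}"; v1="${v1%.jar}"
v2="${pig_ivy_jar#hadoop-core-}"; v2="${v2%.jar}"

echo "cluster: $v1  pig/ivy: $v2"
# Any difference (even just a vendor suffix like -cdh3b4) is suspect.
[ "$v1" = "$v2" ] || echo "version mismatch: replace the jar on Pig's classpath"
```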
On 21 March 2011 22:56, Dan Hendry <[email protected]> wrote:
> First off, I am fairly new to both Pig and Hadoop. I am having some
> problems connecting Pig to a local Hadoop cluster. I am getting the
> following error in the Hadoop namenode logs whenever I try to start up
> Pig:
>
> 2011-03-21 17:48:17,299 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 127.0.0.1:60928 got version 3 expected version 4
>
> I am using the Cloudera deb repository (CDH3b4), installed according to
> https://docs.cloudera.com/display/DOC/CDH3+Installation+Guide. The Hadoop
> version is 0.20.2, running in pseudo-distributed mode. I am using Pig
> 0.8.0, both the provided tarball and a clone of the 0.8.0 tag compiled
> locally. Any help would be appreciated. I am getting the following error
> in the Pig logs:
>
> Error before Pig is launched
> ----------------------------
> ERROR 2999: Unexpected internal error. Failed to create DataStorage
>
> java.lang.RuntimeException: Failed to create DataStorage
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:213)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:133)
>         at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
>         at org.apache.pig.PigServer.<init>(PigServer.java:225)
>         at org.apache.pig.PigServer.<init>(PigServer.java:214)
>         at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
>         at org.apache.pig.Main.run(Main.java:462)
>         at org.apache.pig.Main.main(Main.java:107)
> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>         at org.apache.hadoop.ipc.Client.call(Client.java:743)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>         at $Proxy0.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>         at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
>         ... 9 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>
>