Hi all,

I have built Shark 0.9.1 with sbt, using the command below:

SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.6.0 sbt/sbt assembly

My Hadoop cluster is also running version 2.0.0-mr1-cdh4.6.0.

But when I run the commands below from the Spark shell, which read a file
from HDFS, I get a "Server IPC version 7 cannot communicate with client
version 4" error from org.apache.hadoop.hdfs.DFSClient:

scala> val s = sc.textFile("hdfs://host:port/test.txt")
scala> s.count()
14/06/10 23:42:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/10 23:42:59 WARN snappy.LoadSnappy: Snappy native library not loaded
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)


Apparently this error comes from a version mismatch of the Hadoop HDFS jar
between the client (the one Spark refers to) and the server (the Hadoop
cluster): the message suggests Spark is still linking against an older
Hadoop 1.x client while the cluster speaks the newer CDH4 wire protocol.
What I don't understand is why there is a mismatch at all, since I built
Spark with the correct Hadoop version.
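In case it helps narrow this down, Hadoop's org.apache.hadoop.util.VersionInfo
API can report which Hadoop client version actually ended up on the driver's
classpath. A quick check, run from the same Spark shell:

```scala
// Print the Hadoop version the Spark shell is actually using.
// If this shows a 1.x version while the cluster runs 2.0.0-mr1-cdh4.6.0,
// the wrong Hadoop client jar is on the classpath (or shadowing the right one).
import org.apache.hadoop.util.VersionInfo

println("Hadoop client version: " + VersionInfo.getVersion)
```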

Any suggestions would be highly appreciated.

Thanks
Bijoy
