Dear everyone,

I am currently working with SparkR on a cluster. These are the package
versions installed on the cluster:
----
1) Hadoop and Yarn:
Hadoop 2.5.0-cdh5.3.1
Subversion http://github.com/cloudera/hadoop -r
4cda8416c73034b59cc8baafbe3666b074472846
Compiled by jenkins on 2015-01-28T00:46Z
Compiled with protoc 2.5.0
From source with checksum 6a018149a764de4b8992755df9a2a1b

2) Spark: version 1.2.0
For the SparkR installation, I followed the guide at
https://github.com/amplab-extras/SparkR-pkg and cloned the SparkR-pkg
repository. Then, inside SparkR-pkg, I ran the following (a combined form
of the call is noted right after this block):
SPARK_VERSION=1.2.0 ./install-dev.sh
SPARK_HADOOP_VERSION=2.5.0-cdh5.3.1 ./install-dev.sh
----
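
I am not sure whether the two variables have to be set in the same run; if
they do, I guess the combined invocation would look something like the line
below (this is only my assumption, I have not verified it):

SPARK_VERSION=1.2.0 SPARK_HADOOP_VERSION=2.5.0-cdh5.3.1 ./install-dev.sh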

After the installation, I tested SparkR as follows:
MASTER=spark://xxx:7077 ./sparkR
R> rdd <- parallelize(sc, 1:10)
R> partitionSum <- lapplyPartition(rdd, function(part) { Reduce("+", part) })
R> collect(partitionSum) # 15, 40

I got the expected result. However, whenever I try to read a file, either
from HDFS or from the local file system, it fails. For example:
R> lines <- textFile(sc, "hdfs://xxx:8020/user/lala/simulation/README.md")
R> count(lines)
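
As a side note, I assume the same URI can be checked directly with the
HDFS command-line client, e.g.:

hdfs dfs -ls hdfs://xxx:8020/user/lala/simulation/README.md

I mention this only to show the exact path I am passing to textFile.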

This is the error output I got:
------
collect on 2 failed with java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.handleMethodCall(SparkRBackendHandler.scala:111)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:58)
        at edu.berkeley.cs.amplab.sparkr.SparkRBackendHandler.channelRead0(SparkRBackendHandler.scala:19)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
        at org.apache.hadoop.ipc.Client.call(Client.java:1070)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:201)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
        at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:203)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1328)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:780)
        at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:309)
        at org.apache.spark.api.java.JavaRDD.collect(JavaRDD.scala:32)
        ... 25 more
Error: returnStatus == 0 is not TRUE

-----
I have read some comments about this error saying that it is caused by
different versions between the master node and its workers. However, I am
not so sure; maybe there is another reason. Moreover, I do not know how to
solve it, so I am looking forward to your ideas and advice.
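
In case it helps with the diagnosis, I suppose I could compare the Hadoop
version the cluster reports with the Hadoop client jars that the SparkR
build pulled in, roughly like this (the path below is only my guess, since
I do not know exactly where install-dev.sh puts its dependencies):

# Hadoop version as reported by the cluster
hadoop version
# Hadoop jars bundled by the SparkR build (location is an assumption)
find SparkR-pkg -name 'hadoop-*.jar'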

Many thanks in advance,

Regards,

Lala SR
