Help realted with spark streaming usage error

Saurabh Pateriya Tue, 09 Dec 2014 05:47:50 -0800

Hi ,

I am running spark streaming in standalone mode on twitter source into single 
machine(using HDP virtual box)  I receive status from streaming context and I 
can print the same but when I try to save those statuses as RDD into Hadoop 
using rdd.saveAsTextFiles or 
saveAsHadoopFiles("hdfs://10.20.32.204:50070/user/hue/test","txt") I get below 
connection error.
My Hadoop version:2.4.0.2.1.1.0-385
Spark 1.1.0


ERROR-----------------------------------------------------

14/12/09 04:45:12 ERROR scheduler.JobScheduler: Error running job streaming job 
1418129110000 ms.1
java.io.IOException: Call to /10.20.32.204:50070 failed on local exception: 
java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at 
org.apache.hadoop.mapred.SparkHadoopWriter$.createPathFromString(SparkHadoopWriter.scala:193)
        at 
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:685)
        at 
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:572)
        at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:894)
        at 
org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:762)
        at 
org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:760)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at scala.util.Try$.apply(Try.scala:161)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
        at 
org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:811)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
[error] (run-main-0) java.io.IOException: Call to /10.20.32.204:50070 failed on 
local exception: java.io.EOFException
java.io.IOException: Call to /10.20.32.204:50070 failed on local exception: 
java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
        at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at 
org.apache.hadoop.mapred.SparkHadoopWriter$.createPathFromString(SparkHadoopWriter.scala:193)
        at 
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:685)
        at 
org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:572)
        at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:894)
        at 
org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:762)
        at 
org.apache.spark.streaming.dstream.DStream$$anonfun$8.apply(DStream.scala:760)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
        at scala.util.Try$.apply(Try.scala:161)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
        at 
org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:155)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:811)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
[trace] Stack trace suppressed: run last compile:run for the full output.





Thanks,
Saurabh
________________________________
Happiest Minds Disclaimer

This message is for the sole use of the intended recipient(s) and may contain 
confidential, proprietary or legally privileged information. Any unauthorized 
review, use, disclosure or distribution is prohibited. If you are not the 
original intended recipient of the message, please contact the sender by reply 
email and destroy all copies of the original message.

Happiest Minds Technologies <http://www.happiestminds.com>

________________________________

Help realted with spark streaming usage error

Reply via email to