Hey,

I've been struggling to set up a workflow with Spark. I'm basically using the AMI from the amplab3 tutorials, with a couple of extra packages added for R, rJava, and some of my own jars; essentially Spark 0.7.3 standalone. (I can't get Mesos running, but that's a question for another time.)

I read data from S3 and run a cascade of filters, maps, joins, and reduces on it. If I run the job on a smallish data set (<1,000 rows) it succeeds, but on a data set of >1.5M rows I keep getting the following error when I collect the RDD:
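For reference, here's the rough shape of the job, sketched on plain Scala collections (all keys, values, and data are made up for illustration); the real job uses the same operator names on RDDs, reading the input with sc.textFile("s3n://...") and fetching the result with collect():

```scala
// Pretend these lines were read from S3 (made-up data).
val lines = Seq("a,1", "b,2", "a,3", "c,-1")

// map + filter: parse "key,value" rows, drop negative values
val parsed = lines
  .map { line => val parts = line.split(","); (parts(0), parts(1).toInt) }
  .filter { case (_, v) => v >= 0 }

// join against a second keyed data set (an inner join, done manually here;
// on pair RDDs this would be parsed.join(weights))
val weights = Map("a" -> 10, "b" -> 100)
val joined = parsed.flatMap { case (k, v) => weights.get(k).map(w => (k, v * w)) }

// reduce per key (reduceByKey on an RDD), then collect the result locally
val totals = joined.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }

println(totals)  // Map(a -> 40, b -> 200); in Spark: rdd.collect()
```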

13/09/21 00:41:45 INFO master.Master: Removing app app-20130921004115-0000
13/09/21 00:41:45 ERROR actor.ActorSystemImpl: RemoteClientError@akka://[email protected]:44283: Error[java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:404)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:366)
at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
]

I'm at a loss on where to start debugging: is it a configuration issue on my part, a Scala error, or a Spark error? I've attached the log files from the master and the worker. If anyone has ideas on how to start debugging, I'd be very appreciative.

Thanks,
Shay

Attachment: spark-root-spark.deploy.master.Master-1-ip-10-232-35-179.ec2.internal.out
Description: Binary data

Attachment: spark-root-spark.deploy.worker.Worker-1-ip-10-168-42-45.ec2.internal.out
Description: Binary data

Attachment: stderr
Description: Binary data
