When I start a task on the master, I can see a CoarseGrainedExecutorBackend Java process running on the worker; does that tell us anything?
2013/12/17 Jie Deng <[email protected]>

> Hi Andrew,
>
> Thanks for helping! Sorry I did not make myself clear; here is the output
> from iptables (on both master and worker):
>
> jie@jie-OptiPlex-7010:~/spark$ sudo ufw status
> Status: inactive
> jie@jie-OptiPlex-7010:~/spark$ sudo iptables -L
> Chain INPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Chain FORWARD (policy ACCEPT)
> target     prot opt source               destination
>
> Chain OUTPUT (policy ACCEPT)
> target     prot opt source               destination
>
>
> 2013/12/17 Andrew Ash <[email protected]>
>
>> Hi Jie,
>>
>> When you say the firewall is closed, does that mean ports are blocked
>> between the worker nodes? I believe workers start up on a random port
>> and send data directly to each other during shuffles. Your firewall may
>> be blocking those connections. Can you try with the firewall temporarily
>> disabled?
>>
>> Andrew
>>
>>
>> On Mon, Dec 16, 2013 at 9:58 AM, Jie Deng <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Thanks for reading.
>>>
>>> I am trying to run a Spark program on a cluster. The program runs
>>> successfully in local mode. The standalone topology is working: I can
>>> see the worker from the master web UI, the master and worker are on
>>> different machines, and the worker's status is ALIVE.
>>>
>>> The problem is that whether I start the program from Eclipse or via
>>> ./run-example, it stalls at some point, with the stage page showing:
>>>
>>> Stage Id | Description                   | Submitted           | Duration | Tasks: Succeeded/Total | Shuffle Read | Shuffle Write
>>> 0        | count at SparkExample.java:31 <http://jie-optiplex-7010.local:4040/stages/stage?id=0> | 2013/12/16 14:50:36 | 7 m | 0/2 | |
>>>
>>> After a while, the worker's state becomes DEAD.
>>>
>>> The Spark directory on the worker was copied from the master via
>>> ./make-distribution, and the firewall is all closed.
>>>
>>> Has anyone had the same issue before?
>>>
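For reference, a minimal sketch of what the driver setup could look like for this kind of standalone cluster, assuming the 0.8-era Java API that the thread's dates suggest. The master URL, SPARK_HOME path, jar path, and port number below are placeholders, not values from the thread; pinning spark.driver.port is only one hedged way to make the driver reachable through a firewall, and the property name may differ in other Spark versions.

    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkExampleSketch {
        public static void main(String[] args) {
            // Pin the driver's listening port so a firewall rule can allow it
            // (by default Spark picks a random port). Assumption: property name
            // as in the 0.8-era configuration docs; adjust for your version.
            System.setProperty("spark.driver.port", "51000");

            // Connect to the standalone master (placeholder host), name the
            // application, and ship the application jar to the workers. When
            // launching from an IDE such as Eclipse, forgetting to pass the jar
            // is a common reason tasks never start on the executors.
            JavaSparkContext sc = new JavaSparkContext(
                    "spark://jie-optiplex-7010.local:7077",      // master URL (placeholder)
                    "SparkExample",                              // application name
                    "/home/jie/spark",                           // SPARK_HOME on the workers (placeholder)
                    new String[] { "target/spark-example.jar" }  // application jar (placeholder path)
            );

            // Trivial job, analogous to the count at SparkExample.java:31.
            long n = sc.parallelize(java.util.Arrays.asList(1, 2, 3, 4)).count();
            System.out.println("count = " + n);

            sc.stop();
        }
    }

Note that executors also connect back to the driver and to each other on ports that are chosen at runtime, so the quickest diagnostic is still the one Andrew suggests above: temporarily disable any firewall between the machines and see whether the stage completes.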
