Don't bother any more, I solved it: the problem went away after
switching to spark-0.9 instead of 0.8, because 0.9 fixes a bug that
prevented running from eclipse.
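For anyone hitting the same thing: the rebuild for 0.9 is the same as
for 0.8, just from the 0.9 sources (the commands below are the standard
build steps, assuming a checkout of the 0.9 code):

  sbt/sbt assembly         # build the spark assembly
  ./make-distribution.sh   # package the distribution to copy to workers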


2013/12/17 Jie Deng <[email protected]>

> When I start a task on the master, I can see there is a
> CoarseGrainedExecutorBackend java process running on the worker; does
> that tell us anything?
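>
> (A quick way to double-check what is running there is jps on the
> worker; the executor shows up by its class name, e.g.:
>
>   jie@jie-OptiPlex-7010:~/spark$ jps -l
>   12345 org.apache.spark.executor.CoarseGrainedExecutorBackend
>
> where the PID above is just illustrative.)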
>
>
> 2013/12/17 Jie Deng <[email protected]>
>
>> Hi Andrew,
>>
>> Thanks for helping!
>> Sorry, I did not make myself clear; here is the output from iptables
>> (the same on both master and worker):
>>
>> jie@jie-OptiPlex-7010:~/spark$ sudo ufw status
>> Status: inactive
>> jie@jie-OptiPlex-7010:~/spark$ sudo iptables -L
>> Chain INPUT (policy ACCEPT)
>> target     prot opt source               destination
>>
>> Chain FORWARD (policy ACCEPT)
>> target     prot opt source               destination
>>
>> Chain OUTPUT (policy ACCEPT)
>> target     prot opt source               destination
>>
>>
>>
>>
>> 2013/12/17 Andrew Ash <[email protected]>
>>
>>> Hi Jie,
>>>
>>> When you say the firewall is closed, does that mean ports are blocked
>>> between the worker nodes?  I believe workers start up on a random port
>>> and send data directly between each other during shuffles.  Your
>>> firewall may be blocking those connections.  Can you try with the
>>> firewall temporarily disabled?
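>>>
>>> On Ubuntu something like this should rule the firewall out (assuming
>>> ufw is the frontend in use):
>>>
>>>   sudo ufw disable   # temporarily switch the firewall off
>>>   sudo iptables -L   # every chain should report "policy ACCEPT"
>>>
>>> and "sudo ufw enable" turns it back on afterwards.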
>>>
>>> Andrew
>>>
>>>
>>> On Mon, Dec 16, 2013 at 9:58 AM, Jie Deng <[email protected]> wrote:
>>>
>>>> Hi,
>>>> Thanks for reading,
>>>>
>>>> I am trying to run a spark program on a cluster. The program runs
>>>> successfully in local mode. The standalone topology is working: I can
>>>> see the worker from the master webUI; master and worker are different
>>>> machines, and the worker status is ALIVE. The thing is, whether I
>>>> start the program from eclipse or with ./run-example, it stops at
>>>> some point like:
>>>> Stage Id: 0
>>>> Description: count at SparkExample.java:31
>>>>   <http://jie-optiplex-7010.local:4040/stages/stage?id=0>
>>>> Submitted: 2013/12/16 14:50:36
>>>> Duration: 7 m
>>>> Tasks: Succeeded/Total: 0/2
>>>> Shuffle Read: -    Shuffle Write: -
>>>> And after a while, the worker's state becomes DEAD.
>>>>
>>>> The spark directory on the worker was copied from the master with
>>>> ./make-distribution, and the firewall is completely disabled.
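>>>>
>>>> For reference, the driver is set up roughly like this (a sketch; the
>>>> master URL and jar name are placeholders for my actual values):
>>>>
>>>>   // SparkExample.java -- minimal sketch of the driver setup
>>>>   import org.apache.spark.api.java.JavaSparkContext;
>>>>
>>>>   public class SparkExample {
>>>>       public static void main(String[] args) {
>>>>           JavaSparkContext sc = new JavaSparkContext(
>>>>               "spark://jie-optiplex-7010.local:7077", // standalone master
>>>>               "SparkExample",                         // application name
>>>>               "/home/jie/spark",                      // spark home on the nodes
>>>>               "spark-example.jar");                   // jar shipped to executors
>>>>           // the stage that stalls is the count on line 31
>>>>           long n = sc.parallelize(java.util.Arrays.asList(1, 2, 3, 4)).count();
>>>>           System.out.println("count = " + n);
>>>>           sc.stop();
>>>>       }
>>>>   }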
>>>>
>>>> Has anyone had the same issue before?
>>>>
>>>
>>>
>>
>
