[ 
https://issues.apache.org/jira/browse/SPARK-8557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-8557.
---------------------------------
    Resolution: Incomplete

> Successful Jobs marked as KILLED Spark 1.4 Standalone
> -----------------------------------------------------
>
>                 Key: SPARK-8557
>                 URL: https://issues.apache.org/jira/browse/SPARK-8557
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Web UI
>         Environment: Spark Standalone 1.4.0 vs. Spark Standalone 1.3.1
>            Reporter: Demi Ben-Ari
>            Priority: Major
>              Labels: bulk-closed
>
> We have two cluster installations, one with Spark 1.3.1 and a new one with 
> Spark 1.4.0 (both are standalone cluster installations).
> The original problem:
> We ran a job (a Spark Java application) on the new 1.4.0 cluster, and the 
> same job on the old 1.3.1 cluster.
> After the job finished (on both clusters), we opened the job's page in the 
> web UI. On the new 1.4.0 cluster, the workers are marked as KILLED (I didn't 
> kill them, and everywhere I checked, the logs and output look fine), while 
> the job itself is marked as "FINISHED": 
> 2 worker-20150613111158-172.31.0.104-37240 4 10240 KILLED stdout stderr 
> 1 worker-20150613111158-172.31.15.149-58710 4 10240 KILLED stdout stderr 
> 3 worker-20150613111158-172.31.0.196-52939 4 10240 KILLED stdout stderr 
> 0 worker-20150613111158-172.31.1.233-53467 4 10240 KILLED stdout stderr 
> In the old 1.3.1 cluster:
> =============================
> the workers are marked as EXITED: 
> 1 worker-20150608115639-ip-172-31-6-134.us-west-2.compute.internal-47572 2 
> 10240 EXITED stdout stderr 
> 0 worker-20150608115639-ip-172-31-4-169.us-west-2.compute.internal-41828 2 
> 10240 EXITED stdout stderr 
> 2 worker-20150608115640-ip-172-31-0-37.us-west-2.compute.internal-32847 1 
> 10240 EXITED stdout stderr 
> Another manifestation of the problem:
> We ran an application on a one-worker cluster (on 1.4.0). On the application 
> page it's marked as KILLED, and on the worker page it's marked as EXITED. When 
> running it on 1.3.1, everything is fine and marked as EXITED.
> An attempt to reproduce the problem in spark-shell:
> =======================================
> We ran the following on both servers:
> [root@ip-172-31-6-108 ~]$ spark/bin/spark-shell --total-executor-cores 1
> scala> val text = sc.textFile("hdfs:///some-file.txt"); 
> scala> text.count()
> -- here I get the correct output on both servers.
> At this stage, checking the Spark UI, both are marked as RUNNING.
> Now we exit the spark-shell (using Ctrl+D). If I check the Spark UI now, the 
> job on 1.3.1 is marked as EXITED, while the job on 1.4.0 is marked as KILLED.
> Thanks,
> Nizan & Demi
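
A note for anyone triaging this: the state the standalone Master reports can also be read independently of the web pages, via its JSON endpoint (http://<master>:8080/json). A minimal sketch of comparing reported states, assuming a 1.x-style payload with a `completedapps` array carrying `id` and `state` fields (the sample payload below is fabricated for illustration, not captured from the clusters above):

```python
import json

# Fabricated sample of the Master's /json payload (fields trimmed; the real
# response at http://<master>:8080/json carries more keys per application).
sample = json.loads("""
{
  "completedapps": [
    {"id": "app-20150613111200-0000", "name": "spark-shell", "state": "FINISHED"}
  ]
}
""")

def finished_app_states(master_json):
    """Map each completed application's id to the state the Master reports."""
    return {app["id"]: app["state"] for app in master_json.get("completedapps", [])}

print(finished_app_states(sample))
# {'app-20150613111200-0000': 'FINISHED'}
```

Comparing this output between the 1.3.1 and 1.4.0 Masters would show whether only the UI labeling changed or the recorded executor/application state itself differs.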



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
