[
https://issues.apache.org/jira/browse/SPARK-8557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-8557.
---------------------------------
Resolution: Incomplete
> Successful Jobs marked as KILLED Spark 1.4 Standalone
> -----------------------------------------------------
>
> Key: SPARK-8557
> URL: https://issues.apache.org/jira/browse/SPARK-8557
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, Web UI
> Environment: Spark Standalone 1.4.0 vs Spark Standalone 1.3.1
> Reporter: Demi Ben-Ari
> Priority: Major
> Labels: bulk-closed
>
> We have two cluster installations, one with Spark 1.3.1 and a new one with
> Spark 1.4.0 (both are standalone cluster installations).
> The original problem:
> We ran a job (a Spark Java application) on the new 1.4.0 cluster, and the
> same job on the old 1.3.1 cluster.
> After the job finished (on both clusters), we opened the job's page in the
> Web UI. On the new 1.4.0 cluster, the workers are marked as KILLED
> (I didn't kill them, and everywhere I checked, the logs and output seem
> fine), while the job itself is marked as "FINISHED":
> 2 worker-20150613111158-172.31.0.104-37240 4 10240 KILLED stdout stderr
> 1 worker-20150613111158-172.31.15.149-58710 4 10240 KILLED stdout stderr
> 3 worker-20150613111158-172.31.0.196-52939 4 10240 KILLED stdout stderr
> 0 worker-20150613111158-172.31.1.233-53467 4 10240 KILLED stdout stderr
> In the old 1.3.1 cluster:
> =============================
> The workers are marked as EXITED:
> 1 worker-20150608115639-ip-172-31-6-134.us-west-2.compute.internal-47572 2
> 10240 EXITED stdout stderr
> 0 worker-20150608115639-ip-172-31-4-169.us-west-2.compute.internal-41828 2
> 10240 EXITED stdout stderr
> 2 worker-20150608115640-ip-172-31-0-37.us-west-2.compute.internal-32847 1
> 10240 EXITED stdout stderr
> Another manifestation of the problem:
> We ran an application on a one-worker cluster (running 1.4.0). On the
> application page it's marked as KILLED, while on the worker it's marked as
> EXITED. When running it on 1.3.1, everything is fine and marked as EXITED.
> An attempt to reproduce the problem in spark-shell:
> =======================================
> We ran the following on both servers:
> [root@ip-172-31-6-108 ~]$ spark/bin/spark-shell --total-executor-cores 1
> scala> val text = sc.textFile("hdfs:///some-file.txt");
> scala> text.count()
> Here we get the correct output on both servers.
> At this stage, checking the Spark UI, both are marked as RUNNING.
> Now we exit the spark-shell (using Ctrl+D). Checking the Spark UI again,
> the job on 1.3.1 is marked as EXITED, while the job on 1.4.0 is marked as
> KILLED.
> Thanks,
> Nizan & Demi
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)