[ 
https://issues.apache.org/jira/browse/SPARK-16676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389726#comment-15389726
 ] 

Joe Chong commented on SPARK-16676:
-----------------------------------

It didn't. How do I troubleshoot this? As the attached picture shows, the stage
triggers the job, but the job stayed in Pending until I had to kill it from the
Spark UI.

> Spark jobs stay in pending
> --------------------------
>
>                 Key: SPARK-16676
>                 URL: https://issues.apache.org/jira/browse/SPARK-16676
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, Spark Shell
>    Affects Versions: 1.5.2
>         Environment: Mac OS X Yosemite, Terminal, Spark-shell standalone
>            Reporter: Joe Chong
>         Attachments: Spark UI stays @ pending.png
>
>
> I've been having issues executing certain Scala statements within the
> Spark-Shell. These statements come from a tutorial/blog post written by
> Carol McDonald at MapR.
> The import statements and reading text files into DataFrames work fine.
> However, when I try to run df.show(), the execution hits a roadblock. Checking
> the job in the Spark UI, I see that the stage is active, but one of its
> dependent jobs stays in Pending without any progress. The logs are below.
> scala> fltCountsql.show()
> 16/07/22 11:40:16 INFO spark.SparkContext: Starting job: show at <console>:46
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Registering RDD 31 (show at 
> <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Got job 4 (show at 
> <console>:46) with 200 output partitions
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Final stage: ResultStage 
> 8(show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Parents of final stage: 
> List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Missing parents: 
> List(ShuffleMapStage 7)
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 7 
> (MapPartitionsRDD[31] at show at <console>:46), which has no missing parents
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(18128) called 
> with curMem=115755879, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5 stored as 
> values in memory (estimated size 17.7 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.MemoryStore: ensureFreeSpace(7527) called with 
> curMem=115774007, maxMem=2778495713
> 16/07/22 11:40:16 INFO storage.MemoryStore: Block broadcast_5_piece0 stored 
> as bytes in memory (estimated size 7.4 KB, free 2.5 GB)
> 16/07/22 11:40:16 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in 
> memory on localhost:61408 (size: 7.4 KB, free: 2.5 GB)
> 16/07/22 11:40:16 INFO spark.SparkContext: Created broadcast 5 from broadcast 
> at DAGScheduler.scala:861
> 16/07/22 11:40:16 INFO scheduler.DAGScheduler: Submitting 2 missing tasks 
> from ShuffleMapStage 7 (MapPartitionsRDD[31] at show at <console>:46)
> 16/07/22 11:40:16 INFO scheduler.TaskSchedulerImpl: Adding task set 7.0 with 
> 2 tasks
> 16/07/22 11:40:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
> 7.0 (TID 4, localhost, PROCESS_LOCAL, 2156 bytes)
> 16/07/22 11:40:16 INFO executor.Executor: Running task 0.0 in stage 7.0 (TID 
> 4)
> 16/07/22 11:40:16 INFO storage.BlockManager: Found block rdd_2_0 locally
> 16/07/22 11:40:17 INFO executor.Executor: Finished task 0.0 in stage 7.0 (TID 
> 4). 2738 bytes result sent to driver
> 16/07/22 11:40:17 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
> 7.0 (TID 4) in 920 ms on localhost (1/2)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

