Hi,

Is there anything relevant in the ApplicationMaster's log? What about the executors? You should have 2 (the default), so review the logs of each executor.
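
A minimal sketch of how those logs could be pulled and scanned, assuming YARN log aggregation is enabled and the application has already finished. `yarn logs -applicationId` is the standard YARN CLI call; the helper names `fetch_app_logs` and `scan_log`, the grep pattern, and the application ID in the usage note are illustrative assumptions, not something taken from this thread.

```shell
#!/usr/bin/env bash
# Sketch: collect the aggregated YARN container logs of one application
# and list the lines that are worth a closer look.

# fetch_app_logs APP_ID OUT_FILE
# Wraps `yarn logs -applicationId`, which prints the logs of all containers
# (the ApplicationMaster and every executor) of a finished application.
fetch_app_logs() {
  yarn logs -applicationId "$1" > "$2"
}

# scan_log FILE
# Prints, with line numbers, every line that mentions a warning, an error,
# an exception, an out-of-memory condition, or a killed process/container.
# The pattern is an assumption; adjust it to whatever you are hunting for.
scan_log() {
  grep -nE 'WARN|ERROR|Exception|OutOfMemory|[Kk]illed' "$1"
}

# Usage (the application ID below is a placeholder):
#   fetch_app_logs application_0000000000000_0000 app.log
#   scan_log app.log
```

The real application ID is printed by spark-submit and is also listed in the ResourceManager UI.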

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Tue, Jul 26, 2016 at 1:17 PM, Ascot Moss <ascot.m...@gmail.com> wrote:
> It is a YARN cluster:
>
> /bin/spark-submit \
> --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:+PrintGCTimeStamps -XX:+PrintGCDetails" \
> --driver-memory 64G \
> --executor-memory 16g \
>
>
> On Tue, Jul 26, 2016 at 7:00 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>
>> Hi,
>>
>> What's the cluster manager? Is this YARN perhaps? Do you have any
>> other apps on the cluster? How do you submit your app? What are the
>> properties?
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Tue, Jul 26, 2016 at 1:27 AM, Ascot Moss <ascot.m...@gmail.com> wrote:
>> > Hi,
>> >
>> > spark: 1.6.1
>> > java: java 1.8_u40
>> >
>> > I tried the random forest training phase. The same code works well with
>> > 20 trees (lower accuracy, about 68%). When I tried the training phase
>> > with more trees (I set it to 200), it returned:
>> >
>> > "DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651,
>> > took 19.556700 s Killed". There is no WARN or ERROR on the console; the
>> > task just stops at the end.
>> >
>> > Any idea how to resolve it? Should the timeout parameter be set longer?
>> >
>> > regards
>> >
>> >
>> > (below is the log from console)
>> >
>> > 16/07/26 00:02:47 INFO DAGScheduler: looking for newly runnable stages
>> > 16/07/26 00:02:47 INFO DAGScheduler: running: Set()
>> > 16/07/26 00:02:47 INFO DAGScheduler: waiting: Set(ResultStage 32)
>> > 16/07/26 00:02:47 INFO DAGScheduler: failed: Set()
>> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642), which has no missing parents
>> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48 stored as values in memory (estimated size 2.2 MB, free 18.2 MB)
>> > 16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48_piece0 stored as bytes in memory (estimated size 436.9 KB, free 18.7 MB)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:35450 (size: 436.9 KB, free: 45.8 GB)
>> > 16/07/26 00:02:47 INFO SparkContext: Created broadcast 48 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:02:47 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642)
>> > 16/07/26 00:02:47 INFO TaskSchedulerImpl: Adding task set 32.0 with 4 tasks
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 0.0 in stage 32.0 (TID 185, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 1.0 in stage 32.0 (TID 186, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 2.0 in stage 32.0 (TID 187, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO TaskSetManager: Starting task 3.0 in stage 32.0 (TID 188, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:58784 (size: 436.9 KB, free: 5.1 GB)
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:44434
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 12 is 180 bytes
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:46186 (size: 436.9 KB, free: 2.2 GB)
>> > 16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:50132 (size: 436.9 KB, free: 5.0 GB)
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:47272
>> > 16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:46802
>> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 2.0 in stage 32.0 (TID 187) in 2265 ms on x.x.x.x (1/4)
>> > 16/07/26 00:02:49 INFO TaskSetManager: Finished task 1.0 in stage 32.0 (TID 186) in 2266 ms on x.x.x.x (2/4)
>> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 0.0 in stage 32.0 (TID 185) in 2794 ms on x.x.x.x (3/4)
>> > 16/07/26 00:02:50 INFO TaskSetManager: Finished task 3.0 in stage 32.0 (TID 188) in 3738 ms on x.x.x.x (4/4)
>> > 16/07/26 00:02:50 INFO TaskSchedulerImpl: Removed TaskSet 32.0, whose tasks have all completed, from pool
>> > 16/07/26 00:02:50 INFO DAGScheduler: ResultStage 32 (collectAsMap at DecisionTree.scala:651) finished in 3.738 s
>> > 16/07/26 00:02:50 INFO DAGScheduler: Job 19 finished: collectAsMap at DecisionTree.scala:651, took 19.493917 s
>> > 16/07/26 00:02:51 INFO MemoryStore: Block broadcast_49 stored as values in memory (estimated size 1053.9 KB, free 19.7 MB)
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_49_piece0 stored as bytes in memory (estimated size 626.7 KB, free 20.3 MB)
>> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:35450 (size: 626.7 KB, free: 45.8 GB)
>> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 49 from broadcast at DecisionTree.scala:601
>> > 16/07/26 00:02:52 INFO SparkContext: Starting job: collectAsMap at DecisionTree.scala:651
>> > 16/07/26 00:02:52 INFO DAGScheduler: Registering RDD 76 (mapPartitions at DecisionTree.scala:622)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Got job 20 (collectAsMap at DecisionTree.scala:651) with 4 output partitions
>> > 16/07/26 00:02:52 INFO DAGScheduler: Final stage: ResultStage 34 (collectAsMap at DecisionTree.scala:651)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 33)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 33)
>> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622), which has no missing parents
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50 stored as values in memory (estimated size 10.0 MB, free 30.3 MB)
>> > 16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50_piece0 stored as bytes in memory (estimated size 2.9 MB, free 33.2 MB)
>> > 16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:35450 (size: 2.9 MB, free: 45.8 GB)
>> > 16/07/26 00:02:52 INFO SparkContext: Created broadcast 50 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:02:52 INFO DAGScheduler: Submitting 4 missing tasks from ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622)
>> > 16/07/26 00:02:52 INFO TaskSchedulerImpl: Adding task set 33.0 with 4 tasks
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 1.0 in stage 33.0 (TID 189, x.x.x.x, partition 1,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 0.0 in stage 33.0 (TID 190, x.x.x.x, partition 0,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 2.0 in stage 33.0 (TID 191, x.x.x.x, partition 2,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:52 INFO TaskSetManager: Starting task 3.0 in stage 33.0 (TID 192, x.x.x.x, partition 3,PROCESS_LOCAL, 2333 bytes)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:58784 (size: 2.9 MB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:58784 (size: 626.7 KB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:46186 (size: 2.9 MB, free: 2.2 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:50132 (size: 2.9 MB, free: 5.0 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:46186 (size: 626.7 KB, free: 2.2 GB)
>> > 16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:50132 (size: 626.7 KB, free: 5.0 GB)
>> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 0.0 in stage 33.0 (TID 190) in 4212 ms on x.x.x.x (1/4)
>> > 16/07/26 00:02:57 INFO TaskSetManager: Finished task 1.0 in stage 33.0 (TID 189) in 4989 ms on x.x.x.x (2/4)
>> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 2.0 in stage 33.0 (TID 191) in 14934 ms on x.x.x.x (3/4)
>> > 16/07/26 00:03:07 INFO TaskSetManager: Finished task 3.0 in stage 33.0 (TID 192) in 15172 ms on x.x.x.x (4/4)
>> > 16/07/26 00:03:07 INFO TaskSchedulerImpl: Removed TaskSet 33.0, whose tasks have all completed, from pool
>> > 16/07/26 00:03:07 INFO DAGScheduler: ShuffleMapStage 33 (mapPartitions at DecisionTree.scala:622) finished in 15.173 s
>> > 16/07/26 00:03:07 INFO DAGScheduler: looking for newly runnable stages
>> > 16/07/26 00:03:07 INFO DAGScheduler: running: Set()
>> > 16/07/26 00:03:07 INFO DAGScheduler: waiting: Set(ResultStage 34)
>> > 16/07/26 00:03:07 INFO DAGScheduler: failed: Set()
>> > 16/07/26 00:03:07 INFO DAGScheduler: Submitting ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642), which has no missing parents
>> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51 stored as values in memory (estimated size 2.2 MB, free 35.4 MB)
>> > 16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51_piece0 stored as bytes in memory (estimated size 444.7 KB, free 35.8 MB)
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:35450 (size: 444.7 KB, free: 45.8 GB)
>> > 16/07/26 00:03:08 INFO SparkContext: Created broadcast 51 from broadcast at DAGScheduler.scala:1006
>> > 16/07/26 00:03:08 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642)
>> > 16/07/26 00:03:08 INFO TaskSchedulerImpl: Adding task set 34.0 with 4 tasks
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 0.0 in stage 34.0 (TID 193, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 1.0 in stage 34.0 (TID 194, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 2.0 in stage 34.0 (TID 195, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO TaskSetManager: Starting task 3.0 in stage 34.0 (TID 196, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:58784 (size: 444.7 KB, free: 5.0 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:44434
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 13 is 180 bytes
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:46186 (size: 444.7 KB, free: 2.2 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:47272
>> > 16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:50132 (size: 444.7 KB, free: 5.0 GB)
>> > 16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:46802
>> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 1.0 in stage 34.0 (TID 194) in 2240 ms on x.x.x.x (1/4)
>> > 16/07/26 00:03:10 INFO TaskSetManager: Finished task 0.0 in stage 34.0 (TID 193) in 2749 ms on x.x.x.x (2/4)
>> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 2.0 in stage 34.0 (TID 195) in 3818 ms on x.x.x.x (3/4)
>> > 16/07/26 00:03:11 INFO TaskSetManager: Finished task 3.0 in stage 34.0 (TID 196) in 3901 ms on x.x.x.x (4/4)
>> > 16/07/26 00:03:11 INFO DAGScheduler: ResultStage 34 (collectAsMap at DecisionTree.scala:651) finished in 3.902 s
>> > 16/07/26 00:03:11 INFO TaskSchedulerImpl: Removed TaskSet 34.0, whose tasks have all completed, from pool
>> > 16/07/26 00:03:11 INFO DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s
>> >
>> > Killed