Hi Guys,

Could anyone help me understand the logs below? Why is the result in the second log 0?

Thanks Guys

14/02/20 19:06:00 INFO JobScheduler: Finished job streaming job 1392919557000 ms.0 from job set of time 1392919557000 ms
14/02/20 19:06:00 INFO JobScheduler: Total delay: 3.185 s for time 1392919557000 ms (execution: 3.167 s)
14/02/20 19:06:00 INFO JobGenerator: Checkpointing graph for time 1392919557000 ms
14/02/20 19:06:00 INFO DStreamGraph: Updating checkpoint data for time 1392919557000 ms
14/02/20 19:06:00 INFO DStreamGraph: Updated checkpoint data for time 1392919557000 ms
14/02/20 19:06:00 INFO SparkContext: Starting job: first at NetworkWordCount.scala:87
14/02/20 19:06:00 INFO JobScheduler: Starting job streaming job 1392919558000 ms.0 from job set of time 1392919558000 ms
14/02/20 19:06:00 INFO CheckpointWriter: Saving checkpoint for time 1392919557000 ms to file 'hdfs://computer8:54310/user/root/INPUT/checkpoint-1392919557000'
14/02/20 19:06:00 INFO DAGScheduler: Registering RDD 812 (combineByKey at ShuffledDStream.scala:42)
14/02/20 19:06:00 INFO DAGScheduler: Got job 91 (first at NetworkWordCount.scala:87) with 1 output partitions (allowLocal=true)
14/02/20 19:06:00 INFO DAGScheduler: Final stage: Stage 181 (first at NetworkWordCount.scala:87)
14/02/20 19:06:00 INFO DAGScheduler: Parents of final stage: List(Stage 182)
14/02/20 19:06:00 INFO DAGScheduler: Missing parents: List(Stage 182)
14/02/20 19:06:00 INFO DAGScheduler: Submitting Stage 182 (MapPartitionsRDD[812] at combineByKey at ShuffledDStream.scala:42), which has no missing parents
14/02/20 19:06:00 INFO DAGScheduler: Submitting 2 missing tasks from Stage 182 (MapPartitionsRDD[812] at combineByKey at ShuffledDStream.scala:42)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Adding task set 182.0 with 2 tasks
14/02/20 19:06:00 INFO TaskSetManager: Starting task 182.0:1 as TID 609 on executor 0: computer1.ant-net (PROCESS_LOCAL)
14/02/20 19:06:00 INFO TaskSetManager: Serialized task 182.0:1 as 3023 bytes in 0 ms
14/02/20 19:06:00 INFO TaskSetManager: Starting task 182.0:0 as TID 610 on executor 0: computer1.ant-net (NODE_LOCAL)
14/02/20 19:06:00 INFO TaskSetManager: Serialized task 182.0:0 as 3485 bytes in 0 ms
14/02/20 19:06:00 INFO TaskSetManager: Finished TID 609 in 17 ms on computer1.ant-net (progress: 0/2)
14/02/20 19:06:00 INFO DAGScheduler: Completed ShuffleMapTask(182, 1)
14/02/20 19:06:00 INFO BlockManagerMasterActor$BlockManagerInfo: Added input-0-1392919527400 in memory on computer1.ant-net:41142 (size: 2018.6 KB, free: 387.3 MB)
14/02/20 19:06:00 INFO TaskSetManager: Finished TID 610 in 67 ms on computer1.ant-net (progress: 1/2)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Remove TaskSet 182.0 from pool
14/02/20 19:06:00 INFO DAGScheduler: Completed ShuffleMapTask(182, 0)
14/02/20 19:06:00 INFO DAGScheduler: Stage 182 (combineByKey at ShuffledDStream.scala:42) finished in 0.080 s
14/02/20 19:06:00 INFO DAGScheduler: looking for newly runnable stages
14/02/20 19:06:00 INFO DAGScheduler: running: Set(Stage 4)
14/02/20 19:06:00 INFO DAGScheduler: waiting: Set(Stage 181)
14/02/20 19:06:00 INFO DAGScheduler: failed: Set()
14/02/20 19:06:00 INFO CheckpointWriter: Deleting hdfs://computer8:54310/user/root/INPUT/checkpoint-1392919554000.bk
14/02/20 19:06:00 INFO DAGScheduler: Missing parents for Stage 181: List()
14/02/20 19:06:00 INFO DAGScheduler: Submitting Stage 181 (MappedRDD[815] at map at MappedDStream.scala:35), which is now runnable
14/02/20 19:06:00 INFO CheckpointWriter: Checkpoint for time 1392919557000 ms saved to file 'hdfs://computer8:54310/user/root/INPUT/checkpoint-1392919557000', took 3270 bytes and 102 ms
14/02/20 19:06:00 INFO DStreamGraph: Clearing checkpoint data for time 1392919557000 ms
14/02/20 19:06:00 INFO DStreamGraph: Cleared checkpoint data for time 1392919557000 ms
14/02/20 19:06:00 INFO DAGScheduler: Submitting 1 missing tasks from Stage 181 (MappedRDD[815] at map at MappedDStream.scala:35)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Adding task set 181.0 with 1 tasks
14/02/20 19:06:00 INFO TaskSetManager: Starting task 181.0:0 as TID 611 on executor 0: computer1.ant-net (PROCESS_LOCAL)
14/02/20 19:06:00 INFO TaskSetManager: Serialized task 181.0:0 as 2057 bytes in 1 ms
14/02/20 19:06:00 INFO MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 90 to sp...@computer1.ant-net:47226
14/02/20 19:06:00 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 90 is 146 bytes
14/02/20 19:06:00 INFO TaskSetManager: Finished TID 611 in 25 ms on computer1.ant-net (progress: 0/1)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Remove TaskSet 181.0 from pool
14/02/20 19:06:00 INFO DAGScheduler: Completed ResultTask(181, 0)
14/02/20 19:06:00 INFO DAGScheduler: Stage 181 (first at NetworkWordCount.scala:87) finished in 0.027 s
14/02/20 19:06:00 INFO SparkContext: Job finished: first at NetworkWordCount.scala:87, took 0.133625862 s
118967 (Total of words in a RDD)
#######################################################################################

14/02/20 19:06:00 INFO JobScheduler: Finished job streaming job 1392919558000 ms.0 from job set of time 1392919558000 ms
14/02/20 19:06:00 INFO JobGenerator: Checkpointing graph for time 1392919558000 ms
14/02/20 19:06:00 INFO DStreamGraph: Updating checkpoint data for time 1392919558000 ms
14/02/20 19:06:00 INFO DStreamGraph: Updated checkpoint data for time 1392919558000 ms
14/02/20 19:06:00 INFO SparkContext: Starting job: first at NetworkWordCount.scala:87
14/02/20 19:06:00 INFO CheckpointWriter: Saving checkpoint for time 1392919558000 ms to file 'hdfs://computer8:54310/user/root/INPUT/checkpoint-1392919558000'
14/02/20 19:06:00 INFO DAGScheduler: Registering RDD 821 (combineByKey at ShuffledDStream.scala:42)
14/02/20 19:06:00 INFO JobScheduler: Total delay: 2.322 s for time 1392919558000 ms (execution: 0.134 s)
14/02/20 19:06:00 INFO JobScheduler: Starting job streaming job 1392919559000 ms.0 from job set of time 1392919559000 ms
14/02/20 19:06:00 INFO DAGScheduler: Got job 92 (first at NetworkWordCount.scala:87) with 1 output partitions (allowLocal=true)
14/02/20 19:06:00 INFO DAGScheduler: Final stage: Stage 183 (first at NetworkWordCount.scala:87)
14/02/20 19:06:00 INFO DAGScheduler: Parents of final stage: List(Stage 184)
14/02/20 19:06:00 INFO DAGScheduler: Missing parents: List(Stage 184)
14/02/20 19:06:00 INFO DAGScheduler: Submitting Stage 184 (MapPartitionsRDD[821] at combineByKey at ShuffledDStream.scala:42), which has no missing parents
14/02/20 19:06:00 INFO DAGScheduler: Submitting 1 missing tasks from Stage 184 (MapPartitionsRDD[821] at combineByKey at ShuffledDStream.scala:42)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Adding task set 184.0 with 1 tasks
14/02/20 19:06:00 INFO TaskSetManager: Starting task 184.0:0 as TID 612 on executor 0: computer1.ant-net (PROCESS_LOCAL)
14/02/20 19:06:00 INFO TaskSetManager: Serialized task 184.0:0 as 3024 bytes in 1 ms
14/02/20 19:06:00 INFO TaskSetManager: Finished TID 612 in 17 ms on computer1.ant-net (progress: 0/1)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Remove TaskSet 184.0 from pool
14/02/20 19:06:00 INFO DAGScheduler: Completed ShuffleMapTask(184, 0)
14/02/20 19:06:00 INFO DAGScheduler: Stage 184 (combineByKey at ShuffledDStream.scala:42) finished in 0.018 s
14/02/20 19:06:00 INFO DAGScheduler: looking for newly runnable stages
14/02/20 19:06:00 INFO DAGScheduler: running: Set(Stage 4)
14/02/20 19:06:00 INFO DAGScheduler: waiting: Set(Stage 183)
14/02/20 19:06:00 INFO DAGScheduler: failed: Set()
14/02/20 19:06:00 INFO DAGScheduler: Missing parents for Stage 183: List()
14/02/20 19:06:00 INFO DAGScheduler: Submitting Stage 183 (MappedRDD[824] at map at MappedDStream.scala:35), which is now runnable
14/02/20 19:06:00 INFO DAGScheduler: Submitting 1 missing tasks from Stage 183 (MappedRDD[824] at map at MappedDStream.scala:35)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Adding task set 183.0 with 1 tasks
14/02/20 19:06:00 INFO TaskSetManager: Starting task 183.0:0 as TID 613 on executor 0: computer1.ant-net (PROCESS_LOCAL)
14/02/20 19:06:00 INFO TaskSetManager: Serialized task 183.0:0 as 2057 bytes in 1 ms
14/02/20 19:06:00 INFO MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 91 to sp...@computer1.ant-net:47226
14/02/20 19:06:00 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 91 is 137 bytes
14/02/20 19:06:00 INFO TaskSetManager: Finished TID 613 in 23 ms on computer1.ant-net (progress: 0/1)
14/02/20 19:06:00 INFO TaskSchedulerImpl: Remove TaskSet 183.0 from pool
14/02/20 19:06:00 INFO DAGScheduler: Completed ResultTask(183, 0)
14/02/20 19:06:00 INFO DAGScheduler: Stage 183 (first at NetworkWordCount.scala:87) finished in 0.026 s
14/02/20 19:06:00 INFO SparkContext: Job finished: first at NetworkWordCount.scala:87, took 0.072442522 s
0 (Total of words in a RDD)




