?
Ningjun
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, February 23, 2016 2:30 PM
To: Kevin Mellott
Cc: Wang, Ningjun (LNG-NPV); user@spark.apache.org
Subject: Re: How to get progress information of an RDD operation
I think Ningjun was looking for a programmatic way of tracking progress.
I took a look at:
./core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala
but there don't seem to be any fine-grained events that directly reflect
what Ningjun is looking for.
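For reference, a minimal sketch of what the listener route could look like, assuming task-level (per-partition) progress is good enough. TaskProgressListener is a hypothetical name, not something that ships with Spark; only the SparkListener callbacks and sc.addSparkListener are Spark's own. Note that onTaskEnd also fires for failed tasks, so the count is approximate.

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.scheduler.{SparkListener, SparkListenerStageSubmitted, SparkListenerTaskEnd}

// Hypothetical helper (not part of Spark): counts finished tasks against the
// number of tasks announced when each stage is submitted.
class TaskProgressListener extends SparkListener {
  private val totalTasks = new AtomicInteger(0)
  private val finishedTasks = new AtomicInteger(0)

  // Called on the driver when a stage is submitted; numTasks is the number of
  // tasks (one per partition) the stage will run.
  override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = {
    totalTasks.addAndGet(stageSubmitted.stageInfo.numTasks)
  }

  // Called on the driver each time a task ends, whether it succeeded or failed.
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val done = finishedTasks.incrementAndGet()
    println(s"Tasks finished: $done of ${totalTasks.get}")
  }
}

// Usage, assuming an existing SparkContext named sc:
//   sc.addSparkListener(new TaskProgressListener)
```

This reports progress per task rather than per line, so how often it prints depends on how many partitions the RDD has.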
On Tue, Feb 23, 2016 at 11:24 AM, Kevin Me
Have you considered using the Spark Web UI to view progress on your job? It
does a very good job of showing the progress of the overall job, and it also
lets you drill into the individual tasks and server activity.
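If the task counts shown in the Web UI are enough, similar numbers can be read from the driver through sc.statusTracker. The following is only a sketch: ProgressPolling and printActiveStageProgress are hypothetical names, and it assumes something else in the driver (for example a background thread) calls the method while the job is running, since the action itself blocks the calling thread.

```scala
import org.apache.spark.SparkContext

object ProgressPolling {
  // Hypothetical helper: prints completed vs. total task counts for every
  // stage that is currently active on the given SparkContext.
  def printActiveStageProgress(sc: SparkContext): Unit = {
    val tracker = sc.statusTracker
    for {
      stageId <- tracker.getActiveStageIds()
      // getStageInfo returns None if the stage info is no longer available.
      info <- tracker.getStageInfo(stageId)
    } println(s"Stage $stageId: ${info.numCompletedTasks()} of ${info.numTasks()} tasks completed")
  }
}
```

As with the Web UI, the granularity here is tasks, not individual lines.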
On Tue, Feb 23, 2016 at 12:53 PM, Wang, Ningjun (LNG-NPV) <
ningjun.w...@lexisnexi
How can I get progress information of an RDD operation? For example:
val lines = sc.textFile("c:/temp/input.txt") // an RDD of millions of lines
lines.foreach(line => {
  handleLine(line)
})
The input.txt contains millions of lines. The entire operation takes 6 hours. I
want to print out h