Re: Is there a way to get the current progress of the job?

2014-04-03 Thread Philip Ogren
This is great news, thanks for the update! I will either wait for the 1.0 release or go and test it ahead of time from git rather than trying to pull it out of JobLogger or create my own SparkListener. On 04/02/2014 06:48 PM, Andrew Or wrote: Hi Philip, In the upcoming release of Spark

Re: Is there a way to get the current progress of the job?

2014-04-03 Thread Philip Ogren
I can appreciate the reluctance to expose something like the JobProgressListener as a public interface. It's exactly the sort of thing that you want to deprecate as soon as something better comes along and can be a real pain when trying to maintain the level of backwards compatibility that

Re: Is there a way to get the current progress of the job?

2014-04-02 Thread Philip Ogren
What I'd like is a way to capture the information provided on the stages page (i.e. cluster:4040/stages via IndexPage). Looking through the Spark code, it doesn't seem like it is possible to directly query for specific facts such as how many tasks have succeeded or how many total tasks there

Re: Is there a way to get the current progress of the job?

2014-04-02 Thread Patrick Wendell
Hey Philip, Right now there is no mechanism for this. You have to go in through the low-level listener interface. We could consider exposing the JobProgressListener directly - I think it's been factored nicely so it's fairly decoupled from the UI. The concern is this is a semi-internal piece of

Re: Is there a way to get the current progress of the job?

2014-04-02 Thread Andrew Or
Hi Philip, In the upcoming release of Spark 1.0 there will be a feature that provides for exactly what you describe: capturing the information displayed on the UI in JSON. More details will be provided in the documentation, but for now, anything before 0.9.1 can only go through JobLogger.scala,

Re: Is there a way to get the current progress of the job?

2014-04-01 Thread Mark Hamstra
Some related discussion: https://github.com/apache/spark/pull/246 On Tue, Apr 1, 2014 at 8:43 AM, Philip Ogren philip.og...@oracle.com wrote: Hi DB, Just wondering if you ever got an answer to your question about monitoring progress - either offline or through your own investigation. Any

Re: Is there a way to get the current progress of the job?

2014-04-01 Thread Kevin Markey
The discussion there hits on the distinction between jobs and stages. When looking at one application, there are hundreds of stages, sometimes thousands, depending on the data and the task. The UI seems to track stages, and one could independently track them for such a job.

Re: Is there a way to get the current progress of the job?

2014-04-01 Thread Mayur Rustagi
You can get detailed information regarding each stage through the Spark listener interface. Multiple jobs may be compressed into a single stage, so jobwise information would be the same as Spark. Regards, Mayur
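
For readers following the thread, here is a minimal sketch of the listener-style bookkeeping discussed above: counting succeeded tasks against total tasks per stage. It is plain Python with no Spark dependency. Every name in it (ProgressTracker, StageProgress, on_stage_submitted, on_task_end) is hypothetical and does not correspond to Spark's actual listener API; it only assumes that a listener is told when a stage is submitted (with its task count) and when each task ends.

```python
from dataclasses import dataclass, field


@dataclass
class StageProgress:
    """Task counts for one stage (hypothetical structure, not a Spark class)."""
    total_tasks: int
    succeeded: int = 0
    failed: int = 0


@dataclass
class ProgressTracker:
    """Hypothetical listener that accumulates per-stage task counts."""
    stages: dict = field(default_factory=dict)

    def on_stage_submitted(self, stage_id: int, num_tasks: int) -> None:
        # Record how many tasks the stage is expected to run.
        self.stages[stage_id] = StageProgress(total_tasks=num_tasks)

    def on_task_end(self, stage_id: int, success: bool) -> None:
        # Update the counters of the stage the finished task belongs to.
        stage = self.stages[stage_id]
        if success:
            stage.succeeded += 1
        else:
            stage.failed += 1

    def summary(self) -> dict:
        # Map each stage id to (succeeded, total).
        return {sid: (s.succeeded, s.total_tasks) for sid, s in self.stages.items()}


# Simulated events: one stage with 4 tasks, 3 of which have finished so far.
tracker = ProgressTracker()
tracker.on_stage_submitted(stage_id=0, num_tasks=4)
for ok in (True, True, False):
    tracker.on_task_end(stage_id=0, success=ok)
print(tracker.summary())  # {0: (2, 4)}
```

In a real application the two callbacks would be driven by the scheduler's events rather than called by hand, and the summary could be polled from another thread to report progress.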