Hey Philip,

Right now there is no mechanism for this. You have to go in through the low-level listener interface.
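
Roughly, the bookkeeping such a listener would need looks like the sketch below. This is plain Python to show the idea, not Spark code: `StageProgress`, `StageCounts`, and the `on_*` method names are hypothetical stand-ins for the SparkListener callbacks, and how you pull the stage id, task count, and success flag out of the real listener events is an assumption that depends on the Spark version.

```python
from dataclasses import dataclass


@dataclass
class StageCounts:
    total_tasks: int      # number of tasks the stage was submitted with
    succeeded: int = 0    # tasks that ended successfully
    failed: int = 0       # tasks that ended for any other reason


class StageProgress:
    """Tracks task counts for active stages (hypothetical helper, not Spark API)."""

    def __init__(self):
        self.active = {}  # stage id -> StageCounts

    def on_stage_submitted(self, stage_id, num_tasks):
        # A stage became active: start counting from zero.
        self.active[stage_id] = StageCounts(total_tasks=num_tasks)

    def on_task_end(self, stage_id, success):
        counts = self.active.get(stage_id)
        if counts is None:
            return  # task from a stage we never saw submitted; ignore it
        if success:
            counts.succeeded += 1
        else:
            counts.failed += 1

    def on_stage_completed(self, stage_id):
        # The stage is no longer active, so stop reporting it.
        self.active.pop(stage_id, None)

    def snapshot(self):
        """Return {stage_id: (succeeded, total_tasks)} for active stages."""
        return {sid: (c.succeeded, c.total_tasks) for sid, c in self.active.items()}


progress = StageProgress()
progress.on_stage_submitted(stage_id=0, num_tasks=4)
progress.on_task_end(stage_id=0, success=True)
progress.on_task_end(stage_id=0, success=False)
print(progress.snapshot())  # {0: (1, 4)}
```

If the snapshot is read from a different thread than the one delivering events, the dictionary would need a lock around it; the sketch leaves that out.
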
We could consider exposing the JobProgressListener directly - I think it's been factored nicely so it's fairly decoupled from the UI. The concern is that this is a semi-internal piece of functionality and something whose API we might want to change over time.

- Patrick

On Wed, Apr 2, 2014 at 3:39 PM, Philip Ogren <philip.og...@oracle.com> wrote:

> What I'd like is a way to capture the information provided on the stages
> page (i.e. cluster:4040/stages via IndexPage). Looking through the Spark
> code, it doesn't seem like it is possible to directly query for specific
> facts such as how many tasks have succeeded or how many total tasks there
> are for a given active stage. Instead, it looks like all the data for the
> page is generated at once using information from the JobProgressListener.
> It doesn't seem like I have any way to programmatically access this
> information myself. I can't even instantiate my own JobProgressListener
> because it is spark package private. I could implement my own SparkListener
> and gather up the information myself. It feels a bit awkward since classes
> like Task and TaskInfo are also spark package private. It does seem
> possible to gather up what I need, but it seems like this sort of
> information should just be available without implementing a custom
> SparkListener (or worse, screen scraping the HTML generated by StageTable!)
>
> I was hoping that I would find the answer in MetricsServlet, which is
> turned on by default. It seems that when I visit
> http://cluster:4040/metrics/json/ I should be able to get everything I
> want, but I don't see the basic stage/task progress information I would
> expect. Are there special metrics properties that I should set to get this
> info? I think this would be the best solution - just give it the right URL
> and parse the resulting JSON - but I can't seem to figure out how to do
> this or if it is possible.
>
> Any advice is appreciated.
>
> Thanks,
> Philip
>
> On 04/01/2014 09:43 AM, Philip Ogren wrote:
>
>> Hi DB,
>>
>> Just wondering if you ever got an answer to your question about
>> monitoring progress - either offline or through your own investigation.
>> Any findings would be appreciated.
>>
>> Thanks,
>> Philip
>>
>> On 01/30/2014 10:32 PM, DB Tsai wrote:
>>
>>> Hi guys,
>>>
>>> When we're running a very long job, we would like to show users the
>>> current progress of map and reduce job. After looking at the api document,
>>> I don't find anything for this. However, in Spark UI, I could see the
>>> progress of the task. Is there anything I miss?
>>>
>>> Thanks.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> Machine Learning Engineer
>>> Alpine Data Labs
>>> --------------------------------------
>>> Web: http://alpinenow.com/