Hey Philip,

Right now there is no mechanism for this. You have to go in through the
low-level listener interface.

We could consider exposing the JobProgressListener directly. I think it's
been factored nicely, so it's fairly decoupled from the UI. The concern is
that this is a semi-internal piece of functionality whose API we might
want to change over time.
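
A rough sketch of going through the listener interface to count succeeded
and total tasks per active stage. It assumes the listener event classes as
they appear in Spark 1.0 and later (SparkListenerStageSubmitted with a
stageInfo field, SparkListenerTaskEnd with stageId and taskInfo); earlier
releases name these fields differently. StageProgressListener and snapshot
are made-up names for illustration, not part of Spark.

```scala
import scala.collection.mutable
import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted,
  SparkListenerStageSubmitted, SparkListenerTaskEnd}

// Hypothetical helper (not part of Spark): tracks task progress per active stage.
// Assumes the Spark 1.0+ listener event fields; adjust for other versions.
class StageProgressListener extends SparkListener {
  // stageId -> (succeeded tasks, total tasks)
  private val active = mutable.Map.empty[Int, (Int, Int)]

  override def onStageSubmitted(event: SparkListenerStageSubmitted): Unit = synchronized {
    val info = event.stageInfo
    active(info.stageId) = (0, info.numTasks)
  }

  override def onTaskEnd(event: SparkListenerTaskEnd): Unit = synchronized {
    val info = event.taskInfo
    // Only count tasks that finished successfully, and only for stages we know about.
    if (info != null && info.successful) {
      active.get(event.stageId).foreach { case (done, total) =>
        active(event.stageId) = (done + 1, total)
      }
    }
  }

  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = synchronized {
    // Drop the stage once it is no longer active.
    active -= event.stageInfo.stageId
  }

  // Immutable copy that is safe to read from another thread.
  def snapshot: Map[Int, (Int, Int)] = synchronized { active.toMap }
}
```

It would be registered with sc.addSparkListener(new StageProgressListener)
before running the job, and snapshot could then be polled from the thread
that reports progress to users.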

- Patrick


On Wed, Apr 2, 2014 at 3:39 PM, Philip Ogren <philip.og...@oracle.com> wrote:

> What I'd like is a way to capture the information provided on the stages
> page (i.e. cluster:4040/stages via IndexPage).  Looking through the Spark
> code, it doesn't seem like it is possible to directly query for specific
> facts such as how many tasks have succeeded or how many total tasks there
> are for a given active stage.  Instead, it looks like all the data for the
> page is generated at once using information from the JobProgressListener.
> It doesn't seem like I have any way to programmatically access this
> information myself.  I can't even instantiate my own JobProgressListener
> because it is spark package private.  I could implement my own SparkListener
> and gather up the information myself.  It feels a bit awkward since classes
> like Task and TaskInfo are also spark package private.  It does seem
> possible to gather up what I need but it seems like this sort of
> information should just be available without implementing a custom
> SparkListener (or worse, screen scraping the HTML generated by StageTable!)
>
> I was hoping that I would find the answer in MetricsServlet which is
> turned on by default.  It seems that when I visit
> http://cluster:4040/metrics/json/ I should be able to get everything I
> want but I don't see the basic stage/task progress information I would
> expect.  Are there special metrics properties that I should set to get this
> info?  I think this would be the best solution - just give it the right URL
> and parse the resulting JSON - but I can't seem to figure out how to do
> this or if it is possible.
>
> Any advice is appreciated.
>
> Thanks,
> Philip
>
>
>
> On 04/01/2014 09:43 AM, Philip Ogren wrote:
>
>> Hi DB,
>>
>> Just wondering if you ever got an answer to your question about
>> monitoring progress - either offline or through your own investigation.
>> Any findings would be appreciated.
>>
>> Thanks,
>> Philip
>>
>> On 01/30/2014 10:32 PM, DB Tsai wrote:
>>
>>> Hi guys,
>>>
>>> When we're running a very long job, we would like to show users the
>>> current progress of the map and reduce jobs. After looking at the API
>>> documentation, I can't find anything for this. However, in the Spark UI,
>>> I can see the progress of the tasks. Is there anything I'm missing?
>>>
>>> Thanks.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> Machine Learning Engineer
>>> Alpine Data Labs
>>> --------------------------------------
>>> Web: http://alpinenow.com/
>>>
>>
>>
>
