Github user squito commented on the issue:
https://github.com/apache/spark/pull/21923
> Are there more specific use cases? I always feel it'd be impossible to
> design APIs without seeing couple different use cases.
With this basic api, you could just do things that tie into the JVM in
general. For example, you can inspect memory or get thread dumps.
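As a rough illustration of the kind of JVM inspection meant here, the sketch below reads heap usage and captures a thread dump through the standard `java.lang.management` beans. The class name `JvmInspector` and its method names are hypothetical and are not part of this PR; how such code would be hooked into the plugin api is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not part of the PR): inspects the JVM it runs in.
class JvmInspector {

    // Returns the number of heap bytes currently in use.
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Returns a textual dump of all live threads.
    // Note: ThreadInfo.toString() prints only a limited number of stack frames per thread.
    static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details to keep the dump cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            sb.append(info.toString());
        }
        return sb.toString();
    }
}
```

Something like this could be called periodically from a background thread started by the plugin, with the results logged or shipped elsewhere.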
We could add an event for executor shutdown, if you wanted to clean up any
shared resources. I haven't had a need for this, but I think this is something
I've heard requests for in the past.
I have another variant of this where you also get task start and end
events. This lets you control the monitoring a little more -- e.g., I had
something which started polling thread dumps only if there was a task from
stage 17 that had been running longer than 5 seconds. But for anything
task-related, it is a bit trickier to decide on the right api. Should the task end event also get
the failure reason? Should those events get called in the same thread as the
task runner, or in another thread? Again, DeveloperApi gives us flexibility to
change those particulars down the road, but I didn't feel strongly about
getting them in right now.
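To make the stage-17 example above concrete, here is a minimal sketch of the bookkeeping such a variant might need. Everything in it is an assumption for illustration: the class `SlowTaskWatcher`, the callback names `onTaskStart` / `onTaskEnd`, and their parameters are hypothetical and do not reflect any api in this PR. Timestamps are passed in explicitly so the logic does not depend on which thread or clock delivers the events.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: tracks running tasks of one stage and reports
// whether any of them has exceeded a time threshold.
class SlowTaskWatcher {
    private final int targetStageId;
    private final long thresholdMillis;
    // taskId -> start time in millis, only for tasks of the target stage.
    private final Map<Long, Long> running = new ConcurrentHashMap<>();

    SlowTaskWatcher(int targetStageId, long thresholdMillis) {
        this.targetStageId = targetStageId;
        this.thresholdMillis = thresholdMillis;
    }

    // Assumed task start event: remember the task only if it belongs to the watched stage.
    void onTaskStart(int stageId, long taskId, long nowMillis) {
        if (stageId == targetStageId) {
            running.put(taskId, nowMillis);
        }
    }

    // Assumed task end event: the task no longer counts, whether it succeeded or failed.
    void onTaskEnd(long taskId) {
        running.remove(taskId);
    }

    // True if some watched task has been running longer than the threshold.
    boolean hasSlowTask(long nowMillis) {
        for (long start : running.values()) {
            if (nowMillis - start > thresholdMillis) {
                return true;
            }
        }
        return false;
    }
}
```

A polling thread could check `hasSlowTask(System.currentTimeMillis())` on an instance created as `new SlowTaskWatcher(17, 5000)` and only take thread dumps while it returns true. A concurrent map is used because the events and the poller may well run on different threads, which is one of the open questions above.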
> Currently this isn't putting any user docs up which might make sense if
> our use case is debugging and we want to try to vet this as an alpha api.
> thoughts?
I feel like we should leave it undocumented at first, just because I worry
about the average user not knowing what to do with it (or doing something they
*really* shouldn't be). But I don't feel super strongly about it.