[jira] [Commented] (SPARK-24918) Executor Plugin API

Imran Rashid (JIRA) Tue, 07 Aug 2018 07:59:12 -0700


    [ https://issues.apache.org/jira/browse/SPARK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571793#comment-16571793 ]


Imran Rashid commented on SPARK-24918:
--------------------------------------

[~lucacanali] you could certainly sample stack traces, but the current proposal 
doesn't cover communication with the driver at all.  IMO that is too much 
complexity for v1.  Did you have a design in mind for that?

You could use the executor plugin to build your own communication between the 
driver and executors, but depending on what you want, that might be tricky.

Do you think you could set up the configuration you need statically, when the 
application starts?  E.g., I ran a test that took stack traces any time a task 
had been running longer than some configurable time; for that I only needed 
task start & end events in my plugin.
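
The slow-task test described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code used in that test: the class name SlowTaskSampler and the hook names onTaskStart/onTaskEnd are assumptions, since the plugin API was still under discussion in this thread. The threshold is passed in once at construction, which matches the idea of configuring the plugin statically at application start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not part of Spark: tracks running tasks and dumps slow ones. */
final class SlowTaskSampler {
    /** The thread running a task and the time the task started. */
    private static final class Running {
        final Thread thread;
        final long startNanos;

        Running(Thread thread, long startNanos) {
            this.thread = thread;
            this.startNanos = startNanos;
        }
    }

    private final long thresholdNanos;
    private final Map<Long, Running> running = new ConcurrentHashMap<>();

    /** The threshold is fixed up front, e.g. read from static configuration. */
    SlowTaskSampler(long thresholdMillis) {
        this.thresholdNanos = thresholdMillis * 1_000_000L;
    }

    /** Assumed task-start hook; must be called on the thread that runs the task. */
    void onTaskStart(long taskId) {
        running.put(taskId, new Running(Thread.currentThread(), System.nanoTime()));
    }

    /** Assumed task-end hook. */
    void onTaskEnd(long taskId) {
        running.remove(taskId);
    }

    /** Returns one formatted stack trace per task that has run at least the threshold. */
    List<String> sampleSlowTasks() {
        long now = System.nanoTime();
        List<String> dumps = new ArrayList<>();
        for (Map.Entry<Long, Running> entry : running.entrySet()) {
            Running r = entry.getValue();
            if (now - r.startNanos < thresholdNanos) {
                continue; // still under the threshold
            }
            StringBuilder sb = new StringBuilder();
            sb.append("task ").append(entry.getKey()).append(" on ").append(r.thread.getName());
            for (StackTraceElement frame : r.thread.getStackTrace()) {
                sb.append("\n  at ").append(frame);
            }
            dumps.add(sb.toString());
        }
        return dumps;
    }
}
```

In a real plugin, sampleSlowTasks() would presumably be called from a background timer thread inside the executor and the output written to the executor log, so no communication with the driver is needed.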

> Executor Plugin API
> -------------------
>
>                 Key: SPARK-24918
>                 URL: https://issues.apache.org/jira/browse/SPARK-24918
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Priority: Major
>              Labels: SPIP, memory-analysis
>
> It would be nice if we could specify an arbitrary class to run within each 
> executor for debugging and instrumentation.  It's hard to do this currently 
> because:
> a) you have no idea when executors will come and go with DynamicAllocation, 
> so you don't have a chance to run custom code before the first task
> b) even with static allocation, you'd have to change the code of your spark 
> app itself to run a special task to "install" the plugin, which is often 
> tough in production cases when those maintaining regularly running 
> applications might not even know how to make changes to the application.
> For example, https://github.com/squito/spark-memory could be used in a 
> debugging context to understand memory use, just by re-running an application 
> with extra command line arguments (as opposed to rebuilding spark).
> I think one tricky part here is just deciding the api, and how it's versioned. 
>  Does it just get created when the executor starts, and that's it?  Or does it 
> get more specific events, like task start, task end, etc?  Would we ever add 
> more events?  It should definitely be a {{DeveloperApi}}, so breaking 
> compatibility would be allowed ... but still should be avoided.  We could 
> create a base class that has no-op implementations, or explicitly version 
> everything.
> Note that this is not needed in the driver as we already have SparkListeners 
> (even if you don't care about the SparkListenerEvents and just want to 
> inspect objects in the JVM, it's still good enough).
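
The "base class that has no-op implementations" option from the description could look something like the sketch below. The names NoOpExecutorHooks and TaskCounter and the set of events are purely hypothetical and are not the API that Spark adopted. The point is only that a plugin overrides the events it cares about, so adding a new no-op method to the base class later does not break existing plugins at the source level.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical base class, not Spark's actual API: every event defaults to a no-op. */
abstract class NoOpExecutorHooks {
    void onExecutorStart() {}

    void onTaskStart(long taskId) {}

    void onTaskEnd(long taskId) {}

    void onExecutorShutdown() {}
}

/** Example plugin that only cares about task starts and ignores everything else. */
final class TaskCounter extends NoOpExecutorHooks {
    private final AtomicLong started = new AtomicLong();

    @Override
    void onTaskStart(long taskId) {
        started.incrementAndGet();
    }

    long startedCount() {
        return started.get();
    }
}
```

The alternative mentioned in the description, explicitly versioning everything, would trade this convenience for a clearer record of which events a given plugin was written against.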



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

