[ 
https://issues.apache.org/jira/browse/SPARK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645131#comment-14645131
 ] 

Steve Ash commented on SPARK-3270:
----------------------------------

I would like each executor to have some singletons that have the same lifecycle 
as the executor.
1- serialization proxy objects that get read-replaced by real objects managed 
by a singleton in the executor.
2- dependency injection of services inside closures that run on the executor 
(similar to how GridGain lets you pass anonymous inner classes with fields 
that it injects before running them on local nodes).

Right now I just route both of these through a static method that lazily loads 
the singletons on first access.  Some kind of extension API would let me do 
this eagerly on executor startup, which seems preferable because (a) I get 
more precise and consistent control over when things happen, and (b) I don't 
delay the first operation.
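A minimal sketch of the lazy-singleton pattern described above, using the Java initialization-on-demand-holder idiom; the names (ServiceRegistry, lookup) are illustrative, not a Spark API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-executor service registry: the services map is built
// exactly once per JVM (i.e. once per executor), on first access, with
// thread safety guaranteed by class-loading semantics.
class ServiceRegistry {
    private static class Holder {
        static final Map<String, Object> SERVICES = buildServices();
    }

    private static Map<String, Object> buildServices() {
        // expensive setup happens here, which delays the first task
        // that touches the registry on each executor
        Map<String, Object> m = new ConcurrentHashMap<>();
        m.put("injector", new Object()); // placeholder service
        return m;
    }

    static Object lookup(String name) {
        return Holder.SERVICES.get(name);
    }
}
```

The drawback, as noted above, is exactly that the first task to call lookup() on each executor pays the full setup cost.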

The alternative I see others mention is running something equivalent to an RDD 
map operation so that each executor gets hit and does the initialization 
there.  Since I'm using Spark Streaming, I would have to keep doing this for 
every minibatch.  If I did it only once rather than per minibatch, then if an 
executor failed and was restarted, I don't think things would be initialized.  
Or I could fire it for every minibatch, but that seems like a lot of effort 
for something that seems like such an obvious application extension point.  
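One way to make the per-minibatch workaround tolerable is an idempotent guard, so that firing it on every minibatch is cheap after the first hit; a restarted executor starts with a fresh JVM and so re-initializes itself.  A sketch (the names ExecutorInit and ensure are hypothetical, and a call site such as rdd.foreachPartition(it -> ExecutorInit.ensure()) is assumed, not a Spark API):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical idempotent executor-side initializer: ensure() may be called
// from every partition task of every minibatch, but the setup body runs at
// most once per JVM.
class ExecutorInit {
    private static final AtomicBoolean INITIALIZED = new AtomicBoolean(false);
    static final AtomicInteger INIT_COUNT = new AtomicInteger(); // illustration only

    static void ensure() {
        // compareAndSet wins for exactly one caller, even when many
        // partition tasks invoke ensure() concurrently
        if (INITIALIZED.compareAndSet(false, true)) {
            INIT_COUNT.incrementAndGet();
            // one-time executor-scoped setup would go here
        }
    }
}
```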

This is marked "in progress": is that just an accident, or is there actually 
further work being considered here?

> Spark API for Application Extensions
> ------------------------------------
>
>                 Key: SPARK-3270
>                 URL: https://issues.apache.org/jira/browse/SPARK-3270
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Michal Malohlava
>
> Any application should be able to enrich the Spark infrastructure with 
> services that are not available by default.  
> Hence, to support such application extensions (aka "extensions"/"plugins"), 
> the Spark platform should provide:
>   - an API to register an extension 
>   - an API to register a "service" (meaning provided functionality)
>   - well-defined points in Spark infrastructure which can be enriched/hooked 
> by extension
>   - a way of deploying extension (for example, simply putting the extension 
> on classpath and using Java service interface)
>   - a way to access extension from application
> Overall proposal is available here: 
> https://docs.google.com/document/d/1dHF9zi7GzFbYnbV2PwaOQ2eLPoTeiN9IogUe4PAOtrQ/edit?usp=sharing
> Note: In this context, I do not mean reinventing OSGi (or another plugin 
> platform) but it can serve as a good starting point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
