[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15722170#comment-15722170 ]
Michael Schmeißer commented on SPARK-650:
-----------------------------------------

A singleton is not really feasible if additional information that is known (or determined) by the driver needs to be sent to the executors before initialization can happen. In this case, the options are:

1) use some side channel that is "magically" inferred by the executor,
2) create an empty RDD, repartition it to the number of executors, and run mapPartitions on it,
3) piggyback on the JavaSerializer to run the initialization before any function is called, or
4) require every function that may need the resource to initialize it on its own.

Each of these options has significant drawbacks in my opinion. While option 4 sounds good for most cases, it has some cons which I've described earlier (my comment from Oct 16) that make it unfeasible for our use case. Option 1 might be possible, but the data flow wouldn't be all that obvious. Right now, we go with a mix of options 2 and 3 (try to determine the number of executors and, if we can't, hijack the serializer), but really, this is a hack and might break in future releases of Spark.

> Add a "setup hook" API for running initialization code on each executor
> -----------------------------------------------------------------------
>
>                 Key: SPARK-650
>                 URL: https://issues.apache.org/jira/browse/SPARK-650
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Matei Zaharia
>            Priority: Minor
>
> Would be useful to configure things like reporting libraries
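As a minimal, Spark-independent sketch of what option 4 amounts to: each task function lazily initializes a per-JVM (hence per-executor) resource on first use, with the driver-known configuration captured in the task closure. The class and resource names below (`LazyExecutorInit`, `ReportingClient`, the endpoint value) are illustrative assumptions, not Spark APIs:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyExecutorInit {
    // Stand-in for an expensive resource, e.g. a reporting-library client.
    static class ReportingClient {
        final String endpoint;
        ReportingClient(String endpoint) { this.endpoint = endpoint; }
    }

    // Counts how many times initialization actually ran (for demonstration).
    static final AtomicInteger initCount = new AtomicInteger();
    private static volatile ReportingClient client;

    // Double-checked lazy init: runs at most once per JVM, i.e. once per
    // executor, no matter how many tasks call it concurrently.
    static ReportingClient getOrInit(String endpoint) {
        ReportingClient c = client;
        if (c == null) {
            synchronized (LazyExecutorInit.class) {
                if (client == null) {
                    initCount.incrementAndGet();
                    client = new ReportingClient(endpoint);
                }
                c = client;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // On a real cluster, this config would travel to the executors inside
        // the serialized task closure; here we pass it directly. Eight
        // simulated "tasks" call getOrInit, but initialization runs once.
        String endpoint = "http://metrics.example.com"; // assumed value
        for (int task = 0; task < 8; task++) {
            getOrInit(endpoint);
        }
        System.out.println("initCount=" + initCount.get());
    }
}
```

This illustrates why option 4 works mechanically, but also its drawback: every function that might touch the resource must carry the configuration and call the initializer itself, which is exactly what a driver-side "setup hook" API would avoid.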