[ 
https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15714762#comment-15714762
 ] 

Robert Neumann commented on SPARK-650:
--------------------------------------

Sean, I agree this is the essential question in this thread. If we get this 
sorted out, then we are good and can achieve consensus on what to do with this 
ticket.
A singleton "works" indeed. However, from a software engineering point of view 
it is not nice. There exists a class of Spark Streaming jobs that requires 
"setup -> do -> cleanup" semantics. The framework (in this case Spark 
Streaming) should explicitly support these semantics through appropriate API 
hooks. A singleton instead would hide these semantics: you would need to 
implement lazy-initialization code to check whether an HBase connection has 
already been set up, and the singleton would need to perform that check on 
every write operation to HBase.
I do not think that application logic (the singleton within the Spark Streaming 
job) is the right place to wire in the "setup -> do -> cleanup" pattern. It is 
a generic pattern, and there exists a whole class of Spark Streaming jobs (not 
just one specific streaming job) that is based on it.
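To make the objection concrete, here is a minimal sketch of the singleton workaround in plain Java. The `Connection` class is a stand-in for a real HBase connection (no actual Spark or HBase types are used), and `ConnectionHolder` is a hypothetical name; the point is that every write must first pass through the "was setup already done?" check, and cleanup is only best-effort via a shutdown hook rather than an explicit framework callback.

```java
// Sketch of the per-executor lazy singleton discussed above.
final class ConnectionHolder {
    // Stand-in for a real HBase Connection; assumed for illustration only.
    static final class Connection {
        static int opened = 0;                 // counts how often "setup" ran
        Connection() { opened++; }
        void write(String row) { /* would send the row to HBase */ }
        void close() { /* would release resources */ }
    }

    private static Connection instance;

    // Called on every write: this is the repeated "already set up?" check
    // that the comment argues should live in the framework, not the job.
    static synchronized Connection get() {
        if (instance == null) {
            instance = new Connection();                       // "setup"
            Runtime.getRuntime().addShutdownHook(
                new Thread(() -> instance.close()));           // best-effort "cleanup"
        }
        return instance;
    }
}

public class Demo {
    public static void main(String[] args) {
        // Inside something like foreachPartition, each record re-runs the check:
        for (int i = 0; i < 3; i++) {
            ConnectionHolder.get().write("row-" + i);
        }
        System.out.println(ConnectionHolder.Connection.opened); // setup ran once
    }
}
```

With an explicit "setup hook" API, the framework would call setup once per executor and cleanup on shutdown, and the per-write check would disappear.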

> Add a "setup hook" API for running initialization code on each executor
> -----------------------------------------------------------------------
>
>                 Key: SPARK-650
>                 URL: https://issues.apache.org/jira/browse/SPARK-650
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Matei Zaharia
>            Priority: Minor
>
> Would be useful to configure things like reporting libraries



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
