[ 
https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112803#comment-14112803
 ] 

Marcelo Vanzin commented on SPARK-3215:
---------------------------------------

Hi Matei,

Both suggestions came up during our internal talks. Doing it inside Hive is an 
option, but we thought more people would benefit if this were a part of Spark. 
While it may not be overly complicated, it's also not trivial, and having an 
"official" and well-maintained way of doing it is an advantage of this 
approach. It also keeps it closer to the core, making it easier to evolve to 
accommodate new features in Spark.

I also toyed with the "use Spark API locally but have things run remotely" idea 
you mention, but I think that would require too many changes in Spark to be 
useful. It also has downsides for the client, which would need to run more code 
and use more memory, and would thus suffer more from scalability and app 
isolation issues.

As Reynold suggests, it doesn't need to be inside the {{core/}} project - it 
could be a new sub-module.

> Add remote interface for SparkContext
> -------------------------------------
>
>                 Key: SPARK-3215
>                 URL: https://issues.apache.org/jira/browse/SPARK-3215
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>              Labels: hive
>         Attachments: RemoteSparkContext.pdf
>
>
> A quick description of the issue: as part of running Hive jobs on top of 
> Spark, it's desirable to have a SparkContext that is running in the 
> background and listening for job requests for a particular user session.
> Running multiple contexts in the same JVM is not a very good solution. Not 
> only does SparkContext currently have issues sharing the same JVM among 
> multiple instances, but doing so also turns the JVM running the contexts into 
> a huge bottleneck in the system.
> So I'm proposing a solution where we have a SparkContext that is running in a 
> separate process, and listening for requests from the client application via 
> some RPC interface (most probably Akka).
> I'll attach a document shortly with the current proposal. Let's use this bug 
> to discuss the proposal and any other suggestions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
