[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112958#comment-14112958 ]

Marcelo Vanzin commented on SPARK-3215:
---------------------------------------

I think the "what RPC to use" discussion is mostly irrelevant. It only matters 
if the API you want to expose is the RPC layer itself. My proposal is to expose 
a Scala/Java API, so how the bytes get from one side to the other underneath 
doesn't matter much. We can fiddle with that all we want as long as the 
Scala/Java API remains the same.
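
For illustration, here's a rough sketch in Scala of the kind of API I mean;
all the names here (RemoteSparkContext, JobHandle, JobContext) are made up
for this example and are not an existing Spark API:

import java.util.concurrent.Future

// What the remote side hands to a submitted job; it only exposes the
// SparkContext living in the remote process.
trait JobContext {
  def sc: org.apache.spark.SparkContext
}

// Returned to the caller when a job is submitted; completion is tracked
// without exposing anything about the underlying RPC transport.
trait JobHandle[T] extends Future[T] {
  def jobId: String
}

// The client-facing contract: callers submit closures and get handles back.
// Whether Akka, Thrift, or raw sockets move the bytes around underneath is
// an implementation detail hidden behind this trait.
trait RemoteSparkContext {
  def submitJob[T](job: JobContext => T): JobHandle[T]
  def stop(): Unit
}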

To address Shivaram's comments: having the API at that level does make it 
trickier to support other languages. But given what this feature proposes, I 
think it would be pretty hard to support multiple languages with a single 
backend anyway. If your client app is Python, you need something on the 
server side that understands Python.

Evan, yes, there is code in the job server that does something similar to 
this; it's still somewhat tied to the job server itself, and I actually don't 
think the job server part is very interesting, at least not for the Hive 
needs. Basically, the "remote context" part of my proposal looks a lot like 
JobManagerActor, and you'd have a client library to talk to it directly 
(instead of going through the job server); a rough sketch of what that could 
look like follows below. It'd be interesting to know more about the changes 
you're working on, though.
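
Something along these lines, where the message protocol and the client
wrapper are again purely illustrative, not actual job server code:

import scala.concurrent.duration._
import akka.actor.ActorRef
import akka.pattern.ask
import akka.util.Timeout

// Illustrative message protocol between the client library and the remote
// context; jobs travel as serialized closures, results come back as bytes.
case class SubmitJob(jobId: String, serializedJob: Array[Byte])
case class JobSubmitted(jobId: String)
case class JobResult(jobId: String, result: Either[Throwable, Array[Byte]])
case class CancelJob(jobId: String)

// The client library hides the actor plumbing behind plain method calls,
// so callers never deal with the RPC layer directly.
class RemoteContextClient(manager: ActorRef) {
  private implicit val timeout = Timeout(30.seconds)

  def submit(jobId: String, job: Array[Byte]) =
    (manager ? SubmitJob(jobId, job)).mapTo[JobSubmitted]

  def cancel(jobId: String): Unit = manager ! CancelJob(jobId)
}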

> Add remote interface for SparkContext
> -------------------------------------
>
>                 Key: SPARK-3215
>                 URL: https://issues.apache.org/jira/browse/SPARK-3215
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>              Labels: hive
>         Attachments: RemoteSparkContext.pdf
>
>
> A quick description of the issue: as part of running Hive jobs on top of 
> Spark, it's desirable to have a SparkContext that is running in the 
> background and listening for job requests for a particular user session.
> 
> Running multiple contexts in the same JVM is not a very good solution. Not 
> only does SparkContext currently have issues sharing the same JVM among 
> multiple instances, but it also turns the JVM running the contexts into a 
> huge bottleneck in the system.
> 
> So I'm proposing a solution where we have a SparkContext running in a 
> separate process and listening for requests from the client application via 
> some RPC interface (most probably Akka).
> 
> I'll attach a document shortly with the current proposal. Let's use this bug 
> to discuss the proposal and any other suggestions.
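
To make the quoted proposal a bit more concrete: the server side could be
little more than an Akka actor owning the long-lived context. Again, this is
just an illustrative sketch, not an actual API; the message types repeat the
made-up ones from the client sketch above so the snippet stands alone:

import akka.actor.{Actor, ActorSystem, Props}
import org.apache.spark.{SparkConf, SparkContext}

// Same illustrative messages as in the client sketch above.
case class SubmitJob(jobId: String, serializedJob: Array[Byte])
case class JobSubmitted(jobId: String)
case class CancelJob(jobId: String)

// One long-lived SparkContext per process/user session, serving requests
// for as long as the process is up.
class RemoteContextActor extends Actor {
  private val sc = new SparkContext(new SparkConf().setAppName("remote-context"))

  def receive = {
    case SubmitJob(jobId, _) =>
      // Deserializing and actually running the job is elided here; just
      // acknowledge that the request was accepted.
      sender() ! JobSubmitted(jobId)
    case CancelJob(jobId) =>
      // Jobs submitted under a job group can be cancelled by group id.
      sc.cancelJobGroup(jobId)
  }

  override def postStop(): Unit = sc.stop()
}

object RemoteContextServer extends App {
  val system = ActorSystem("remote-spark-context")
  system.actorOf(Props[RemoteContextActor], "context")
}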


